Memory Consistency Models: A Tutorial

57 points by tanelpoder 5 months ago

klabb3 5 months ago

As someone with a decent level of familiarity, I thought this post was good (it even taught me a couple of new things). Unfortunately, this topic is so difficult that it can only be briefly introduced in a medium-sized blog post. Mastering fences, compiler reordering, on-chip reordering, and cache coherence protocols is multiple PhDs’ worth of material, if not more. And even that’s not enough to fully master a mutex (you still need thread yielding/parking). And even then, correctly implemented mutexes are notorious foot machine guns.

My high-level take is that we mostly got concurrency wrong for imperative languages (probably because they were developed before parallel execution and all these optimizations were a thing). Exposing shared mutable memory access to application developers should have been a no-go from the start.

So, even if parallelism is a Wild West, some form of concurrency is a must-have, and ironically the language that caused the least pain was JS, because they chose to keep business logic single-threaded. Even among the perf issues with JS, you rarely see the lack of parallel business logic mentioned as a bottleneck. And web workers (the escape hatch) are quite uncommon in practice, which imo validates that the tradeoff was worth its weight in gold.

mplanchard 5 months ago

I have always found this hard to reason about. This was a nice primer! I also like the Rustonomicon’s treatment of the subject: https://doc.rust-lang.org/nomicon/atomics.html

crvdgc 5 months ago

Here's a tool suite to simulate and run memory model litmus tests (on real hardware): https://github.com/herd/herdtools7

The simulation tool can also generate relation graphs similar to those in the blog post.
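
For anyone who hasn't seen a litmus test before, here is a rough sketch of the classic message-passing (MP) shape written as ordinary Rust rather than in herdtools7's own litmus format. The function name `message_passing_once` and the iteration count are made up for illustration. One thread writes `data` and then sets `flag` with Release; the other spins on `flag` with Acquire and then reads `data`. Under the Rust/C++11 model the reader must see 42 once it has seen the flag.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs one iteration of the message-passing (MP) litmus shape.
/// Returns the value of `data` the reader saw after observing `flag == true`.
/// (Hypothetical helper name, chosen for this sketch.)
fn message_passing_once() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let flag = Arc::new(AtomicBool::new(false));

    let writer = {
        let (data, flag) = (Arc::clone(&data), Arc::clone(&flag));
        thread::spawn(move || {
            // The payload write can be Relaxed...
            data.store(42, Ordering::Relaxed);
            // ...because the Release store publishes everything before it.
            flag.store(true, Ordering::Release);
        })
    };

    let reader = {
        let (data, flag) = (Arc::clone(&data), Arc::clone(&flag));
        thread::spawn(move || {
            // The Acquire load pairs with the Release store above.
            while !flag.load(Ordering::Acquire) {
                std::hint::spin_loop();
            }
            // Guaranteed to observe the write that preceded the Release store.
            data.load(Ordering::Relaxed)
        })
    };

    writer.join().unwrap();
    reader.join().unwrap()
}

fn main() {
    // The iteration count is arbitrary; the guarantee holds on every run.
    for _ in 0..1_000 {
        assert_eq!(message_passing_once(), 42);
    }
    println!("MP: the reader never saw stale data");
}
```

If both flag accesses are downgraded to Relaxed, the language model no longer forbids the reader from seeing 0, even though you may never catch it on a given machine. Enumerating those allowed outcomes is what a simulator is good for.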

SomeHacker44 5 months ago

While this was a fine article, it was vastly too limited. I would have preferred a much more in-depth discussion of common CPU memory models and programming language models, how they interact, and how programmers can build a mental model of what is going on. Unlike another commenter, I learned nothing (except maybe an article link for a first-year CS or CE student, or even a high schooler or precocious pre-teen), which was the only reason I read the article!

j_seigh 5 months ago

AFAIK, the relaxed memory models mostly come from processors using pipelined and out-of-order execution. Cache has little to do with it, as most modern processors have transparent caches, meaning you can't tell the cache is there except for performance effects. Differences in memory accesses due to cache state might exacerbate some race conditions, but those race conditions would still be there, cache or no cache.
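
A minimal sketch of the store-buffering (SB) litmus shape in Rust illustrates the kind of reordering in question. The helper `store_buffering_once` and its `seq_cst` flag are names invented for this sketch. Each thread stores 1 to its own variable and then loads the other one. With Relaxed ordering both loads may return 0, which on x86 is usually attributed to per-core store buffers sitting in front of a coherent cache (and the compiler is also free to reorder). With SeqCst that outcome is forbidden.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

/// Runs one iteration of the store-buffering (SB) litmus shape.
/// Returns (value thread 0 read from y, value thread 1 read from x).
/// `seq_cst` selects SeqCst for every access; otherwise Relaxed is used.
/// (Hypothetical helper, written for this sketch.)
fn store_buffering_once(seq_cst: bool) -> (u32, u32) {
    let order = if seq_cst { Ordering::SeqCst } else { Ordering::Relaxed };
    let x = Arc::new(AtomicU32::new(0));
    let y = Arc::new(AtomicU32::new(0));
    // The barrier only makes the two threads start at roughly the same time.
    let start = Arc::new(Barrier::new(2));

    let t0 = {
        let (x, y, start) = (Arc::clone(&x), Arc::clone(&y), Arc::clone(&start));
        thread::spawn(move || {
            start.wait();
            x.store(1, order);
            y.load(order)
        })
    };
    let t1 = {
        let (x, y, start) = (Arc::clone(&x), Arc::clone(&y), Arc::clone(&start));
        thread::spawn(move || {
            start.wait();
            y.store(1, order);
            x.load(order)
        })
    };

    (t0.join().unwrap(), t1.join().unwrap())
}

fn main() {
    let runs = 2_000; // arbitrary
    let mut both_zero_relaxed = 0;
    for _ in 0..runs {
        // Relaxed: (0, 0) is allowed, though it may be rare or never show up.
        if store_buffering_once(false) == (0, 0) {
            both_zero_relaxed += 1;
        }
        // SeqCst: all four accesses fall into one total order, so at least
        // one thread must see the other's store.
        assert_ne!(store_buffering_once(true), (0, 0));
    }
    println!("relaxed: {both_zero_relaxed} of {runs} runs saw (0, 0)");
}
```

How often the relaxed variant actually shows (0, 0) depends on the hardware and on thread start-up timing, so a count of zero on one machine proves nothing about the model.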