Why Your Store-Buffer-Ordered Code Is Lying to You
You think your multithreaded code is correct because it works on your x86 laptop. Spoiler: your CPU has been quietly buffering your stores and serving you lies, and x86's comparatively strong ordering rules have been hiding most of them. Here's what actually happens at the hardware level, and why your "correct" code can break on ARM, where the memory model is weaker.
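
To make the problem concrete, here is a minimal sketch of the classic "publish a value, then raise a flag" pattern, written in C++ with explicit release/acquire ordering so that it is correct on both x86 and ARM. The names `payload`, `ready`, and `run_message_passing` are made up for illustration. The version that tends to "work on your laptop" is the same code with `std::memory_order_relaxed` (or a plain `bool` flag): that variant is a data race, and on a weakly ordered CPU the consumer may see the flag before it sees the payload.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Illustrative sketch: payload, ready, and run_message_passing are
// hypothetical names, not taken from any particular codebase.
int run_message_passing() {
    int payload = 0;                 // plain, non-atomic data
    std::atomic<bool> ready{false};  // flag that publishes the payload
    int observed = -1;

    std::thread producer([&] {
        payload = 42;
        // Release: the payload write above cannot be reordered after this store.
        ready.store(true, std::memory_order_release);
    });

    std::thread consumer([&] {
        // Acquire: the payload read below cannot be reordered before this load.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        observed = payload;  // guaranteed to see 42 once ready is true
    });

    producer.join();
    consumer.join();
    assert(observed == 42);
    return observed;
}
```

Built with something like `g++ -std=c++17 -pthread`, a call to `run_message_passing()` returns 42 on every run. The release/acquire pair is what makes that a guarantee rather than an accident of the hardware you happened to test on.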