Why CPU cache mappings matter sometimes
Over the past few days, this tweet caused some confusion and sparked interest in how CPU caches work. The tweet asks which of the following options is faster:

A. for (int i = 0; i < n; i += 256) a[i]++;
B. for (int i = 0; i < n; i += 257) a[i]++;

The question is taken verbatim from Algorithmica without credit, which is not cool. Algorithmica also provides a very good explanation, although in my opinion it is a little convoluted and misses an important part, so I will try to simplify it. ...
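If you want to try the two loops yourself, here is a minimal sketch of how they could be timed. The names touch_stride and seconds_per_touch are my own, not from the tweet or from Algorithmica. The time is divided by the number of elements touched, because a stride of 256 touches slightly more elements than a stride of 257. Treat the numbers as rough: they depend on the CPU, the array size, the compiler flags, and the resolution of clock().

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Increment every step-th element of a[0..n) and return how many
   elements were touched. Same loop shape as options A and B. */
size_t touch_stride(int *a, size_t n, size_t step) {
    size_t touched = 0;
    if (step == 0) return 0; /* avoid an infinite loop */
    for (size_t i = 0; i < n; i += step) {
        a[i]++;
        touched++;
    }
    return touched;
}

/* Hypothetical helper: average CPU seconds per touched element over
   `reps` passes. Returns -1.0 if allocation fails or nothing was touched. */
double seconds_per_touch(size_t n, size_t step, int reps) {
    int *a = calloc(n, sizeof *a);
    if (a == NULL) return -1.0;

    size_t touched = 0;
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        touched += touch_stride(a, n, step);
    clock_t end = clock();

    /* Read one element back through a volatile so the increments
       stay observable to the compiler. */
    volatile int sink = a[0];
    (void)sink;
    free(a);

    if (touched == 0) return -1.0;
    return (double)(end - start) / CLOCKS_PER_SEC / (double)touched;
}
```

To compare the options, call seconds_per_touch(n, 256, reps) and seconds_per_touch(n, 257, reps) with the same n and reps, using an array much larger than your last-level cache, and run each a few times before trusting the result.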