How does the indexing of the Ice Lake's 48KiB L1 data cache work?
Question

The Intel optimization manual (September 2019 revision) lists a 48 KiB, 8-way set-associative L1 data cache for the Ice Lake microarchitecture.

¹ Software-visible latency/bandwidth will vary depending on access patterns and other factors.

This baffled me because:

- There are 96 sets (48 KiB / 64 B per line / 8 ways), which is not a power of two.
- The set-index bits and the byte-offset bits add up to more than 12 bits, which makes the cheap-PIPT-as-VIPT-trick unavailable for 4 KiB pages.

All in