Is the TLB shared between multiple cores?

最后都变了 - Submitted on 2019-11-30 14:59:14

The TLB caches the translations listed in the page tables. Each CPU core can be running in a different context, with different page tables. This is what you'd call the MMU if it were a separate "unit", so each core has its own MMU. Any shared caches are always physically indexed / physically tagged, so they cache based on the post-MMU physical address.
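To make that concrete, here is a minimal toy model (in Python; the names `Core`, `PAGE_SHIFT`, etc. are illustrative, not any real API) of a per-core TLB as a cache of virtual-page-number → physical-frame-number translations. Each core consults only its own TLB, filled from whatever page table its current context uses, so the same virtual address can translate differently on different cores:

```python
# Toy model: each core has a private TLB caching VPN -> PFN translations
# from the page table of whatever context that core is currently running.
# All names here are illustrative sketches, not a real hardware interface.

PAGE_SHIFT = 12  # 4 KiB pages

class Core:
    def __init__(self, page_table):
        self.page_table = page_table  # dict: VPN -> PFN (this context's mapping)
        self.tlb = {}                 # private per-core cache of that mapping
        self.misses = 0

    def translate(self, vaddr):
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn not in self.tlb:       # TLB miss: walk this core's page table
            self.misses += 1
            self.tlb[vpn] = self.page_table[vpn]
        return (self.tlb[vpn] << PAGE_SHIFT) | offset  # physical address

# Two cores running different contexts map the same virtual page differently:
core0 = Core(page_table={0x10: 0x7A})
core1 = Core(page_table={0x10: 0x3F})
print(hex(core0.translate(0x10004)))  # 0x7a004
print(hex(core1.translate(0x10004)))  # 0x3f004
```

Any cache shared between cores sits after this step, indexed and tagged by the physical address the per-core TLB produced.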

The TLB is a cache, so technically it's just an implementation detail that could vary by microarchitecture (between different implementations of the x86 architecture).

In practice, all that really varies is the size. 2-level TLBs are common now, to keep full TLB misses to a minimum while still being fast enough to allow 3 translations per clock cycle. The TLB's main goal is to be fast, not necessarily big, so a shared TLB between cores wouldn't be useful, especially given the overhead of making sure all the cores using it were running threads that shared the same page tables.
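The small-and-fast / bigger-and-slower split can be sketched as follows (a toy model only; the sizes and the FIFO eviction policy are made up for illustration, real TLBs are set-associative with their own replacement policies):

```python
# Toy two-level TLB: a tiny, fast L1 backed by a larger L2.
# Only a full miss in both levels forces a page-table walk.
from collections import OrderedDict

class TwoLevelTLB:
    def __init__(self, page_table, l1_size=4, l2_size=64):
        self.page_table = page_table
        self.l1 = OrderedDict()
        self.l2 = OrderedDict()
        self.l1_size, self.l2_size = l1_size, l2_size
        self.walks = 0  # full misses that had to walk the page table

    def _fill(self, cache, size, vpn, pfn):
        if len(cache) >= size:
            cache.popitem(last=False)  # evict the oldest entry (FIFO)
        cache[vpn] = pfn

    def lookup(self, vpn):
        if vpn in self.l1:                 # L1 hit: fastest path
            return self.l1[vpn]
        if vpn in self.l2:                 # L1 miss, L2 hit: refill L1
            pfn = self.l2[vpn]
        else:                              # full miss: page walk
            self.walks += 1
            pfn = self.page_table[vpn]
            self._fill(self.l2, self.l2_size, vpn, pfn)
        self._fill(self.l1, self.l1_size, vpn, pfn)
        return pfn

tlb = TwoLevelTLB({vpn: vpn + 100 for vpn in range(32)})
for vpn in range(8):   # first pass: 8 full misses, 8 page walks
    tlb.lookup(vpn)
for vpn in range(8):   # second pass: L1 evicted some entries, L2 kept them all
    tlb.lookup(vpn)
print(tlb.walks)  # 8: the second pass never walked the page table
```

The second loop touches more pages than the tiny L1 can hold, but the L2 absorbs those L1 misses, which is exactly the point of the second level: full misses (page walks) stay rare without the first level having to be big.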

Even if all cores were running threads from the same process, some threads might be in kernel mode handling an interrupt or system call, and thus using the kernel's page tables. That makes a shared-across-cores TLB of very low value / harder to implement.
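One way a shared TLB could even be made correct is to tag every entry with which address space it belongs to, analogous to the ASIDs / PCIDs real CPUs already use to survive context switches. A toy sketch (the `SharedTLB` class and ASID values are hypothetical) shows why the tag is mandatory: without it, kernel and user translations of the same virtual page number would collide:

```python
# Toy shared TLB keyed by (address-space id, VPN). Without the ASID tag,
# a thread in kernel mode and a thread in user mode (possibly on different
# cores) would clash on the same virtual page numbers.

class SharedTLB:
    def __init__(self):
        self.entries = {}  # (asid, vpn) -> pfn

    def fill(self, asid, vpn, pfn):
        self.entries[(asid, vpn)] = pfn

    def lookup(self, asid, vpn):
        return self.entries.get((asid, vpn))  # None on miss

USER_ASID, KERNEL_ASID = 1, 0  # illustrative IDs, not real encodings
tlb = SharedTLB()
tlb.fill(USER_ASID, 0x10, 0x7A)    # user mapping of virtual page 0x10
tlb.fill(KERNEL_ASID, 0x10, 0xFF)  # kernel maps the same VPN elsewhere
print(hex(tlb.lookup(USER_ASID, 0x10)))    # 0x7a
print(hex(tlb.lookup(KERNEL_ASID, 0x10)))  # 0xff
```

Every lookup now needs the extra tag compare, and invalidations have to be filtered by ASID too, which is part of the "harder to implement" cost mentioned above.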

For an example of how the pieces fit together in a real CPU, see David Kanter's writeup of Intel's Sandy Bridge design. Note that the diagrams are for a single SnB core. The only shared-between-cores cache in most CPUs is the last-level cache. Intel's SnB-family designs all use a modular 2 MiB-per-core L3 cache on a ring bus, so adding more cores adds more L3 to the total pool, and each new core brings its own L2, L1d, L1i, and uop caches, plus a two-level TLB.
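The scaling consequence of that modular design is simple arithmetic (using the 2 MiB-per-core slice size quoted above; actual slice sizes vary somewhat across SnB-family parts):

```python
# Shared L3 grows linearly with core count in a sliced design:
# each core brings its own slice onto the ring.
L3_SLICE_MIB = 2  # per-core slice size quoted above; varies by part
for cores in (2, 4, 8):
    print(f"{cores} cores -> {cores * L3_SLICE_MIB} MiB shared L3")
```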
