Do multi-core CPUs share the MMU and page tables?

挽巷 2020-12-13 10:19

On a single-core computer, one thread executes at a time. On each context switch, the scheduler checks whether the new thread to schedule is in the same process as the previ

7 answers
  • 2020-12-13 10:58

    Take a look at this diagram. It is a high-level view of everything in a single core of a Core i7 CPU. The picture is taken from Computer Systems: A Programmer's Perspective, by Bryant and O'Hallaron. You can access the diagrams here, section 9.21.

    Computer Systems: A Programmer's Perspective, 2/E (CS:APP2e), Randal E. Bryant and David R. O'Hallaron, Carnegie Mellon University

  • 2020-12-13 11:08

    The answers so far seem unaware of the Translation Lookaside Buffer (TLB), the cache the MMU uses to speed up translating the virtual addresses used by a process into physical memory addresses.

    Note that these days the TLB is itself a complicated beast with multiple levels of caching. As with a CPU's regular memory caches (L1-L3), you shouldn't expect its state at any given instant to hold entries exclusively for the currently running process; entries are instead brought in piecemeal on demand. See the Context Switch section of the Wikipedia page.
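
    One common way a TLB can hold entries for several processes at once is to tag each entry with an address-space identifier (ASID). The following is a toy sketch of that idea only; the `TaggedTLB` class, its FIFO eviction, and the tiny page tables are hypothetical and do not model any real CPU.

```python
from dataclasses import dataclass, field


@dataclass
class TaggedTLB:
    """Toy TLB whose entries are keyed by (asid, virtual page number)."""
    capacity: int = 4
    entries: dict = field(default_factory=dict)  # (asid, vpn) -> pfn
    misses: int = 0

    def translate(self, asid, vpn, page_table):
        key = (asid, vpn)
        if key not in self.entries:
            # Miss: walk the (toy) page table and cache the result.
            self.misses += 1
            if len(self.entries) >= self.capacity:
                # Evict the oldest inserted entry (dicts keep insertion order).
                self.entries.pop(next(iter(self.entries)))
            self.entries[key] = page_table[vpn]
        return self.entries[key]


# Hypothetical per-process page tables: vpn -> physical frame number.
page_tables = {1: {0: 100, 1: 101}, 2: {0: 200, 1: 201}}

tlb = TaggedTLB()
# Run process 1, switch to process 2, then switch back to process 1.
for asid in (1, 2, 1):
    for vpn in (0, 1):
        tlb.translate(asid, vpn, page_tables[asid])
```

    In this sketch, switching back to process 1 causes no new misses, because its entries were never flushed; only the tag used for lookups changed.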

    On SMP, all processors' TLBs need to keep a consistent view of the system page tables. See, e.g., this section of the Linux kernel book for one way of handling it.
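
    The consistency problem can be illustrated with a small model: each core has a private TLB in front of a shared page table, so changing a mapping requires invalidating the stale entry on every core (often called a TLB shootdown). This is a minimal sketch under those assumptions; `Core` and `remap` are hypothetical names, and a real kernel does this with inter-processor interrupts or broadcast invalidate instructions.

```python
class Core:
    """Toy core with a private TLB in front of a shared page table."""

    def __init__(self, page_table):
        self.page_table = page_table  # the same dict object on every core
        self.tlb = {}                 # private cache: vpn -> pfn

    def translate(self, vpn):
        if vpn not in self.tlb:
            self.tlb[vpn] = self.page_table[vpn]
        return self.tlb[vpn]

    def invalidate(self, vpn):
        self.tlb.pop(vpn, None)


def remap(page_table, cores, vpn, new_pfn):
    """Update the shared mapping, then invalidate it on every core."""
    page_table[vpn] = new_pfn
    for core in cores:
        core.invalidate(vpn)


shared = {0: 10}
cores = [Core(shared), Core(shared)]
before = [c.translate(0) for c in cores]   # both cores cache vpn 0 -> 10

shared[0] = 20                             # update WITHOUT a shootdown
stale = cores[1].translate(0)              # core 1 still sees the old frame

remap(shared, cores, 0, 30)                # update WITH a shootdown
after = [c.translate(0) for c in cores]    # both cores see the new frame
```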

  • 2020-12-13 11:09

    In ARMv8, the translation table base registers (TTBRs) have a CnP (Common not Private) bit to support sharing TLB entries among cores in the Inner Shareable domain.
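
    As a rough illustration, a TTBR value with CnP set could be composed as below. The field positions are an assumption based on my reading of the AArch64 `TTBR0_EL1` layout (CnP in bit 0, ASID in the top 16 bits); check the Arm Architecture Reference Manual before relying on them. `make_ttbr` and `has_cnp` are hypothetical helper names.

```python
# Assumed AArch64 TTBR0_EL1 layout: ASID in bits [63:48], table base
# address in the middle, CnP in bit [0]. Verify against the Arm ARM.
CNP_BIT = 1 << 0
ASID_SHIFT = 48


def make_ttbr(table_base, asid, cnp):
    """Compose a TTBR value from a table base address, an ASID and CnP."""
    # Assumption: the table is 4 KiB aligned, so the low bits are free.
    if table_base % 4096 != 0:
        raise ValueError("table base must be 4 KiB aligned")
    value = (asid << ASID_SHIFT) | table_base
    if cnp:
        value |= CNP_BIT
    return value


def has_cnp(ttbr):
    """Return True if the CnP bit is set in a TTBR value."""
    return bool(ttbr & CNP_BIT)


# Hypothetical table address and ASID, chosen only for illustration.
ttbr = make_ttbr(0x8000_0000, asid=5, cnp=True)
```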

  • 2020-12-13 11:10

    On the question of MMUs per processor, there may be several. The assumption is that each MMU adds memory bandwidth. If DDR3-1600 (PC3-12800) memory allows 1600 mega-transfers per second on a processor with one MMU, then a processor with four will theoretically allow 6400. Delivering that bandwidth to the available cores is probably quite a feat, and the advertised bandwidth will be whittled away quite a bit in the process.

    The number of MMUs on a processor is independent of the number of cores on it. The obvious examples are the 16-core CPUs from AMD; they definitely don't have 16 MMUs. A dual-core processor, on the other hand, might have two MMUs. Or just one. Or three?

    Edit

    Maybe I'm confusing MMUs with channels?
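
    For reference, the bandwidth arithmetic can be sketched as follows, treating the multiplier as the number of memory channels, as the edit suggests. It assumes the standard 64-bit (8-byte) DDR3 channel width and gives theoretical peak figures only; `peak_bandwidth_mb_s` is a hypothetical helper name.

```python
def peak_bandwidth_mb_s(mega_transfers_per_s, bus_width_bits=64, channels=1):
    """Theoretical peak bandwidth in MB/s: transfers x bytes each x channels."""
    return mega_transfers_per_s * (bus_width_bits // 8) * channels


# DDR3-1600: 1600 MT/s on one 64-bit channel (hence the name PC3-12800).
single_channel = peak_bandwidth_mb_s(1600)
# The same memory on four channels.
quad_channel = peak_bandwidth_mb_s(1600, channels=4)
# Aggregate transfer rate across four channels, in MT/s.
quad_transfers = 1600 * 4
```

    Under these assumptions one channel peaks at 12800 MB/s and four channels at 51200 MB/s, which corresponds to the 6400 mega-transfers per second mentioned above.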

  • 2020-12-13 11:11

    AFAIK there is a single MMU per physical processor, at least in SMP systems, so all cores share a single MMU.

    In NUMA systems each core has a separate MMU, because each core has its own private memory.

  • 2020-12-13 11:17

    Sorry for the previous answer; I have deleted it.

    The TI PandaBoard runs on the OMAP4430, a dual-core Cortex-A9 processor. It has one MMU per core, so two MMUs for its two cores.

    http://forums.arm.com/index.php?/topic/15240-omap4430-panda-board-armcortex-a9-mp-core-mmu/

    The above thread provides the info.

    In addition, here is some more information on an ARMv7 part.

    Each core has the following features:

    1. ARM v7 CPU at 600 MHz
    2. 32 KB of L1 instruction CACHE with parity check
    3. 32 KB of L1 data CACHE with parity check
    4. Embedded FPU for single and double data precision scalar floating-point operations
    5. Memory management unit (MMU)
    6. ARM, Thumb2 and Thumb2-EE instruction set support
    7. TrustZone© security extension
    8. Program Trace Macrocell and CoreSight© component for software debug
    9. JTAG interface
    10. AMBA© 3 AXI 64-bit interface
    11. 32-bit timer with 8-bit prescaler
    12. Internal watchdog (working also as timer)

    The dual core configuration is completed by a common set of components:

    1. Snoop control unit (SCU) to manage inter-process communication, cache-2-cache and system memory transfer, cache coherency
    2. Generic interrupt control (GIC) unit configured to support 128 independent interrupt sources with software configurable priority and routing between the two cores
    3. 64-bit global timer with 8-bit prescaler
    4. Asynchronous accelerator coherency port (ACP)
    5. Parity support to detect internal memory failures during runtime
    6. 512 KB of unified 8-way set associative L2 cache with support for parity check and ECC
    7. L2 Cache controller based on PL310 IP released by ARM
    8. Dual 64-bit AMBA 3 AXI interface with possible filtering on the second one to use a single port for DDR memory access

    Though all of this is specific to ARM, it should give a general idea.
