Fully optimized memcpy/memmove for Core 2 or Core i7 architecture?

無奈伤痛 2021-01-02 06:05

The theoretical maximum memory bandwidth for a Core 2 processor with dual-channel DDR3 memory is impressive: according to the Wikipedia article on the architecture, 10+

3 answers
  •  生来不讨喜
    2021-01-02 07:00

    When you measured bandwidth, did you account for the fact that memcpy both reads and writes? Copying memory at 3 GB/s actually consumes 6 GB/s of bandwidth.
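    The accounting above can be sketched in a few lines of C. The function names are hypothetical, chosen only for illustration, and the factor of two is a simplification: it counts one read and one write per byte and ignores any extra traffic the cache hierarchy may add.

    ```c
    #include <stddef.h>

    /* A copy of n bytes reads n bytes and writes n bytes, so roughly
       2n bytes cross the memory bus (simplified model, see lead-in). */
    static double bus_bytes_for_copy(double bytes_copied)
    {
        return 2.0 * bytes_copied;
    }

    /* Convert a byte count and an elapsed time into GB/s (1 GB = 1e9 bytes). */
    static double gb_per_second(double bytes, double seconds)
    {
        return bytes / seconds / 1e9;
    }
    ```

    For the figure quoted above, gb_per_second(bus_bytes_for_copy(3e9), 1.0) evaluates to 6.0, which is the number to compare against the theoretical maximum.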

    Remember that the quoted bandwidth is a theoretical maximum; real-world throughput will be much lower. For instance, a single page fault can drop your effective bandwidth into the MB/s range.
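    One rough way to see the page-fault effect is to time the same copy twice: once into freshly allocated memory and once after the pages have been touched. The sketch below assumes an OS that maps malloc'd pages lazily; the helper names are hypothetical, clock() is coarse, and an optimizing compiler may reorder or merge the copies, so treat the numbers as indicative only.

    ```c
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>

    /* Seconds spent in one memcpy of n bytes, as reported by clock(). */
    static double timed_copy(char *dst, const char *src, size_t n)
    {
        clock_t start = clock();
        memcpy(dst, src, n);
        return (double)(clock() - start) / CLOCKS_PER_SEC;
    }

    /* Copy n bytes twice. *cold_s is the first copy (destination pages not
       yet touched), *warm_s is the second. Returns 0 on success, -1 on error. */
    static int cold_vs_warm_copy(size_t n, double *cold_s, double *warm_s)
    {
        char *src = malloc(n);
        char *dst = malloc(n);   /* assumed to be unmapped until first write */
        if (src == NULL || dst == NULL) {
            free(src);
            free(dst);
            return -1;
        }
        memset(src, 0x5A, n);    /* fault in the source so only dst is cold */

        *cold_s = timed_copy(dst, src, n);
        *warm_s = timed_copy(dst, src, n);

        int ok = (memcmp(dst, src, n) == 0);   /* verify the copy happened */
        free(src);
        free(dst);
        return ok ? 0 : -1;
    }
    ```

    Benchmarks that skip the warm-up pass end up measuring the kernel's page-fault handling as much as the copy itself.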

    memcpy/memmove are compiler intrinsics and will usually be inlined to rep movsd (or the appropriate SSE instructions if your compiler can target them). It may be impossible to improve on this codegen, since modern CPUs handle rep instructions like this very, very well.
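    To make the rep-string idea concrete, here is a minimal sketch of a copy built on a single rep movsb, written with GCC/Clang extended inline assembly. It uses the byte variant rather than rep movsd so that any length works without a tail loop, and it falls back to memcpy on other compilers or architectures. The function name copy_rep_movsb is hypothetical, and this is not a claim about what any particular compiler emits.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Copy n bytes from src to dst with one rep movsb on x86 (GCC/Clang),
       otherwise with memcpy. Buffers must not overlap. Returns dst. */
    static void *copy_rep_movsb(void *dst, const void *src, size_t n)
    {
        void *ret = dst;
    #if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
        /* rep movsb copies (E/R)CX bytes from [SI] to [DI]; the ABI
           guarantees the direction flag is clear, so it copies forward. */
        __asm__ volatile("rep movsb"
                         : "+D"(dst), "+S"(src), "+c"(n)
                         :
                         : "memory");
    #else
        memcpy(dst, src, n);   /* portable fallback */
    #endif
        return ret;
    }
    ```

    Timing this against the library memcpy on your own machine is the quickest way to check whether a hand-written copy has anything left to gain.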
