State of “memset” functionality in C++ with modern compilers

我寻月下人不归 2021-02-12 16:03

Context:

A while ago, I stumbled upon this 2001 DDJ article by Alexandrescu: http://www.ddj.com/cpp/184403799

It's about comparing various ways to initialize

12 Answers
  •  再見小時候
    2021-02-12 16:16

    memset and memcpy are mostly written with a baseline instruction set in mind, so specialized SSE routines can outperform them. Those routines, on the other hand, impose certain alignment constraints.
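    To make the alignment constraint concrete, here is a minimal sketch, assuming a 16-byte boundary (the width of one SSE register). The names `is_aligned` and `AlignedBuffer` are hypothetical and introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `p` sits on a multiple of `alignment` bytes.
// `alignment` must be a nonzero power of two.
bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical buffer type whose storage starts on a 16-byte boundary,
// which is what aligned SSE loads and stores expect.
struct AlignedBuffer {
    alignas(16) unsigned char bytes[64];
};
```

    With this, `is_aligned(buf.bytes, 16)` holds for any `AlignedBuffer buf`, while `buf.bytes + 1` fails the check and would need unaligned instructions or a scalar fix-up first.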

    To reduce it to a list:

    1. For data sets up to several hundred kilobytes, memcpy/memset perform faster than anything you could write yourself.
    2. For data sets in the megabyte range and above, use memcpy/memset to reach an aligned address, then use your own SSE-optimized routines or fall back to optimized routines from Intel etc.
    3. Enforce the alignment at startup and use your own SSE routines.
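    The second item could look roughly like the sketch below: memset covers the unaligned head and the tail, and aligned SSE2 stores cover the body. This is an illustration under assumptions, not the AMD or Intel code: the name `fill_bytes_sse2` is hypothetical, the 16-byte store width is assumed, and on targets without SSE2 the sketch simply falls back to memset. Whether it beats the library memset on a given machine has to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2_FILL 1
#endif

// Hypothetical helper: set `n` bytes starting at `dst` to `value`.
void fill_bytes_sse2(void* dst, unsigned char value, std::size_t n) {
    assert(dst != nullptr);
    unsigned char* p = static_cast<unsigned char*>(dst);
#ifdef HAVE_SSE2_FILL
    // Head: plain memset up to the next 16-byte boundary.
    std::size_t misalign = reinterpret_cast<std::uintptr_t>(p) % 16;
    std::size_t head = misalign ? 16 - misalign : 0;
    if (head > n) head = n;
    std::memset(p, value, head);
    p += head;
    n -= head;

    // Body: aligned 16-byte stores, valid because `p` is now aligned.
    const __m128i v = _mm_set1_epi8(static_cast<char>(value));
    while (n >= 16) {
        _mm_store_si128(reinterpret_cast<__m128i*>(p), v);
        p += 16;
        n -= 16;
    }
#endif
    // Tail (or the whole range when SSE2 is unavailable).
    std::memset(p, value, n);
}
```

    A call such as `fill_bytes_sse2(buf + 3, 0xAB, 70)` writes exactly 70 bytes starting at `buf + 3`, regardless of how `buf` happens to be aligned.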

    This list only comes into play where you actually need the performance. Data sets that are too small, or that are initialized only once, are not worth the hassle.

    Here is an implementation of memcpy from AMD; I can't find the article that described the concept behind the code.
