“Unexplainable” core dump

囚心锁ツ 2021-02-01 16:51

I've seen many core dumps in my life, but this one has me stumped.

Context:

  • multi-threaded Linux/x86_64 program running on a cluster of AMD Barcelona CPUs
2 answers
  •  后悔当初
    2021-02-01 17:40

    So, unlikely as it may seem, we appear to have hit an actual bona fide CPU bug.

    http://support.amd.com/us/Processor_TechDocs/41322_10h_Rev_Gd.pdf has erratum #721:

    721 Processor May Incorrectly Update Stack Pointer

    Description

    Under a highly specific and detailed set of internal timing conditions,
    the processor may incorrectly update the stack pointer after a long series
    of push and/or near-call instructions, or a long series of pop 
    and/or near-return instructions. The processor must be in 64-bit mode for
    this erratum to occur.
    

    Potential Effect on System

    The stack pointer value jumps by a value of approximately 1024, either in
    the positive or negative direction.
    This incorrect stack pointer causes unpredictable program or system behavior,
    usually observed as a program exception or crash (for example, a #GP or #UD).
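
When triaging a core dump against this erratum, one rough check is whether the stack pointer seen at the crash differs from the value the surrounding frames imply by about 1024 bytes in either direction. The sketch below illustrates that comparison only. The function name `matches_erratum_721` and the `tolerance` of 64 bytes are assumptions made for illustration; the erratum says only "approximately 1024" and does not define a bound.

```python
# Rough triage helper: does an unexpected stack-pointer offset look like
# the ~1024-byte jump described in erratum #721?
# The name and the tolerance value are hypothetical, not from the erratum.

ERRATUM_JUMP = 1024  # bytes, from the erratum text ("approximately 1024")


def matches_erratum_721(expected_rsp: int, observed_rsp: int,
                        tolerance: int = 64) -> bool:
    """Return True if observed_rsp is off from expected_rsp by roughly
    ERRATUM_JUMP bytes, in either the positive or negative direction."""
    delta = abs(observed_rsp - expected_rsp)
    return abs(delta - ERRATUM_JUMP) <= tolerance


if __name__ == "__main__":
    # expected_rsp would come from unwinding the frames by hand;
    # observed_rsp from the register dump in the core file.
    print(matches_erratum_721(0x7FFD1000, 0x7FFD1400))  # off by 1024
    print(matches_erratum_721(0x7FFD1000, 0x7FFD1008))  # off by 8
```

A match is only a hint: ordinary stack corruption can produce a similar offset, so the result is worth reading alongside the CPU model and revision of the machine that dumped core.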
    
