What ensures reads/writes of operands occur at the desired times with extended ASM?

一个人的身影 2020-12-19 23:56

According to GCC's Extended ASM and Assembler Template, to keep instructions consecutive, they must be in the same ASM block. I'm having trouble understanding what provide

2 answers
  • 2020-12-20 00:16

    In your question you present some code that pushes and pops ebx. The idea of saving ebx in case you compile with gcc using -fPIC (position-independent code) is correct: in that situation it is up to our function not to clobber ebx. Unfortunately, the constraints you defined use ebx explicitly. When you compile PIC code and specify =b as an output constraint, the compiler will generally reject it (error: inconsistent operand constraints in an 'asm'). It is unusual that you don't get that diagnostic.

    To get around this problem you can let the compiler choose a register for you. Instead of pushing and popping, we simply exchange %ebx with an unused register chosen by the compiler and restore it by exchanging it back afterwards. Since we don't want the compiler to assign that register to one of our inputs, we specify the early-clobber modifier, ending up with a constraint of =&r (instead of =b in the OP's code). More on modifiers can be found in the GCC documentation on constraint modifiers. Your code (for 32 bit) would look something like:

    unsigned int __FUNC = 1, __SUBFUNC = 0;
    unsigned int __EAX, __EBX, __ECX, __EDX;
    
    __asm__ __volatile__ (
           "xchgl\t%%ebx, %k1\n\t"   /* park ebx in the compiler-chosen register */
           "cpuid\n\t"
           "xchgl\t%%ebx, %k1\n\t"   /* restore ebx; operand 1 now holds cpuid's ebx */
    
      : "=a"(__EAX), "=&r"(__EBX), "=c"(__ECX), "=d"(__EDX)
      : "a"(__FUNC), "c"(__SUBFUNC));
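
The effect of the early-clobber modifier can be seen in isolation with a much smaller example. The sketch below is not from the question; `add_twice` is a hypothetical helper that writes its output before it has finished reading its inputs, which is exactly the case `=&r` exists for. Without the `&`, the compiler would be free to place the output in the same register as one of the inputs. A plain C fallback is included so the snippet also builds on non-x86 targets.

```c
/* Hypothetical helper (illustration only): returns a + 2*b. */
unsigned int add_twice(unsigned int a, unsigned int b)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int out;
    __asm__ (
        "movl %1, %0\n\t"   /* out = a: the output is written first ...      */
        "addl %2, %0\n\t"   /* ... while input b is still read afterwards    */
        "addl %2, %0"
        : "=&r"(out)        /* & = early clobber: never shares a register
                               with an input operand                         */
        : "r"(a), "r"(b)
        : "cc");            /* addl modifies the flags                       */
    return out;
#else
    return a + 2 * b;       /* portable fallback for other architectures     */
#endif
}
```

If the output were declared as plain `=r`, the first `movl` could overwrite `b` before the `addl` instructions read it.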
    

    If you intend to compile for x86_64 (64 bit) you'll need to save the entire contents of %rbx, so the code above will not quite work. You'd have to use something like:

    /* uint32_t and uint64_t come from <stdint.h> */
    uint32_t __FUNC = 1, __SUBFUNC = 0;
    uint32_t __EAX, __ECX, __EDX;
    uint64_t __BX; /* Big enough to hold a 64 bit value */
    
    __asm__ __volatile__ (
           "xchgq\t%%rbx, %q1\n\t"   /* park all 64 bits of rbx */
           "cpuid\n\t"
           "xchgq\t%%rbx, %q1\n\t"   /* restore rbx; operand 1 now holds cpuid's ebx */
    
      : "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
      : "a"(__FUNC), "c"(__SUBFUNC));
    

    You could code this up using conditional compilation to deal with both X86_64 and i386:

    /* uint32_t and uintptr_t come from <stdint.h> */
    uint32_t __FUNC = 1, __SUBFUNC = 0;
    uint32_t __EAX, __ECX, __EDX;
    uintptr_t __BX; /* Register-sized: 32 bits on i386, 64 bits on x86_64.
                       A uint64_t here would need a register pair on i386. */
    
    #if defined(__i386__)
        __asm__ __volatile__ (
               "xchgl\t%%ebx, %k1\n\t"
               "cpuid\n\t"
               "xchgl\t%%ebx, %k1\n\t"
    
          : "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
          : "a"(__FUNC), "c"(__SUBFUNC));
    
    #elif defined(__x86_64__)
        __asm__ __volatile__ (
               "xchgq\t%%rbx, %q1\n\t"
               "cpuid\n\t"
               "xchgq\t%%rbx, %q1\n\t"
    
          : "=a"(__EAX), "=&r"(__BX), "=c"(__ECX), "=d"(__EDX)
          : "a"(__FUNC), "c"(__SUBFUNC));
    #else
    #error "Unknown architecture."
    #endif
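
As a rough sketch of how the conditional version might be packaged for reuse, the function below wraps both variants. The name `cpuid_xchg`, its signature and the `regs[4]` layout are assumptions made for this illustration, not part of the original answer, and instead of `#error` it returns 0 on other architectures so the snippet builds everywhere.

```c
#include <stdint.h>

/* Hypothetical wrapper (not from the answer): fills regs[0..3] with
   eax, ebx, ecx, edx. Returns 1 on x86, 0 on any other architecture. */
int cpuid_xchg(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
#if defined(__i386__)
    uint32_t a, c, d;
    uint32_t bx;                       /* 32-bit scratch register */
    __asm__ __volatile__ (
        "xchgl\t%%ebx, %k1\n\t"        /* park ebx in the scratch register */
        "cpuid\n\t"
        "xchgl\t%%ebx, %k1"            /* restore ebx, keep cpuid's ebx */
        : "=a"(a), "=&r"(bx), "=c"(c), "=d"(d)
        : "a"(leaf), "c"(subleaf));
    regs[0] = a; regs[1] = bx; regs[2] = c; regs[3] = d;
    return 1;
#elif defined(__x86_64__)
    uint32_t a, c, d;
    uint64_t bx;                       /* must be able to hold all of rbx */
    __asm__ __volatile__ (
        "xchgq\t%%rbx, %q1\n\t"        /* park rbx in the scratch register */
        "cpuid\n\t"
        "xchgq\t%%rbx, %q1"            /* restore rbx, keep cpuid's ebx */
        : "=a"(a), "=&r"(bx), "=c"(c), "=d"(d)
        : "a"(leaf), "c"(subleaf));
    regs[0] = a; regs[1] = (uint32_t)bx; regs[2] = c; regs[3] = d;
    return 1;
#else
    (void)leaf; (void)subleaf;         /* unused on non-x86 targets */
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    return 0;
#endif
}
```

Leaf 0 reports the highest supported basic leaf in eax and part of the vendor string in ebx, so `cpuid_xchg(0, 0, regs)` makes a simple smoke test.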
    

    GCC has a __cpuid macro defined in cpuid.h. The macro is defined so that it saves the ebx and rbx registers only when required. The GCC 4.8.1 macro definition in cpuid.h gives an idea of how they handle cpuid.
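
For comparison, here is a minimal sketch that relies on cpuid.h instead of hand-written asm. It uses `__get_cpuid`, which that header provides alongside `__cpuid`; the helper name `cpu_vendor` and its 13-byte buffer convention are assumptions made for this example. The header is only included on x86 targets, and the function reports failure elsewhere.

```c
#include <string.h>
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang header: __cpuid, __cpuid_count, __get_cpuid */
#endif

/* Hypothetical helper: copies the 12-byte vendor string plus a NUL
   terminator into out. Returns 1 on success, 0 if cpuid is unavailable. */
int cpu_vendor(char out[13])
{
    out[0] = '\0';
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    /* Leaf 0 returns the vendor string in the order ebx, edx, ecx. */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
#else
    return 0;
#endif
}
```

Using the header leaves the question of saving ebx or rbx to the compiler vendor, which is usually preferable to maintaining the constraints by hand.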

    The astute reader may ask: what stops the compiler from choosing ebx or rbx as the scratch register for the exchange? The compiler knows about ebx and rbx in the context of PIC and will not allow either to be used as a scratch register. This is based on my personal observations over the years and on reviewing the assembler (.s) files generated from C code. I can't say for certain how older versions of gcc handled it, so it could be a problem there.

  • 2020-12-20 00:17

    I think you understand, but to be clear, the "consecutive" rule means that this:

    asm ("a");
    asm ("b");
    asm ("c");
    

    ... might get other instructions interposed, so if that's not desirable then it must be rewritten like this:

    asm ("a\n"
         "b\n"
         "c");
    

    ... and now it will be inserted as a whole.
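
To make the placeholder instructions concrete, here is a small sketch with two real x86 instructions that must stay adjacent because the second consumes the result of the first. The function name `inc_then_double` is made up for this illustration, and a plain C fallback is used on non-x86 targets.

```c
/* Hypothetical example: computes (x + 1) * 2 with two instructions
   that are emitted together because they share one asm statement. */
unsigned int inc_then_double(unsigned int x)
{
#if defined(__i386__) || defined(__x86_64__)
    __asm__ (
        "addl $1, %0\n\t"   /* x = x + 1 */
        "addl %0, %0"       /* x = x + x */
        : "+r"(x)           /* read and written by the asm */
        :
        : "cc");            /* addl modifies the flags */
#else
    x = (x + 1) * 2;        /* portable fallback for other architectures */
#endif
    return x;
}
```

Splitting the two `addl` lines into separate asm statements would still compile, but the compiler would then be allowed to schedule other code, or move `x` to a different register, between them.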


    As for the cpuid snippet, we have two problems:

    1. The cpuid instruction will overwrite ebx, and hence clobber the data that PIC code must keep there.

    2. We want to extract the value that cpuid places in ebx while never returning to compiled code with the "wrong" ebx value.

    One possible solution would be this:

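    unsigned int __FUNC = 1, __SUBFUNC = 0;
    unsigned int __EAX, __EBX, __ECX, __EDX;
    
    /* 32-bit only: push/pop of %ebx does not assemble in 64-bit mode. */
    
    /* First call: copy cpuid's ebx result out through ecx.
       eax and edx are overwritten too; eax is also an input, so it cannot
       be listed as a clobber and both are declared as outputs instead. */
    __asm__ __volatile__ (    
      "push %%ebx;"
      "cpuid;"
      "mov %%ebx, %%ecx;"
      "pop %%ebx"
      : "=c"(__EBX), "=a"(__EAX), "=d"(__EDX)
      : "a"(__FUNC), "c"(__SUBFUNC)
    );
    /* Second call: collect eax, ecx and edx. */
    __asm__ __volatile__ (    
      "push %%ebx;"
      "cpuid;"
      "pop %%ebx"
      : "=a"(__EAX), "=c"(__ECX), "=d"(__EDX)
      : "a"(__FUNC), "c"(__SUBFUNC)
    );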
    

    There's no need to mark ebx as clobbered as you're putting it back how you found it.

    (I don't do much Intel programming, so I may have some of the assembler-specific details off there, but this is how asm works.)
