making mistake in inline assembler in gcc [duplicate]

Submitted by 时光怂恿深爱的人放手 on 2019-12-19 08:17:23

Question


I have successfully written some inline assembler in gcc to rotate right one bit following some nice instructions: http://www.cs.dartmouth.edu/~sergey/cs108/2009/gcc-inline-asm.pdf

Here's an example:

static inline int ror(int v) {
    asm ("ror %0;" :"=r"(v) /* output */ :"0"(v) /* input */ );
    return v;
}

However, I want code to count clock cycles, and the only example I have seen is in the wrong (probably Microsoft) format. I don't know how to do this in gcc. Any help?

unsigned __int64 inline GetRDTSC() {
   __asm {
      ; Flush the pipeline
      XOR eax, eax
      CPUID
      ; Get RDTSC counter in edx:eax
      RDTSC
   }
}

I tried:

static inline unsigned long long getClocks() {
    asm("xor %%eax, %%eax" );
    asm(CPUID);
    asm(RDTSC : : %%edx %%eax); //Get RDTSC counter in edx:eax

but I don't know how to get the edx:eax pair to return as 64 bits cleanly, and don't know how to really flush the pipeline.

Also, the best source code I found was at: http://www.strchr.com/performance_measurements_with_rdtsc

and that mentions the Pentium, so if there are different ways of doing it on different Intel/AMD variants, please let me know. I would prefer one solution that works on all x86 platforms, even if it's a bit ugly, over a separate solution for each variant, but I wouldn't mind knowing about the variants.


Answer 1:


The following does what you want:

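/* static avoids an undefined reference when a C99 build does not inline the call */
static inline unsigned long long rdtsc(void) {
  unsigned int lo, hi;
  asm volatile (
     "cpuid \n"          /* serialize: wait for earlier instructions to finish */
     "rdtsc"             /* time-stamp counter -> edx:eax */
   : "=a"(lo), "=d"(hi) /* outputs */
   : "a"(0)             /* inputs */
   : "%ebx", "%ecx");     /* clobbers */
  return ((unsigned long long)lo) | (((unsigned long long)hi) << 32);
}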

It is important to put as little inline ASM as possible in your code, because the compiler cannot optimize anything inside an asm statement. That's why I've done the shift and OR-ing of the result in C code rather than coding that in ASM as well. Similarly, I use the "a" input of 0 to let the compiler decide when and how to zero out eax. It could be that some other code in your program already zeroed it out, and the compiler could save an instruction if it knows that.

Also, the "clobbers" above are very important. CPUID overwrites everything in eax, ebx, ecx, and edx. You need to tell the compiler that you're changing these registers so that it knows not to keep anything important there. You don't have to list eax and edx because you're using them as outputs. If you don't list the clobbers, the compiler may keep a live value in ebx or ecx across the asm statement; there's then a serious chance your program will crash, and you will find it extremely difficult to track down the issue.
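To make the shift-and-OR step concrete, here is a minimal sketch of just the C part, separated from the asm so it can be checked on its own. The name combine_tsc is hypothetical and not from the answer; the fixed-width types from <stdint.h> are my assumption in place of unsigned int and unsigned long long.

```c
#include <stdint.h>

/* Hypothetical helper: join the two 32-bit halves that RDTSC leaves
   in edx (high half) and eax (low half) into one 64-bit count. */
static inline uint64_t combine_tsc(uint32_t lo, uint32_t hi) {
    /* Widen hi before shifting: shifting a 32-bit value by 32 bits
       is undefined behavior in C. */
    return ((uint64_t)hi << 32) | lo;
}
```

For instance, a high half of 0x01234567 and a low half of 0x89ABCDEF combine to 0x0123456789ABCDEF. Keeping this step in C lets the compiler pick the cheapest instructions for it on the target.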




Answer 2:


This will store the result in value. Combining the results takes extra cycles, so the number of cycles between calls to this code will be a few less than the difference in results.

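unsigned int hi, lo;
unsigned long long value;
asm volatile (                 /* volatile: never reorder or drop the read */
    "cpuid\n\t"
    "rdtsc"
    : "=d" (hi), "=a" (lo)     /* outputs need the "=" modifier */
    : "a" (0)                  /* cpuid reads eax as its leaf number */
    : "%ebx", "%ecx"           /* cpuid also overwrites ebx and ecx */
);
value = (((unsigned long long)hi) << 32) | lo;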
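One way to account for those extra cycles is to measure the cost of the measurement itself and subtract it. The following is only a sketch, assuming an x86 target and GCC or Clang; read_tsc and tsc_overhead are hypothetical names, and taking the minimum over several trials is my choice, not something either answer prescribes.

```c
#include <stdint.h>

/* Serialized TSC read using the cpuid + rdtsc sequence
   (x86 only, GCC/Clang extended asm). */
static inline uint64_t read_tsc(void) {
    uint32_t lo, hi;
    __asm__ volatile (
        "cpuid\n\t"
        "rdtsc"
        : "=a"(lo), "=d"(hi)   /* outputs: eax = low half, edx = high half */
        : "a"(0)               /* input: cpuid leaf 0 */
        : "%ebx", "%ecx");     /* cpuid overwrites these as well */
    return ((uint64_t)hi << 32) | lo;
}

/* Hypothetical helper: smallest gap seen between two back-to-back
   reads, taken as an estimate of the measurement overhead. */
static inline uint64_t tsc_overhead(int trials) {
    uint64_t best = UINT64_MAX;
    for (int i = 0; i < trials; i++) {
        uint64_t start = read_tsc();
        uint64_t stop = read_tsc();
        if (stop - start < best)
            best = stop - start;   /* keep the least-disturbed trial */
    }
    return best;
}
```

To time a piece of code, call read_tsc() before and after it and subtract both the start value and the overhead estimate from the end value. The minimum is used rather than the average because interrupts and cache misses can only make a trial slower, never faster.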


Source: https://stackoverflow.com/questions/4473452/making-mistake-in-inline-assembler-in-gcc
