I'm trying to compile a simple C program (Win7 32bit, Mingw32 Shell and GCC 5.3.0). The C code is like this:
#include <
As @MichaelPetch says, you're approaching this the wrong way. If you're trying to set up an operand for lgdt, do that in C and only use inline asm for the lgdt instruction itself. See the inline-assembly tag wiki, and the x86 tag wiki.
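As a rough sketch of that split: the operand is built in plain C and handed to a one-instruction asm statement. The names gdt_ptr, make_gdt_ptr and load_gdt are made up for illustration, and the layout assumes the 6-byte pseudo-descriptor that lgdt reads in 32-bit mode (16-bit limit, then 32-bit base).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: the memory operand lgdt reads in 32-bit mode. */
struct gdt_ptr {
    uint16_t limit;   /* size of the table in bytes, minus 1 */
    uint32_t base;    /* linear address of the table */
} __attribute__((packed));

static_assert(sizeof(struct gdt_ptr) == 6, "lgdt operand must be 6 bytes");

/* All the computation happens in C, where the compiler can see it. */
static inline struct gdt_ptr make_gdt_ptr(const void *table, uint16_t size_bytes)
{
    struct gdt_ptr p;
    p.limit = (uint16_t)(size_bytes - 1);
    p.base  = (uint32_t)(uintptr_t)table;
    return p;
}

#if defined(__i386__) || defined(__x86_64__)
/* Only the privileged instruction itself is inline asm. The "m" input
 * tells the compiler the asm reads *p; the "memory" clobber keeps earlier
 * stores to the table from being reordered after the lgdt.
 * Calling this outside ring 0 will fault. */
static inline void load_gdt(const struct gdt_ptr *p)
{
    __asm__ volatile ("lgdt %0" : : "m" (*p) : "memory");
}
#endif
```

In a kernel you would call load_gdt(&p) once after filling in the table; everything else stays ordinary C that the optimizer understands.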
Related: a C struct/union for messing with Intel descriptor-tables: How to do computations with addresses at compile/linking time?. (The question wanted to generate the table as static data, hence asking about breaking addresses into low / high halves at compile time).
Also: Implementing GDT with basic kernel for some C + asm GDT manipulation. Or maybe not, since the answer there just says the code in the question is problematic, without a detailed fix.
Linker error setting loading GDT register with LGDT instruction using Inline assembly has an answer from Michael Petch, with some links to more guides/tutorials.
It's still useful to answer the specific question, even though the right fix is https://gcc.gnu.org/wiki/DontUseInlineAsm.
This compiles fine with optimization enabled.
With -O0, gcc doesn't notice or take advantage of the fact that the operands are all small constant offsets from each other, so they could all use the same base register with an offset addressing mode. Instead, it wants to put a pointer to each input memory operand into a separate register, and runs out of registers. With -O1 or higher, CSE does what you'd expect.
You can see this in a reduced example with the last 3 memory operands commented out, and the asm string changed to include an asm comment with all the operands. From gcc5.3 -O0 -m32 on the Godbolt compiler explorer:
#define _set_tssldt_desc(n,addr,type) \
__asm__ ("movw $104,%1\n\t" \
"#operands: %0, %1, %2, %3\n" \
...
void simple_wrapper(char *n, char *addr) {
set_tss_desc(n, addr);
}
pushl %ebp
movl %esp, %ebp
pushl %ebx
movl 8(%ebp), %eax
leal 2(%eax), %ecx
movl 8(%ebp), %eax
leal 4(%eax), %ebx
movl 12(%ebp), %eax
movl 8(%ebp), %edx
#APP # your inline-asm code
movw $104,(%edx)
#operands: %eax, (%edx), (%ecx), (%ebx)
#NO_APP
nop # no idea why the compiler inserted a literal NOP here (not .p2align)
popl %ebx
popl %ebp
ret
But with optimization enabled, you get
simple_wrapper:
movl 4(%esp), %edx
movl 8(%esp), %eax
#APP
movw $104,(%edx)
#operands: %eax, (%edx), 2(%edx), 4(%edx)
#NO_APP
ret
Notice how the later operands use base+disp addressing modes.
Your constraints are totally backwards. You're writing to memory that you've told the compiler is an input operand. It will assume that the memory is not modified by the asm statement, so if you load from it in C, it might move that load ahead of the asm. And other possible breakage.
If you had used "=m" output operands, this code would be correct (but still inefficient compared to letting the compiler do it for you).
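A minimal sketch of what output operands look like, assuming a hypothetical helper store_two_words that writes two 16-bit fields (the constants 104 and 123 are just placeholders). Each operand is typed uint16_t because movw stores 2 bytes, so the compiler knows exactly which bytes change:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store 104 at n and 123 at n+2. */
static inline void store_two_words(void *n)
{
    uint16_t *w = n;
    assert(n != NULL);
#if defined(__i386__) || defined(__x86_64__)
    /* "=m" marks each 2-byte location as written by the asm, so the
     * compiler won't cache old values or hoist later loads above it. */
    __asm__ ("movw $104, %0\n\t"
             "movw $123, %1"
             : "=m" (w[0]), "=m" (w[1]));
#else
    /* Fallback for non-x86 targets, and what you should prefer anyway. */
    w[0] = 104;
    w[1] = 123;
#endif
}
```

The plain C branch does the same job without any asm, which is the point of the DontUseInlineAsm advice.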
You could have written your asm to do the offsetting itself from a single memory operand, but then you'd need to do something to tell the compiler about all the memory written by the asm statement; e.g. "=m" (*(struct {char a; char x[];} *) n) to tell it that you write the entire object starting at n. (See this answer).
AT&T syntax x86 memory operands are always offsetable, so you can use 2 + %[nbase] instead of a separate operand, if you do:
asm("movw $104, %[nbase]\n\t"
"movw $123, 2 + %[nbase]\n\t"
: [nbase] "=m" (*(struct {char a; char x[];} *) n)
: [addr] "ri" (addr)
);
gas will warn about 2 + (%ebx) or whatever it ends up being, but that's ok.
Using a separate memory output operand for each place you write will avoid any problems about telling the compiler which memory you write. But you got it wrong: you've told the compiler that your code doesn't use n+1 when in fact you're using movw $104 to store 2 bytes starting at n. So that should be a uint16_t memory operand. If this sounds complicated, https://gcc.gnu.org/wiki/DontUseInlineAsm. Like Michael said, do this part in C with a struct, and only use inline asm for a single instruction that needs it.
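For illustration, here is roughly what the struct approach could look like. The type seg_desc and the function set_tssldt_desc_c are hypothetical names, and the field layout is the standard 8-byte x86 segment descriptor rather than anything taken from the question's elided macro. The limit of 104 matches the movw $104 in the question.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct mirroring an 8-byte x86 segment descriptor. */
struct seg_desc {
    uint16_t limit_low;         /* limit bits 15:0  */
    uint16_t base_low;          /* base bits 15:0   */
    uint8_t  base_mid;          /* base bits 23:16  */
    uint8_t  access;            /* type / DPL / present byte */
    uint8_t  limit_high_flags;  /* limit bits 19:16 plus flags */
    uint8_t  base_high;         /* base bits 31:24  */
};

static_assert(sizeof(struct seg_desc) == 8, "descriptor must be 8 bytes");

/* Plain C: the compiler picks the stores and can merge or reorder them. */
static inline void set_tssldt_desc_c(struct seg_desc *d, uint32_t addr,
                                     uint8_t type)
{
    d->limit_low        = 104;
    d->base_low         = (uint16_t)(addr & 0xFFFF);
    d->base_mid         = (uint8_t)((addr >> 16) & 0xFF);
    d->access           = type;
    d->limit_high_flags = 0;
    d->base_high        = (uint8_t)(addr >> 24);
}
```

No constraints to get wrong, no register pressure at -O0, and it works the same at any optimization level.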
It would obviously be more efficient to use fewer wider store instructions. IDK what you're planning to do next, but any adjacent constants should be coalesced into a 32-bit store, like movl $(104 + (0x1234 << 16)), %[n0] or something. Again, https://gcc.gnu.org/wiki/DontUseInlineAsm.
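The same coalescing can be written in C, as a sketch. The helper name store_limit_and_base_low is made up, and the byte order assumes a little-endian target such as x86. With optimization enabled, a fixed-size memcpy like this typically compiles to a single 32-bit store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write the 16-bit value 104 at n and base_low at n+2
 * as one 32-bit value (assumes little-endian byte order). */
static inline void store_limit_and_base_low(void *n, uint16_t base_low)
{
    uint32_t v = 104u | ((uint32_t)base_low << 16);
    assert(n != NULL);
    memcpy(n, &v, sizeof v);   /* alignment- and aliasing-safe store */
}
```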