Assembly:
[BITS 16]
global _start
_start:
mov ax, 0x07C0
mov ds, ax
mov si, hw
call print_string
jmp $
print_string:
mov ah, 0x0E
.ch
I found a solution here: Looking for 16-bit x86 compiler
Something I learned the hard way: -Ttext 0x0 is critical; otherwise the .text segment is pushed outside of the 16-bit addressing range (don't ask me why).
First of all, I recommend you consider using an i686 ELF cross compiler to avoid some gotchas that can later bite you as you develop your kernel.
Nothing prevents you from using ELF as the object file type with NASM, but it is often simpler to use the -f bin option, which generates a fully resolved flat binary file that needs no fixups. It can be used as a boot sector without any linking step. The downside is that all the code has to be in the same file. External assembler files can be included with the %include directive, similar to C's #include directive.
For this to work, you have to set the origin point in the assembler file so that NASM knows the base offset (origin point) it needs for generating absolute addresses (for labels, etc.). You would modify your assembly code and add this at the top:
[ORG 0x0000]
This only applies when using the -f bin output option; the directive will throw an error for other output types like -f elf. In this case we use 0x0000 because the segment your code assumes is 0x07c0, which is moved into DS. 0x07c0:0x0000 maps to physical address (0x07c0<<4)+0x0000 = 0x07c00, which is where our bootloader will be loaded into memory.
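The segment:offset arithmetic above can be checked with a few lines of Python. This is only a sketch of the real-mode translation rule (segment shifted left by 4 bits, plus offset); the function name physical_address is made up for illustration.

```python
def physical_address(segment: int, offset: int) -> int:
    """Real-mode translation: shift the segment left 4 bits, then add the offset."""
    return (segment << 4) + offset


# The segment:offset pair assumed by the bootloader code in the question.
print(hex(physical_address(0x07C0, 0x0000)))  # 0x7c00

# A different pair that names the same physical byte.
print(hex(physical_address(0x0000, 0x7C00)))  # 0x7c00
```

Both pairs point at the same byte, which is why the origin you choose has to agree with the value you load into DS.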
If you don't specify [org 0x0000], then org = 0x0000 is the default when using the -f bin output option, so it isn't actually necessary to specify it. Stating it explicitly just makes the intent much clearer to a reader.
In order to assemble this into a binary file you could do:
nasm file.asm -fbin -o file.bin
This would output a flat binary file called file.bin assembled from file.asm. No linking step is needed.
In your example you are using ELF. There may be a couple of reasons for doing it this way. Your generated binary file may be the combination of multiple object (.o) files, or you may wish to generate debug symbols to be used with a debugger like GDB. Whatever your reason, this can be done using these commands:
nasm file.asm -felf -o file.o
ld -melf_i386 -Ttext 0x0 -o file.bin file.o --oformat binary
-Ttext 0x0 would be the origin point that matches your code. 0x0000 in this case is the same value you would have used with the ORG directive had you used NASM with the -f bin output option. If you had written your code to assume an offset of 0x7c00 with code like:
xor ax, ax ; AX = 0
mov ds, ax ; DS = 0
Then the TEXT segment would have to be specified with:
ld -melf_i386 -Ttext 0x7c00 -o file.bin file.o --oformat binary
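To see why the origin has to match the segment setup, here is a small Python sketch, assuming the boot sector is loaded at physical 0x7c00. The helper name label_physical and the file offset 0x20 for hw are hypothetical values chosen only for illustration.

```python
LOAD_ADDRESS = 0x7C00    # physical address the boot sector is assumed to be loaded at
HW_FILE_OFFSET = 0x20    # hypothetical position of the hw label inside the binary


def label_physical(ds: int, origin: int, file_offset: int) -> int:
    """Physical address the CPU reads when the code uses a label through DS."""
    # The tools encode origin + file_offset as the label's 16-bit offset.
    offset = (origin + file_offset) & 0xFFFF
    # The CPU then applies the real-mode rule: (DS << 4) + offset.
    return (ds << 4) + offset


expected = LOAD_ADDRESS + HW_FILE_OFFSET

# DS = 0x07c0 paired with origin 0x0000: correct.
assert label_physical(0x07C0, 0x0000, HW_FILE_OFFSET) == expected

# DS = 0 paired with origin 0x7c00: also correct.
assert label_physical(0x0000, 0x7C00, HW_FILE_OFFSET) == expected

# Mixing the two conventions points somewhere else entirely.
print(hex(label_physical(0x07C0, 0x7C00, HW_FILE_OFFSET)))  # 0xf820
```

Either convention works as long as the origin (ORG or -Ttext) and the value loaded into DS belong together.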
Your question may be: why do we need to explicitly set a value for the base of the TEXT segment? The reason is that the default for LD depends on the OS you are targeting (usually the platform you are currently running on). If you are on Linux, by default LD will attempt to create output for Linux. On Linux the default for the start of the TEXT segment is usually 0x08048000 when specifying -m elf_i386. This is of course a 32-bit value.
Anywhere an absolute address was needed, the linker would attempt to add 0x08048000 (or potentially some other large address) to it. So an instruction like this:
mov si, hw
would attempt to move the address of hw into the 16-bit register SI. The linker would have attempted to resolve this to 0x08048000 + offset of hw when creating the flat binary output file. Because you have a 32-bit value being used in an instruction that only takes a 16-bit value, you will get a warning/error. LD will truncate the 32-bit value to 16 bits; unfortunately, that would likely produce an incorrect 16-bit address.