Why am I getting zero from mov ax, bx+si+1?

情歌与酒 2021-01-21 22:34
    mov     ax,10
    mov     bx,4
    mov     si,ax
    mov     ax,bx+si+1
    LEA     ax,[bx+si+1]

When I add bx, si and 1 together and move to ax, t…

2 answers
  •  野的像风
    2021-01-21 23:08

    Jose gave you a detailed answer about what is happening in your code. You should certainly try to replicate his results in your emu8086, so that you can debug your code on your own.

    I just want to add a "high-level" answer.

    You have probably missed what assembly is. It's not a regular programming language, but more like a set of aliases for the actual machine code of the target CPU.

    That means that any instruction you write is almost always mapped 1:1 to an actual CPU instruction (some assemblers support so-called "pseudo-instructions", which are compiled into a few real instructions by well-defined rules; this is very uncommon on x86, and I'm not aware of any such pseudo-instruction in emu8086).

    And the instruction mov register, some_math_expression doesn't exist. The source value must be a single number: either an immediate value encoded in the instruction, or another register, or a value fetched from computer memory (your case).

    The confusing part is that x86 does support very complex memory addressing modes. In 32b mode, for example, [eax+eax*4+imm32] is valid (imm32 = a 32b constant encoded directly in the instruction; that's where the constant part of the address is stored, and in 16b mode it's only a 16b number). In 16b mode the legal combinations of values and registers for addressing are very limited, but they may still look like a mathematical expression at first sight.
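
    To make "very limited" concrete, here is a minimal sketch in Python (not emu8086 code) of the eight register forms a 16b address can use. The helper name is_legal_16bit_address and the table name are made up for this illustration; the table itself is the standard 8086 set of base/index combinations.

```python
# Hypothetical helper, not part of emu8086: checks whether a set of registers
# is one of the eight base/index forms that a 16-bit address can encode.
LEGAL_16BIT_FORMS = [
    {"bx", "si"}, {"bx", "di"}, {"bp", "si"}, {"bp", "di"},
    {"si"}, {"di"}, {"bp"}, {"bx"},
]


def is_legal_16bit_address(registers):
    """Return True if the registers form an encodable 16-bit addressing mode.

    A constant displacement may be added to any of these forms, and a bare
    displacement with no register at all is also allowed.
    """
    regs = [name.lower() for name in registers]
    if len(regs) != len(set(regs)):
        return False  # the same register twice, e.g. [si+si], is not encodable
    return len(regs) == 0 or set(regs) in LEGAL_16BIT_FORMS


print(is_legal_16bit_address(["bx", "si"]))  # True: the form used in the question
print(is_legal_16bit_address(["ax", "bx"]))  # False: ax cannot be used for addressing
print(is_legal_16bit_address(["si", "di"]))  # False: two index registers
```

    So [bx+si+1] happens to be one of the legal forms, which is why the assembler can turn it into a real instruction at all.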

    But they are not. The combination is hard-coded in the instruction encoding, so each of the legal combinations has its own binary number. The CPU doesn't read it as "some bx register plus some di register plus some number"; it reads the second byte of the instruction, the ModR/M byte (10000001 in binary for the example below), and the transistors on the chip are designed so that this value works as the [bx+di+displacement] addressing mode.

    For example, for inc byte ptr [bx+di+imm16] (mov is simpler, it just loads the value):

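    The CPU reads the instruction opcode 0xFE (the byte-sized inc/dec group; 0xFF is the word-sized one), then it reads the addressing byte 0x81: its top two bits 10 mean "a 16b displacement follows", its middle three bits 000 select inc, and its low three bits 001 mean [bx+di]. So it knows the operand is "[bx+di+disp]"; it then reads the two following bytes to get the 16b displacement value, and finally adds those values together to get a 16b offset into memory.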
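
    The decoding step can be sketched in Python as follows. This is only a rough model under my own assumptions: the function name effective_address, the register values and the displacement 1234h are invented for the illustration, and the bytes FE 81 34 12 are what inc byte ptr [bx+di+1234h] should encode to.

```python
# r/m field of the addressing byte -> registers that are summed (16-bit modes)
RM_REGISTERS = {
    0b000: ("bx", "si"), 0b001: ("bx", "di"),
    0b010: ("bp", "si"), 0b011: ("bp", "di"),
    0b100: ("si",), 0b101: ("di",),
    0b110: ("bp",), 0b111: ("bx",),
}


def effective_address(code, regs):
    """Decode the addressing byte in code[1] and return the 16-bit offset."""
    modrm = code[1]
    mod, rm = modrm >> 6, modrm & 0b111
    if mod == 0b11:
        raise ValueError("mod=11 names a register operand, not memory")
    if mod == 0b00 and rm == 0b110:
        # special case: no register at all, just a 16-bit displacement
        return code[2] | (code[3] << 8)
    base = sum(regs[name] for name in RM_REGISTERS[rm])
    if mod == 0b01:
        # one displacement byte, sign-extended to 16 bits
        disp = code[2] - 256 if code[2] >= 128 else code[2]
    elif mod == 0b10:
        # two displacement bytes, little-endian
        disp = code[2] | (code[3] << 8)
    else:
        disp = 0
    return (base + disp) & 0xFFFF  # offsets wrap around at 16 bits


# inc byte ptr [bx+di+1234h]: opcode FE, addressing byte 81, displacement 34 12
code = bytes([0xFE, 0x81, 0x34, 0x12])
regs = {"bx": 0x0100, "di": 0x0020, "si": 0, "bp": 0}  # made-up register values
print(hex(effective_address(code, regs)))  # 0x1354
```

    The same function handles the question's operand: the bytes 8B 40 01 (mov ax, [bx+si+1]) with bx=4 and si=10 give offset 15.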

    Then it takes the segment value (ds by default for this instruction, unless a segment prefix opcode was used to override it for the single following instruction), shifts it left by 4 bits (multiplies it by 16), and adds it to the offset to produce the 20b physical address into the memory chip (see any documentation of real mode addressing to learn how segment+offset are combined and why I'm talking about 20b and not 32b).
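
    As a small worked example of the real mode rule (segment times 16, plus offset, kept to 20 bits), here is a sketch in Python. The function name physical_address and the sample segment values are arbitrary choices of mine.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    # The segment is shifted left by 4 bits (multiplied by 16) and added to
    # the offset; only 20 address bits exist on the 8086, so the sum wraps.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(physical_address(0x0700, 0x000F)))  # 0x700f
print(hex(physical_address(0xFFFF, 0x0010)))  # 0x0, wrapped past the 1 MB limit
```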

    This 20b value is then set on the pins of the CPU connected to the memory chip (by a "bus" = lots of "wires" or paths on the PCB; on the old 8086: 20 for addressing, 16 for data and some more for read/write/status handling), and the memory chip is instructed to load the value from the address which is read on the bus (I'm talking about a simple "8086" computer, ignoring all the heavy caching mechanisms present in a modern PC, where the CPU doesn't talk to memory chips directly any more).

    After some time the memory chip manages to switch its internal state in such a way that the pins on the "data" side of the bus are set to the value of the memory cell at that address, so the CPU can now "read" those pins and store the value in a temporary unnamed register. Then it runs the increment over it, sets the data bus pins to the new value (the address part of the bus very likely holds the same address the whole time), and instructs the memory chip to write the value.

    And that's how the value in memory at address "bx+di+displacement" turns (for example) from 6 to 7.
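
    The whole read, increment, write sequence can be imitated in Python with a plain byte array standing in for the memory chip. This is only a sketch: the name inc_byte and all the register values are invented, and the real bus timing is of course not modelled.

```python
memory = bytearray(1 << 20)  # 1 MB of zeroed "memory chip" (an assumption)


def inc_byte(memory, segment, offset):
    """Model inc byte ptr [offset]: read the byte, add one, write it back."""
    address = ((segment << 4) + offset) & 0xFFFFF  # 20-bit physical address
    value = memory[address]            # read cycle: byte arrives on the data bus
    value = (value + 1) & 0xFF         # increment in a temporary, wrap at 8 bits
    memory[address] = value            # write cycle: new byte goes back
    return address


# made-up values for ds, bx, di and the displacement
ds, bx, di, disp = 0x0700, 0x0100, 0x0020, 0x1234
offset = (bx + di + disp) & 0xFFFF
memory[((ds << 4) + offset) & 0xFFFFF] = 6  # the cell initially holds 6

address = inc_byte(memory, ds, offset)
print(hex(address), memory[address])  # 0x8354 7
```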

    ... where was I...

    Oh, so it's not a free mathematical expression, but one of the legal addressing modes, hard-wired in the CPU.

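    emu8086 unfortunately doesn't yell at you for the missing []; instead it silently produces the only possible instruction which looks similar to what you wrote. So your mov ax, bx+si+1 is assembled as mov ax, [bx+si+1], which loads the word stored in memory at offset bx+si+1 = 15 (presumably zero in your case) instead of the number 15 itself.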

    BTW, this should make it easy for you to understand the LEA instruction and why its syntax uses [], so that it looks like it will access memory.

    lea is the mov instruction with its guts removed: after calculating the address of the memory cell to "mov", it stops. It doesn't contact the memory chip at all, but fills the destination register with the 16b memory offset value which was calculated in the first phase of processing.
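
    To tie this back to the code in the question, here is a rough Python model of the two instructions with the question's values (bx=4, si=10). The function names, the ds value and the assumption that memory at that address is all zeros are mine; your emulator's memory contents may differ.

```python
def mov_from_memory(memory, ds, offset):
    """Model mov ax, [offset]: fetch a little-endian 16-bit word from ds:offset."""
    low = memory[((ds << 4) + offset) & 0xFFFFF]
    high = memory[((ds << 4) + ((offset + 1) & 0xFFFF)) & 0xFFFFF]
    return low | (high << 8)


def lea(offset):
    """Model lea ax, [offset]: keep the computed offset, never touch memory."""
    return offset & 0xFFFF


memory = bytearray(1 << 20)  # assumption: nothing but zeros is stored there
ds = 0x0700                  # arbitrary segment value for the illustration
bx, si = 4, 10               # register values from the question
offset = (bx + si + 1) & 0xFFFF

print(mov_from_memory(memory, ds, offset))  # 0: the word stored at offset 15
print(lea(offset))                          # 15: the offset itself

# Register-only alternative: mov ax, bx / add ax, si / inc ax
ax = bx
ax = (ax + si) & 0xFFFF
ax = (ax + 1) & 0xFFFF
print(ax)                                   # 15
```

    If the sum is what you want in ax, either the lea form or the three-instruction sequence in the last comment gives it without reading memory.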
