Why am I getting zero from mov ax, bx+si+1?

情歌与酒 2021-01-21 22:34
    mov     ax,10
    mov     bx,4
    mov     si,ax
    mov     ax,bx+si+1
    LEA     ax,[bx+si+1]

When I add bx, si and 1 together and move the result to ax, t

2 Answers
  • 2021-01-21 23:08

    Jose gave you a detailed answer about what is happening in your code. You should certainly try to replicate his results in your emu8086, so that you learn to debug your code on your own.

    I just want to add a "high-level" answer.

    You have probably missed what assembly is. It's not a regular programming language, but more like a set of aliases for the actual machine code of the target CPU.

    That means that any instruction you write is almost always mapped 1:1 to an actual CPU instruction (some assemblers support so-called "pseudo-instructions", which are assembled into a few real instructions by well-defined rules; this is very uncommon on x86, and I'm not aware of any such pseudo-instruction in emu8086).

    And the instruction mov register, some_math_expression doesn't exist. The source must be a single value: either an immediate value encoded in the instruction, or another register, or a value fetched from memory (your case).

    The confusing part is that x86 does support quite complex memory addressing modes. In 32-bit mode, for example, [eax+eax*4+imm32] is valid (imm32 = a 32-bit immediate value encoded directly in the instruction, which is where the constant part of the address is stored; in 16-bit mode it is only a 16-bit number). In 16-bit mode the legal combinations of registers and constants for addressing are very limited, but at first sight they may still look like a mathematical expression.

    But they are not. Each legal combination is hard-coded in the instruction encoding and has its own binary number. The CPU doesn't read it as "some bx register plus some di register plus some number"; it reads the second byte of the instruction (the ModR/M byte, in binary 10000001 for [bx+di+16-bit displacement] when the register field is 000), and the transistors on the chip are designed so that this value works as the [bx+di+displacement] addressing mode.
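To make the encoding concrete, here is a minimal Python sketch (Python rather than assembly, so it can be run anywhere) that splits such a second byte into its three bit fields and looks up the 16-bit addressing form. The name `decode_modrm` and the table layout are made up for illustration; the table contents follow the standard 8086 encoding.

```python
# r/m field -> base/index combination in 16-bit addressing (standard 8086 encoding).
RM_FORMS = ["bx+si", "bx+di", "bp+si", "bp+di", "si", "di", "bp", "bx"]

def decode_modrm(byte):
    """Split a ModR/M byte into (mod, reg, rm) and describe the operand."""
    mod = (byte >> 6) & 0b11   # top 2 bits: displacement size / register mode
    reg = (byte >> 3) & 0b111  # middle 3 bits: register or opcode extension
    rm = byte & 0b111          # low 3 bits: which base/index combination
    if mod == 0b11:
        operand = "register operand (no memory access)"
    elif mod == 0b00 and rm == 0b110:
        operand = "[disp16]"   # special case: direct address, no base register
    else:
        disp = {0b00: "", 0b01: "+disp8", 0b10: "+disp16"}[mod]
        operand = "[" + RM_FORMS[rm] + disp + "]"
    return mod, reg, rm, operand

print(decode_modrm(0b10000001))  # (2, 0, 1, '[bx+di+disp16]')
print(decode_modrm(0x40))        # (1, 0, 0, '[bx+si+disp8]')
```

There are only eight base/index combinations in the table, which is why something like [ax+bx] can never be encoded in 16-bit mode.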

    For example, for inc byte ptr [bx+di+disp16] (mov is simpler, it just loads the value):

    The CPU reads the instruction opcode 0xFE (the byte-sized inc/dec group; 0xFF is the word-sized one), then it reads the addressing-mode byte 0x81, whose middle three bits (000) select inc and whose remaining bits say "[bx+di+disp16]". Then it reads the two bytes that follow to get the 16-bit displacement, and finally adds those values together to get the 16-bit offset into memory.

    Then it takes the segment value (ds by default for this instruction, unless a segment prefix was used to override it for the single following instruction), shifts it left by 4 bits and adds the offset to produce the 20-bit physical address into the memory chip (see any documentation of real-mode addressing to learn how segment and offset are combined and why I'm talking about 20 bits and not 32).
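The segment and offset combination can be sketched in a few lines of Python. This assumes plain 8086 real mode, where the segment is shifted left by 4 bits before the offset is added and only 20 address lines exist; the function name and the sample values are illustrative only.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset the way the 8086 does."""
    # The segment is shifted left by 4 bits (multiplied by 16), then the
    # offset is added; with only 20 address lines the sum wraps around.
    return ((segment << 4) + offset) & 0xFFFFF

print(hex(physical_address(0x1234, 0x0010)))  # 0x12350
print(hex(physical_address(0xFFFF, 0xFFFF)))  # 0xffef
```

The second call shows the wraparound: the raw sum needs 21 bits, and the top bit is simply lost on an 8086.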

    This 20-bit value is then set on the CPU pins connected to the memory chip (via the "bus" = a lot of "wires" or paths on the PCB; on the old 8086: 20 for addressing, 16 for data and some more for read/write/status handling), and the memory chip is instructed to load the value from the address present on the bus (I'm talking about a simple "8086" computer, ignoring all the heavy caching mechanisms present in a modern PC, where the CPU no longer talks to the memory chips directly).

    After some time the memory chip manages to switch its internal state so that the pins on the "data" side of the bus reflect the value of the memory cell at that address, so the CPU can now "read" those pins and store the value in a temporary unnamed register. Then it runs the increment on it, sets the data bus pins to the new value (the address part of the bus very likely holds the same address the whole time), and instructs the memory chip to write the value.

    And that's how the value in memory at address "bx+di+displacement" turns (for example) from 6 to 7.
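The whole walk-through can be condensed into a small Python simulation. It is only a sketch: it handles a single encoding (0xFE 0x81 followed by a 16-bit displacement, which in the standard encoding is inc byte ptr [bx+di+disp16]), and the register contents, the ds value and the function name `run_inc_byte` are hypothetical.

```python
def run_inc_byte(code, regs, memory, ds):
    """Execute one 'inc byte ptr [bx+di+disp16]' given its 4 machine-code bytes."""
    opcode, modrm, disp_lo, disp_hi = code
    # 0xFE is the byte-sized inc/dec group; 0x81 = mod 10, reg 000 (inc), r/m 001.
    if opcode != 0xFE or modrm != 0x81:
        raise ValueError("this sketch handles only one encoding")
    disp = disp_lo | (disp_hi << 8)                     # displacement is little-endian
    offset = (regs["bx"] + regs["di"] + disp) & 0xFFFF  # 16-bit effective address
    physical = ((ds << 4) + offset) & 0xFFFFF           # 20-bit physical address
    value = memory[physical]                            # read cycle
    memory[physical] = (value + 1) & 0xFF               # write cycle, 8-bit wraparound
    return physical

memory = bytearray(1 << 20)           # 1 MiB of address space, all zeros
regs = {"bx": 0x0100, "di": 0x0020}   # hypothetical register contents
ds = 0x2000                           # hypothetical data segment
memory[0x20000 + 0x0100 + 0x0020 + 0x0004] = 6
addr = run_inc_byte([0xFE, 0x81, 0x04, 0x00], regs, memory, ds)
print(hex(addr), memory[addr])        # 0x20124 7
```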

    ... where was I...

    Oh, right: so it's not a free-form mathematical expression, but one of the legal addressing modes, hard-wired in the CPU.

    emu8086 unfortunately doesn't yell at you for the missing []; instead it silently produces the only possible instruction that looks similar to what you wrote.

    BTW, this should make it easy for you to understand the LEA instruction and why its syntax uses [], so that it looks as if it will access memory.

    lea is the mov instruction with its guts removed: after calculating the address of the memory cell to "mov", it stops. It doesn't contact the memory chip at all, but fills the destination register with the 16-bit memory offset that was calculated in the first phase of processing.
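A rough Python model of that difference, using the register values from the question. The helper names and the contents of `data_segment` are assumptions made for illustration; the point is only that both operations share the first phase and lea skips the second.

```python
def effective_address(regs, disp):
    """Phase 1 shared by mov and lea: compute the 16-bit offset bx+si+disp."""
    return (regs["bx"] + regs["si"] + disp) & 0xFFFF

def mov_word(regs, disp, data_segment):
    """mov ax, [bx+si+disp]: compute the offset, then read 2 bytes from memory."""
    offset = effective_address(regs, disp)
    return data_segment[offset] | (data_segment[offset + 1] << 8)

def lea(regs, disp):
    """lea ax, [bx+si+disp]: compute the offset and stop; memory is never touched."""
    return effective_address(regs, disp)

regs = {"bx": 4, "si": 10}                   # values from the question
data_segment = bytearray(32)                 # hypothetical data segment
data_segment[15:17] = bytes([0x34, 0x12])    # word 0x1234 stored at offset 15
print(hex(mov_word(regs, 1, data_segment)))  # 0x1234
print(lea(regs, 1))                          # 15
```

Note that `lea` does not even take the memory as a parameter, which mirrors the fact that the real instruction never starts a bus cycle for its operand.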

  • 2021-01-21 23:17

    Your question is: "Why am I getting zero from mov ax, bx+si+1?". It's hard to give you an accurate answer because you didn't say which assembler you are using, and your code snippet doesn't include the data segment, so we can't see your data. What we can do is test your code with some numbers in the data segment and see the results:

    .model small
    .stack 100h
    .data    
    xy db 0A0h,0A1h,0A2h,0A3h,0A4h,0A5h,0A6h,0A7h,0A8h,0A9h,0AAh,0ABh,0ACh,0ADh,0AEh,0AFh,0B0h
    .code
      mov  ax, @data
      mov  ds, ax
    
      mov  ax, 10
      mov  bx, 4
      mov  si, ax
      mov  ax, bx+si+1       ;◄■■ #1 (EXPLANATION BELOW ▼)
      LEA  ax, [bx+si+1]     ;◄■■ #2 (EXPLANATION BELOW ▼)
    

    This is what is going on:

    #1 Because of the presence of a base register (bx) and an index register (si), the sum is interpreted as a memory address, so the code reads the data at memory location 15. The ax register is 2 bytes wide, so ax gets the 2 bytes starting at memory location 15; in our data segment those 2 bytes are 0AFh and 0B0h. al is the lower byte of ax, so the first byte (0AFh) is stored there, the higher byte ah gets the second byte (0B0h), and this is how ax becomes 0B0AFh.

    #2 We said that the presence of the base register bx and the index register si is interpreted as a memory address, so [bx+si+1] points to memory location 15 (which holds 0AFh). The lea instruction stands for "load effective address"; its purpose is to get the address (offset) of a memory operand instead of its contents. Your line of code gets the effective address of memory location 15, which is 15.
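The byte order in #1 can be checked with a short Python sketch that rebuilds the 17 bytes of xy from the listing above and fetches a little-endian word. `read_word` is a made-up helper name, and the zero-filled buffer in the last line is a hypothetical stand-in for a data segment that was never initialized.

```python
# The 17 bytes declared as "xy" in the answer's data segment: 0A0h .. 0B0h.
data_segment = bytes(range(0xA0, 0xB1))

def read_word(segment_bytes, offset):
    """Fetch a 16-bit little-endian word: low byte first, high byte second."""
    al = segment_bytes[offset]       # byte at the offset goes to the low half
    ah = segment_bytes[offset + 1]   # next byte goes to the high half
    return (ah << 8) | al

offset = 4 + 10 + 1                          # bx + si + 1 = 15
print(hex(read_word(data_segment, offset)))  # 0xb0af  (what mov loads)
print(hex(offset))                           # 0xf     (what lea loads)
print(read_word(bytes(32), offset))          # 0       (zero-filled memory)
```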

    So much theory requires a demonstration, and here it is:

    Next is a screenshot from EMU8086: the BLUE arrow points to the original line of code, the GREEN arrow points to the line of code as it is interpreted (as a memory access), and the RED arrow shows the effect on register ax (B0AFh).

    Now the screenshot for the next instruction: the BLUE arrow points to the original line of code, the GREEN arrow points to the line of code as it is interpreted (notice it's identical to the previous one), and the RED arrow shows the effect on register ax (0Fh).

    Finally, let's test the code in Visual Studio 2013: the next screenshot proves that mov ax, bx+si+1 is invalid there, and the other line gives the same result as EMU8086 (ax=0Fh):

    So, "why are you getting zero from mov ax, bx+si+1?" Because you probably have a zero at memory location 15 in your data segment. You may have thought that bx+si+1 was going to give you a plain number, 15, but now you know that base and index registers are interpreted as memory addressing, so you don't get the number 15 but the data stored at memory location 15.
