Question
Why does the code below push the code segment (PUSH CS) and then pop it into the data segment (POP DS)?
I have marked these lines explicitly as line1 and line2. Please also explain how MOVSW works here.
IF HIGHMEMORY
PUSH DS
MOV BX, DS
ADD BX, 10H
MOV ES, BX
PUSH CS ;line1
POP DS ;line2
XOR SI, SI
MOV DI, SI
MOV CX, OFFSET SYSSIZE + 1
SHR CX, 1
REP MOVSW ;line3
POP DS
PUSH ES
MOV AX, OFFSET SECONDRELOCATION
PUSH AX
AAA PROC FAR
RET
AAA ENDP
SECONDRELOCATION:
more code here..............
Answer 1:
Temporarily setting DS = CS and then restoring it looks like an inefficient alternative to using a CS override prefix on rep movsw.
A segment override can change the source for movsw from DS:SI to CS:SI. (The destination of ES:DI can't be overridden.)
(update: on original 8086/8088, there was a hardware "bug" / anomaly: on resuming from an interrupt that happened during a REP-string instruction, IP would point to the last prefix of an instruction, not the first. So depending on the encoding, cs rep movsw would either decode as rep movsw or cs movsw. See @MichaelPetch's comments, and https://www.pcjs.org/pubs/pc/reference/intel/8086/ for more 8086 errata and anomalies that have been fixed in later x86 CPUs.)
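This code is doing a memcpy(dst, code_segment, sizeof(code_segment)), where the dst segment:offset is (DS + 10h):0. The instructions before rep movsw set ES = DS + 10h (going through BX, because a segment register can't be used as an operand of add) and set SI = DI = 0. Since one segment unit is 16 bytes, the destination starts 256 bytes above DS:0.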
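The segment arithmetic can be checked with a small model. This is only a Python sketch of real-mode addressing (physical address = segment * 16 + offset, truncated to 20 bits on an 8086), not part of the DOS code; the value chosen for DS is a made-up example.

```python
def linear(segment: int, offset: int) -> int:
    """Physical address of segment:offset on an 8086 (20-bit address bus)."""
    return ((segment << 4) + offset) & 0xFFFFF

ds = 0x1234                    # hypothetical value of DS on entry
es = (ds + 0x10) & 0xFFFF      # MOV BX, DS / ADD BX, 10H / MOV ES, BX

# Distance in bytes between ES:0 and DS:0
distance = linear(es, 0) - linear(ds, 0)
print(hex(distance))
```

Adding 10h to a segment register moves the window up by 10h paragraphs of 16 bytes each, which is why the copy lands 256 bytes above the original DS base.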
Then the code jumps to the new location, using a far ret after pushing the destination segment (ES) and an offset within it. (push offset SECONDRELOCATION would work, but only on 186+. This DOS code needs to maintain backwards compatibility with 8086, unfortunately.)
Apparently this assembler doesn't support syntax like ret far or retf, so they have to assemble a far ret instruction by declaring a proc far around the ret instruction. AAA is a very weird name for that proc, because aaa is also a valid x86 instruction mnemonic (ASCII Adjust after Addition).
So execution continues at the SECONDRELOCATION: label in the copy of the code we just made.
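To make the push/push/far-ret trick concrete, here is a minimal Python model of the stack. It is an illustration only: a far return pops the offset (IP) first and the segment (CS) second, so the segment has to be pushed first. The segment and offset values are hypothetical.

```python
def far_ret(stack):
    """Model of a far RET: pop the new IP first, then the new CS."""
    ip = stack.pop()
    cs = stack.pop()
    return cs, ip

stack = []
es = 0x1244                  # hypothetical destination segment (the copy)
second_relocation = 0x0042   # hypothetical offset of SECONDRELOCATION

stack.append(es)                 # PUSH ES
stack.append(second_relocation)  # MOV AX, OFFSET SECONDRELOCATION / PUSH AX
cs, ip = far_ret(stack)          # RET assembled inside a PROC FAR

print(hex(cs), hex(ip))
```

After the far return, CS:IP is ES:SECONDRELOCATION, i.e. the same label but inside the freshly copied code.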
(size+1) / 2 rounds up to a whole number of words, unless size+1 wraps to zero in 16 bits, in which case it copies zero bytes instead of 64k. (Unlike loop, rep checks the count before executing once.)
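The rounding and the wraparound case are easy to verify with a few numbers. This Python sketch models the value left in CX by MOV CX, size+1 followed by SHR CX, 1, assuming the size+1 immediate is truncated to 16 bits; the function name is made up for illustration.

```python
def words_to_copy(size: int) -> int:
    """CX after MOV CX, size+1 / SHR CX, 1, assuming 16-bit wraparound."""
    cx = (size + 1) & 0xFFFF   # only 16 bits fit in CX
    return cx >> 1             # SHR CX, 1

for size in (4, 5, 0xFFFE, 0xFFFF):
    print(hex(size), words_to_copy(size))
```

An even size maps to exactly size/2 words, an odd size is rounded up by one word, and a size of 0FFFFh wraps to a count of zero, so rep movsw would copy nothing at all.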
Doing the shr at runtime is also dumb, and could have been done at assemble time using something like mov cx, (offset endcode - startcode + 1) / 2. (You probably can't divide an offset result by 2, but you can find the distance between two labels in the same section at assemble time.)
Anyway, probably the point is to relocate the code into HIGHMEM, leaving low memory free for use by programs that can't use HIMEM.
Answer 2:
The sequence push cs, pop ds is simply a way to set your data segment to the same value as your code segment.
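It's similar to using push ax, pop bx instead of mov bx, ax, other than the fact that it goes through stack memory (none of push, pop, or mov modifies the flags). For segment registers the push/pop pair is also the short way to do it: x86 has no mov that copies one segment register directly into another, so mov ds, cs cannot be encoded, and the alternative is to go through a general-purpose register.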
One reason you would do this dates back to the old days of x86 segmented architecture (as opposed to the more modern selectors), which is rarely used nowadays. Compilers for the x86 offered various memory models such as tiny, small, compact, medium, large and huge.
These were basically variations on the sizes and quantities of code and data segments that you could use and, from memory, tiny meant that you had one segment that contained both code and data.
Hence cs and ds should be set to the same value so that all instructions operated on that segment by default.
In your particular case, you're saving ds, setting it to the same value as cs, then restoring it. See below for a more likely explanation of why.
As to the workings of movsw, it simply copies a single word value from the memory at ds:si to address es:di, updating the pointers afterward (increment or decrement, depending on the setting of the direction flag).
The rep prefix does that in a loop, decrementing cx until it reaches zero.
Hence it's just a bulk memory copy.
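As a rough illustration of that description, the following Python sketch models rep movsw over a 1 MiB bytearray standing in for real-mode memory. It is a simplified model rather than an exact emulation (no interrupts, no segment override), and the helper names and segment values are assumptions made up for the example.

```python
def addr(segment, offset):
    """Physical address of segment:offset, wrapping as the 8086 does."""
    return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF

def rep_movsw(mem, ds, si, es, di, cx, df=0):
    """Model of REP MOVSW; returns the final (SI, DI, CX)."""
    step = -2 if df else 2              # DF=1 walks downward
    while cx != 0:                      # count is tested before each word
        lo, hi = mem[addr(ds, si)], mem[addr(ds, si + 1)]
        mem[addr(es, di)], mem[addr(es, di + 1)] = lo, hi
        si = (si + step) & 0xFFFF
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx

mem = bytearray(1 << 20)                # 1 MiB of real-mode memory
src_seg, dst_seg = 0x1000, 0x2000       # hypothetical segment values
mem[addr(src_seg, 0):addr(src_seg, 0) + 5] = b"HELLO"

# Copy 3 words (6 bytes) from src_seg:0 to dst_seg:0 with DF clear
si, di, cx = rep_movsw(mem, src_seg, 0, dst_seg, 0, cx=3)
print(bytes(mem[addr(dst_seg, 0):addr(dst_seg, 0) + 6]), si, di, cx)
```

In the code from the question, the ds argument would be the value of CS (because of push cs / pop ds), and both si and di start at zero.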
Now, since the source of rep movsw is specified in terms of the ds segment, the real reason why you're seeing the push/pop to set ds temporarily becomes clear: it's because the source of the data lies in the code segment.
Source: https://stackoverflow.com/questions/53604760/whats-the-purpose-of-push-cs-pop-ds-before-a-rep-movsw