What's the purpose of PUSH CS / POP DS before a REP MOVSW?

Submitted by 萝らか妹 on 2019-12-11 03:09:38

Question


Why does the code below push the code segment (PUSH CS) and then pop it into the data segment (POP DS)?

I have marked these lines explicitly as line1 and line2. Please also explain how MOVSW works here (line3).

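IF  HIGHMEMORY
PUSH DS                         ; save the caller's DS
MOV BX, DS
ADD BX, 10H                     ; BX = DS + 10H paragraphs (256 bytes higher)
MOV ES, BX                      ; ES = destination segment for the copy
PUSH CS            ;line1: there is no MOV DS, CS,
POP DS             ;line2: so CS is copied into DS via the stack
XOR SI, SI                      ; source offset = 0
MOV DI, SI                      ; destination offset = 0
MOV CX, OFFSET SYSSIZE  +  1
SHR CX, 1                       ; byte count rounded up to a word count
REP MOVSW          ;line3: copy CX words from DS:SI to ES:DI
POP DS                          ; restore the caller's DS
PUSH ES                         ; far return address: segment
MOV AX, OFFSET SECONDRELOCATION
PUSH AX                         ; far return address: offset
AAA PROC FAR
RET                             ; assembled as a far RET because of PROC FAR
AAA ENDP
SECONDRELOCATION: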
more code here.............. 

Answer 1:


Temporarily setting DS = CS and then restoring it looks like an inefficient alternative to using a CS override prefix on rep movsw.

A segment override can change the source for movsw from DS:SI to CS:SI. (The destination, ES:DI, can't be overridden.)

(update: on original 8086/8088, there was a hardware "bug" / anomaly: on resuming from an interrupt that happened during a REP-string instruction, IP would point to the last prefix of an instruction, not the first. So depending on the encoding, cs rep movsw would either decode as rep movsw or cs movsw. See @MichaelPetch's comments, and https://www.pcjs.org/pubs/pc/reference/intel/8086/ for more 8086 errata and anomalies that have been fixed in later x86 CPUs.)


This code is doing a memcpy(dst, code_segment, sizeof(code_segment)), where the dst segment:offset is (DS + 10h):0, i.e. 256 bytes above the start of the original data segment. The instructions before rep movsw set up ES = DS + 10h (via BX) and set SI = DI = 0.
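
To make the segment arithmetic concrete, here is a small Python model of real-mode address formation. It is only a sketch and not part of the original code; the DS value 0x1000 is an arbitrary assumption for illustration. The point is that adding 10H to a segment register moves its base up by 256 bytes.

```python
def linear(segment, offset):
    """Real-mode address: segment * 16 + offset, wrapped to 20 bits as on an 8086."""
    return ((segment << 4) + offset) & 0xFFFFF


def destination_segment(ds):
    """Mirror MOV BX, DS / ADD BX, 10H / MOV ES, BX with 16-bit wraparound."""
    return (ds + 0x10) & 0xFFFF


ds = 0x1000                      # hypothetical value of DS on entry
es = destination_segment(ds)     # segment that receives the copy
gap = linear(es, 0) - linear(ds, 0)
print(hex(es), gap)              # prints: 0x1010 256
```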

Then the code jumps to the new location, using a far ret after pushing the destination segment (ES) and an offset within it. (push offset SECONDRELOCATION would work, but only on 186+. This DOS code needs to maintain backwards compat with 8086, unfortunately.)

Apparently this assembler doesn't support syntax like ret far or retf, so they have to assemble a far ret instruction by declaring a proc far around the ret instruction. AAA is a very weird name for that proc, because aaa is also a valid x86 instruction mnemonic (ASCII Adjust after Addition).

So execution continues at the SECONDRELOCATION: label in the copy of the code we just made.
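
The push / push / far ret trick can be modelled with a plain list as the stack. This is a hedged sketch rather than an emulator: the segment value 0x1010 and the label offset 0x0123 are made-up numbers, and push and far_ret are hypothetical helper names. It shows only the order in which a far return pops its operands.

```python
def push(stack, value):
    """PUSH: store a 16-bit value on top of the stack."""
    stack.append(value & 0xFFFF)


def far_ret(stack):
    """Far RET: pop the offset (IP) first, then the segment (CS)."""
    ip = stack.pop()
    cs = stack.pop()
    return cs, ip


ES = 0x1010                 # hypothetical segment of the freshly made copy
SECONDRELOCATION = 0x0123   # hypothetical offset of the label

stack = []
push(stack, ES)                  # PUSH ES
push(stack, SECONDRELOCATION)    # MOV AX, OFFSET SECONDRELOCATION / PUSH AX
cs, ip = far_ret(stack)          # RET inside a PROC FAR
print(hex(cs), hex(ip))          # prints: 0x1010 0x123
```

Because the segment is pushed first and the offset second, the far return loads CS:IP with ES:SECONDRELOCATION, which is the label inside the copy rather than in the original.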


(size+1) / 2 rounds up to a whole number of words, unless size + 1 wraps to zero in 16 bits (size = 0FFFFh), in which case it copies zero bytes instead of 64k. (Unlike loop, rep checks the count before executing once.)
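
A quick way to check the rounding and the wraparound case is to redo the arithmetic on 16-bit values. The function name word_count is hypothetical, and the sketch assumes the + 1 is truncated to 16 bits the way a value loaded into CX would be.

```python
def word_count(size):
    """Model MOV CX, OFFSET SYSSIZE + 1 / SHR CX, 1 on 16-bit values."""
    cx = (size + 1) & 0xFFFF    # assumption: the +1 wraps to 16 bits
    return cx >> 1              # logical shift right by one = divide by 2


for size in (4, 5, 0xFFFE, 0xFFFF):
    print(hex(size), word_count(size))
```

A size of 4 bytes gives 2 words, 5 bytes gives 3 words, 0FFFEh gives 7FFFh words, and 0FFFFh gives 0 words, so in that last case nothing is copied.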

Doing the shr at runtime is also dumb, and could have been done at assemble time using something like mov cx, (offset endcode - startcode + 1) / 2. (You probably can't divide an offset result by 2, but you can find the distance between two labels in the same section at assemble time.)

Anyway, probably the point is to relocate the code into HIGHMEM, leaving low memory free for use by programs that can't use HIMEM.




Answer 2:


The sequence push cs, pop ds is simply a way to set your data segment to the same value as your code segment.

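It's similar to using push ax, pop bx instead of mov bx, ax, except that it goes through the stack in memory. Neither version modifies the flags. The difference for segment registers is that x86 has no instruction that moves one segment register directly into another (there is no mov ds, cs), so the choice is between push cs / pop ds and going through a general-purpose register, such as mov ax, cs / mov ds, ax.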

One reason you would do this dates back to the old days of the x86 real-mode segmented architecture (as opposed to the more modern protected-mode selectors), which is rarely used nowadays. Compilers for the x86 offered various memory models: tiny, small, compact, medium, large and huge.

These were basically variations on the sizes and quantities of code and data segments that you could use and, from memory, tiny meant that you had one segment that contained both code and data.

Hence cs and ds should be set to the same value so that all instructions operated on that segment by default.

In your particular case, you're saving ds, setting it to the same value as cs then restoring it. See below for a more likely explanation of why.


As to the workings of movsw, it simply copies a single word value from the memory at ds:si to address es:di, updating the pointers afterward (increment or decrement, depending on the setting of the direction flag).

The rep prefix does that in a loop, decrementing cx after each copy until it reaches zero. If cx is already zero, nothing is copied.

Hence it's just a bulk memory copy.
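
The behaviour described above can be sketched as a short Python model. It is an illustration under stated assumptions, not an emulator: the function names, the 1 MiB bytearray standing in for memory, and the segment values 0x2000 and 0x3000 are all made up for the example.

```python
def linear(segment, offset):
    """Real-mode address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


def rep_movsw(mem, ds, si, es, di, cx, df=0):
    """Copy cx words from ds:si to es:di; return the final (si, di, cx)."""
    step = -2 if df else 2               # direction flag picks up or down
    while cx != 0:                       # count is tested before each copy
        for i in range(2):               # one word = two bytes
            src = linear(ds, (si + i) & 0xFFFF)
            dst = linear(es, (di + i) & 0xFFFF)
            mem[dst] = mem[src]
        si = (si + step) & 0xFFFF
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


mem = bytearray(1 << 20)                 # 1 MiB real-mode address space
ds, es = 0x2000, 0x3000                  # hypothetical, non-overlapping segments
mem[linear(ds, 0):linear(ds, 0) + 6] = b"ABCDEF"
si, di, cx = rep_movsw(mem, ds, 0, es, 0, 3)
print(bytes(mem[linear(es, 0):linear(es, 0) + 6]), si, di, cx)  # prints: b'ABCDEF' 6 6 0
```

Passing the value of CS as the ds argument corresponds to what PUSH CS / POP DS achieves in the question: the source of the copy becomes the code segment.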


Now, since the source of rep movsw is specified in terms of the ds segment, the real reason for the push/pop that sets ds temporarily becomes clear: the data to be copied lies in the code segment.



Source: https://stackoverflow.com/questions/53604760/whats-the-purpose-of-push-cs-pop-ds-before-a-rep-movsw
