Question
Input is to be taken from a-z or A-Z, and the input is ended by an asterisk (*).
The output should be the first and last capital letters of the input characters. We should also show each input character as it is taken. N.B. We take the input character by character, not as a string.
Test case 1: input: aAbCcP*
output: AP
Test case 2: input: ZabCBc*
output: ZB
I have written this code below, which satisfies Test Case 1, but not 2:
.MODEL
.STACK 100H
.DATA
STR DB 'Enter letters:$'
.CODE
MAIN PROC
        MOV AX, @DATA
        MOV DS, AX
        LEA DX, STR
        MOV AH, 9
        INT 21H
cycle:
        MOV AH, 1
        INT 21H
        CMP AL, '*'
        JZ output
        CMP AL, 'Z'
        JA save
head:
        CMP BL, 1
        JZ save
        MOV BL, 1
        MOV BH, AL
clear:
        XOR AL, AL
save:
        MOV CH, AL
        LOOP cycle
output:
        MOV AH, 2
        MOV DL, BH
        INT 21H
        MOV AH, 2
        MOV DL, CH
        INT 21H
MAIN ENDP
END MAIN
Answer 1:
First ask yourself these questions:
What are capitals?
If we don't consider accented characters, then capitals are the characters with ASCII codes ranging from 65 to 90.
Can I trust the user to only input characters from a-z or A-Z?
No, you can't. You don't have control over what the user does at the keyboard, and that's why your program should take a defensive approach and test for capitals with something better than a single cmp al, 'Z'.
What will be the result if the input didn't contain a single capital?
You could choose to print two spaces, or a descriptive message, or, like I did, display nothing at all.
What will be the result if the input contains only one capital?
You could choose to print that one capital or, like I did, display it twice, because if you think about it, that single capital is at the same time the first occurrence of a capital and also the last occurrence of a capital.
What input/output functions will I use?
For single character input you have a choice between DOS functions 01h, 06h, 07h, 08h, 0Ch, and 3Fh.
For single character output you have a choice between DOS functions 02h, 06h, and 40h.
If you're new to assembly, then stick with the simpler ones and use functions 01h and 02h. Do consult the API reference before using any DOS function. And of course check whether emu8086 supports the function at all!
You need to decide about all of the above in order to tackle the task. What is important, is that for every choice you make, you can defend your choice.
Below is my version of this task. For simplicity I'm using the tiny program model. See the ORG 256 directive on top? This program model has the major benefit of having all the segment registers pointing equally to your program (CS = DS = ES = SS).
The program runs 2 loops. The first loop runs until a capital is received. (It goes without saying that it stops earlier if the input contains an asterisk.) Because that capital is at the same time the first occurrence of a capital and also the last occurrence of a capital, I save it twice, both in DL and DH.
The second loop runs until an asterisk is received. Each time that a new capital comes along, it replaces what is written in DH. When this loop finally ends, both DL and DH are displayed on screen, in this order of course.
The program exits with the preferred DOS function 4Ch to terminate a program.
I've written some essential comments, refrained from adding redundant ones, and used descriptive names for the labels in the program. Do note the nice tabular layout. It's crucial for readability.
        ORG     256
Loop1:  mov     ah, 01h     ; DOS.GetKeyboardCharacter
        int     21h         ; -> AL
        cmp     al, "*"     ; Found end of input marker ?
        je      Done
        cmp     al, "A"
        jb      Loop1
        cmp     al, "Z"
        ja      Loop1
        mov     dl, al      ; For now it's the first
        mov     dh, al      ; AND the last capital
Loop2:  mov     ah, 01h     ; DOS.GetKeyboardCharacter
        int     21h         ; -> AL
        cmp     al, "*"     ; Found end of input marker ?
        je      Show
        cmp     al, "A"
        jb      Loop2
        cmp     al, "Z"
        ja      Loop2
        mov     dh, al      ; This is the latest capital
        jmp     Loop2
Show:   mov     ah, 02h     ; DOS.DisplayCharacter
        int     21h         ; -> (AL)
        mov     dl, dh
        mov     ah, 02h     ; DOS.DisplayCharacter
        int     21h         ; -> (AL)
Done:   mov     ax, 4C00h   ; DOS.TerminateWithReturnCode
        int     21h
Example:
aZeRTy*
aZeRTy*ZT
It would be very disappointing if you took it the easy way and just copy/pasted my code. I've tried to explain it in great detail and hope that you learn a lot from it.
My solution is certainly not the only good solution for this task. You could e.g. first input all of the characters and store them in memory somewhere, after which you process these characters from memory similar to how I did it.
Please try to write a working version that does it in this alternative way. You can only get smarter! Happy programming.
Answer 2:
Your code is broken because you always fall through to save: MOV CH, AL every iteration, so it can only work if the last capital is also the very last character of the whole input.
Single-step it with a debugger for a simple input like ABc* to see how it goes wrong.
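For comparison, here is a minimal sketch of how the loop from the question could be repaired while keeping its flag-based structure. It rests on several assumptions that are not in the question: the SMALL memory model, the string renamed to PROMPT, BL cleared before the loop so it can serve as a "capital seen" flag, and an exit through DOS function 4Ch. Like the question, it treats every character with a code up to 'Z' as a capital.

```asm
.MODEL SMALL                    ; assumption: the question shows no model name
.STACK 100H
.DATA
PROMPT DB 'Enter letters:$'     ; renamed from STR for this sketch
.CODE
MAIN PROC
        MOV AX, @DATA
        MOV DS, AX
        LEA DX, PROMPT
        MOV AH, 9
        INT 21H
        XOR BX, BX              ; BL = 0: no capital seen yet
cycle:
        MOV AH, 1
        INT 21H                 ; read one character with echo -> AL
        CMP AL, '*'
        JE  output
        CMP AL, 'Z'
        JA  cycle               ; lower case: ignore it, touch nothing
        CMP BL, 1
        JE  save                ; a first capital is already stored
        MOV BL, 1
        MOV BH, AL              ; remember the first capital
save:
        MOV CH, AL              ; remember the latest capital
        JMP cycle               ; plain jump, no CX-based LOOP
output:
        CMP BL, 1
        JNE finish              ; no capital at all: print nothing
        MOV AH, 2
        MOV DL, BH
        INT 21H                 ; print the first capital
        MOV AH, 2
        MOV DL, CH
        INT 21H                 ; print the last capital
finish:
        MOV AX, 4C00H
        INT 21H                 ; terminate the program
MAIN ENDP
END MAIN
```

With the input ZabCBc*, CH is only overwritten when a capital arrives, so this should display ZB after the echoed input.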
Also, you use loop, which is like dec cx/jnz. That makes no sense because there's no counter-based termination condition, and could potentially corrupt CH if CL was zero. You don't even initialize CX first! The loop instruction is not the only way to loop; it's just a code-size peephole optimization you can use when it's convenient to use CX as a loop counter. Otherwise don't use it.
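As a contrast, a minimal sketch of a case where the loop instruction does fit: a fixed iteration count that lives in CX. The count of 5, the printed character, and the label Again are made up for illustration.

```asm
        ORG     100h
        mov     cx, 5           ; loop needs CX initialized with the iteration count
        mov     dl, '*'         ; character to print (arbitrary choice)
Again:  mov     ah, 02h         ; DOS.DisplayCharacter
        int     21h             ; prints DL
        loop    Again           ; CX = CX - 1, jump back while CX != 0
        mov     ax, 4C00h       ; DOS.TerminateWithReturnCode
        int     21h
```

Here CX is used for nothing else, so the decrement cannot clobber other data, unlike in the question, where CH holds the last capital.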
This is a simplified version of Sep's implementation, taking advantage of the fact that the input is guaranteed to be alphabetic, so we really can check for upper case as easily as c <= 'Z' (after ruling out the '*' terminator). We don't have to worry about inputs like 12ABcd7_ or spaces or newlines, which also have lower ASCII codes than the upper-case alphabetic range. Your cmp al,'Z' / ja check was correct, it's just the code you were branching to that didn't have sane logic.
Even if you did want to strictly check c >= 'A' && c <= 'Z', that range check can be done with one branch using sub al,'A' ; cmp al,'Z'-'A' ; ja non_upper instead of a pair of cmp/jcc branches. (That modifies the original, but if you save it in SI or something you could later restore it with lea ax, [si+'A'].)
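A minimal sketch of that single-branch range check in context. It reads characters until '*' and prints each capital a second time, purely to make the result of the check visible. The labels and the use of BL to keep an unmodified copy are choices made for this sketch.

```asm
        ORG     100h
Next:   mov     ah, 01h         ; DOS.GetKeyboardCharacter
        int     21h             ; -> AL
        cmp     al, '*'
        je      Exit
        mov     bl, al          ; keep an unmodified copy of the character
        sub     al, 'A'         ; 'A'..'Z' becomes 0..25; codes below 'A' wrap around to large unsigned values
        cmp     al, 'Z'-'A'
        ja      Next            ; unsigned above 25: not a capital
        mov     dl, bl
        mov     ah, 02h         ; DOS.DisplayCharacter
        int     21h             ; print the capital once more
        jmp     Next
Exit:   mov     ax, 4C00h       ; DOS.TerminateWithReturnCode
        int     21h
```

The trick works because ja is an unsigned comparison: after the subtraction, everything outside 'A'..'Z' lands above 25, whether it was originally below 'A' or above 'Z'.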
You can also put a conditional branch at the bottom of the loop for both loops, instead of a jmp at the bottom and an if() break inside. Sep's code already did that for the first loop.
I agree with Sep that having 2 loops is easier than checking a flag every time you find a capital (to see if it's the first capital or not).
        ORG     100h            ; DOS .com is loaded with IP=100h, with CS=DS=ES=SS
                                ; we don't actually do any absolute addressing so no real effect.
        mov     ah, 01h         ; DOS.GetKeyboardCharacter
                                ; AH=01 / int 21h doesn't modify AH so we only need this once
find_first_cap:
        int     21h             ; stdin -> AL
        cmp     al, '*'         ; Found end of input marker ?
        je      Done            ; if (c=='*') return; without printing anything, we haven't found a capital yet
        cmp     al, 'Z'
        ja      find_first_cap
        ; fall through: AL <= 'Z' and we can assume it's a capital letter, not a digit or something.
        mov     dl, al          ; For now it's the first
        ;mov    dh, al          ; AND the last capital
        ;mov    ah, 01h         ; DOS.GetKeyboardCharacter   AH still = 01
        ;jmp    loop2_entry     ; we can let the first iteration set DH
Loop2:                          ; do {
        cmp     al, 'Z'         ; assume all c <= 'Z' is a capital alphabetic character
        ja      loop2_entry
        mov     dh, al          ; This is the latest capital
loop2_entry:
        int     21h             ; stdin -> AL
        cmp     al, '*'
        jne     Loop2           ; }while(c != '*');
Show:   mov     ah, 02h         ; DOS.DisplayCharacter
        int     21h             ; DL -> stdout
        mov     dl, dh
        ; mov   ah, 02h         ; DOS.DisplayCharacter
        int     21h             ; DL -> stdout
Done:   mov     ax, 4C00h       ; DOS.TerminateWithReturnCode
        int     21h
At this point it's arguably not simpler, but is more optimized especially for code-size. That tends to happen when I write anything because that's the fun part. :P
Having a taken branch inside the loop for the non-capital case is arguably worse for performance. (In modern code for a P6-compatible CPU you'd probably use cmovbe esi, eax instead of a conditional branch, because a conditional move is exactly what you want.)
Omitting the mov ah, XX before an int 21h because it's still set doesn't make your program more human-readable, but it is safe if you're careful to check the docs for each call to make sure they don't return anything in AH.
Source: https://stackoverflow.com/questions/56819605/finding-first-and-last-capital-letter-in-user-input