x86 NASM assembly: converting lowercase to uppercase and uppercase to lowercase characters

As I am pretty new to assembly, I have a few questions about how I should convert a letter to uppercase if the user enters a lowercase letter, or vice versa.

5 Answers

无人共我

2020-12-20 06:02

Jeff Duntemann wrote a book called Assembly Language Step-by-Step: Programming with Linux, which covers this topic very well on pages 275-277.

There he shows that the instruction sub byte [ebp+ecx], 20h changes a lowercase letter to uppercase. Note that this version works on a 1024-byte buffer, which is a faster and better way to do it than the earlier example on pages 268-269, which handles only one byte (8 bits) at a time.
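For context, here is a minimal sketch of the kind of loop such an instruction would sit in. It is not the book's listing: the register roles (ebp holds the buffer address, ecx the byte count) and the label names are assumptions made for this example.

```nasm
; Fragment, 32-bit NASM. Assumes on entry:
;   ebp = address of the buffer, ecx = number of bytes in it (ecx > 0)
        dec     ebp                     ; so [ebp+ecx] walks buf[ecx-1] down to buf[0]
scan:
        cmp     byte [ebp+ecx], 'a'
        jb      next                    ; below 'a': not a lowercase letter
        cmp     byte [ebp+ecx], 'z'
        ja      next                    ; above 'z': not a lowercase letter
        sub     byte [ebp+ecx], 20h     ; 'a'..'z' becomes 'A'..'Z'
next:
        dec     ecx
        jnz     scan                    ; repeat until every byte has been checked
```

The range check matters: without it, the subtraction would also mangle digits, punctuation and letters that are already uppercase.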

一生所求

2020-12-20 06:03
Okay, but your string is not in edx, it's in [ecx] (or [In_Buffer]) (and it's only one useful character). To get a single character...
```
mov al, [ecx]
```
In a HLL you do "if some condition, execute this code". You might wonder how the CPU knows whether to execute the code or not. What we really do (HLLs do this for you) is "if NOT condition, skip over this code" (to a label). Experiment with it, you'll figure it out.
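As a rough illustration of that "skip over the code" pattern, the fragment below converts one lowercase character to uppercase. The label name skip is made up for the example, and ecx is assumed to still hold the buffer address.

```nasm
; HLL equivalent:  if (c >= 'a' && c <= 'z') c -= 0x20;
        mov     al, [ecx]       ; fetch the character
        cmp     al, 'a'
        jb      skip            ; NOT (c >= 'a'): jump past the body
        cmp     al, 'z'
        ja      skip            ; NOT (c <= 'z'): jump past the body
        sub     al, 0x20        ; body: reached only when both tests passed
        mov     [ecx], al       ; store the converted character back
skip:
```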

Exit cleanly, whatever path your code takes. You don't show this, but I assume you do it.

I just posted some info on sys_read here.

It's for a completely different program (adding two numbers - "hex" numbers) but the part about sys_read might interest you...
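For reference, a minimal 32-bit sketch of reading input with sys_read and exiting cleanly on every path might look like this. The file name, the buffer size and the label exit are assumptions for the example; only the name In_Buffer comes from the question.

```nasm
; flip1.asm (hypothetical name), 32-bit Linux, int 0x80 ABI
; Build: nasm -f elf32 flip1.asm && ld -m elf_i386 -o flip1 flip1.o
section .bss
In_Buffer   resb 2              ; room for one character plus the newline

section .text
global _start
_start:
        mov     eax, 3          ; sys_read
        mov     ebx, 0          ; file descriptor 0 (stdin)
        mov     ecx, In_Buffer  ; where to store the input
        mov     edx, 2          ; read at most 2 bytes
        int     0x80            ; eax = bytes read, 0 at EOF, negative on error
        cmp     eax, 0
        jle     exit            ; nothing was read: go straight to the exit

        ; ... inspect or convert the byte at [ecx] here ...

exit:
        mov     eax, 1          ; sys_exit
        xor     ebx, ebx        ; exit status 0
        int     0x80
```

Both the normal path and the "nothing read" path end at the same exit label, so the program never runs off the end of its code.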

别跟我提以往

2020-12-20 06:17

Here is a NASM program I hacked together that flips the case of a string. You basically need to loop over the string, check each character against the boundaries of the two letter ranges in ASCII, and then add or subtract 0x20 to change the case (that is the distance between uppercase and lowercase in ASCII). You can use the Linux ascii command to see a table of ASCII values.

File: flipcase.asm

```
section     .text
global      _start                 ; Entry point for linker (ld)

  ; Linker entry point
_start:
    mov     rcx,len                ; Place length of message into rcx
    mov     rbp,msg                ; Place address of our msg into rbp
    dec     rbp                    ; Point one byte before msg so [rbp+rcx] covers msg[len-1]..msg[0]

  ; Go through the buffer and flip the case of every letter:
upperScan:
    cmp byte [rbp+rcx],0x41        ; Test input char against uppercase 'A'
    jb lowerScan                   ; Not uppercase Ascii < 0x41 ('A') - jump below
    cmp byte [rbp+rcx],0x5A        ; Test input char against uppercase 'Z'
    ja lowerScan                   ; Not uppercase Ascii > 0x5A ('Z') - jump above
     ; At this point, we have an uppercase character
    add byte [rbp+rcx],0x20        ; Add 0x20 to get the lowercase Ascii value
    jmp Next                       ; Done, jump to next

lowerScan:
    cmp byte [rbp+rcx],0x61        ; Test input char against lowercase 'a'
    jb Next                        ; Not lowercase Ascii < 0x61 ('a') - jump below
    cmp byte [rbp+rcx],0x7A        ; Test input char against lowercase 'z'
    ja Next                        ; Not lowercase Ascii > 0x7A ('z') - jump above
     ; At this point, we have a lowercase char
    sub byte [rbp+rcx],0x20        ; Subtract 0x20 to get the uppercase Ascii value
     ; Fall through to next

Next:
    dec rcx                        ; Decrement counter
    jnz upperScan                  ; If characters remain, loop back

  ; Write the buffer full of processed text to stdout.
  ; 64-bit code should use the syscall instruction and its register
  ; convention rather than the 32-bit int 0x80 interface.
Write:
    mov     rax,1                  ; System call number (sys_write)
    mov     rdi,1                  ; File descriptor 1 (stdout)
    mov     rsi,msg                ; Message to write
    mov     rdx,len                ; Length of message to write
    syscall                        ; Call kernel
    mov     rax,60                 ; System call number (sys_exit)
    xor     edi,edi                ; Exit status 0
    syscall                        ; Call kernel

section     .data

msg     db  'hELLO, wwwoRLD!',0xa  ; Our dear string
len     equ $ - msg                ; Length of our dear string
```

Then you can assemble, link and run it with:
```
$> nasm -felf64 flipcase.asm && ld -melf_x86_64 -o flipcase flipcase.o && ./flipcase
```


無奈伤痛

2020-12-20 06:24

Cute trick: if they type only letters, you can XOR their input letters with 0x20 to swap their case.

Then, if they can type more than letters, you just have to check each letter to see if it is alphabetical before XORing it. You can do that with a test to see if it lies in the ranges 'a' to 'z' or 'A' to 'Z', for example.
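One possible way to do that check with a single comparison is sketched below. It is a swapped-in variant, not two separate range tests: it folds both ranges together before comparing. The use of ah as scratch and the label name not_letter are choices made for this example.

```nasm
; Fragment: flip the case of the byte in al, leaving non-letters untouched.
        mov     ah, al
        or      ah, 0x20        ; fold 'A'..'Z' onto 'a'..'z'
        sub     ah, 'a'         ; letters are now 0..25; anything else ends up above 25
        cmp     ah, 25
        ja      not_letter      ; unsigned compare: not alphabetic, leave al alone
        xor     al, 0x20        ; alphabetic: toggle bit 5 to swap the case
not_letter:
```

The unsigned jump (ja) is what makes this work: a byte below 'a' wraps around to a large unsigned value after the subtraction, so it fails the same test as a byte above 'z'.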

Alternatively, you can just map each character through a 256-element table that maps the characters the way you want them (this is usually how functions like toupper are implemented, for example).
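A table like that can be generated by the assembler itself. The sketch below uses NASM's %rep and %assign preprocessor directives to build a case-flipping table, then looks up one character. The names flip_table and c are made up for the example, and 32-bit addressing is assumed.

```nasm
section .data
flip_table:                             ; entry i holds the case-flipped version of byte i
%assign c 0
%rep 256
  %if (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
        db c ^ 0x20                     ; letter: store it with bit 5 toggled
  %else
        db c                            ; anything else maps to itself
  %endif
  %assign c c+1
%endrep

section .text
        ; Fragment: al holds the input character on entry.
        movzx   eax, al                 ; zero-extend so the character can be used as an index
        mov     al, [flip_table + eax]  ; al = flip_table[al]
```

The lookup has no branches at all, and changing the mapping (for example to uppercase only) only means changing how the table is filled.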

夕颜

2020-12-20 06:29
If you only support ASCII, then you can force lowercase using an OR 0x20
```
  or   eax, 0x20
```
Similarly, you can transform a letter to uppercase by clearing that bit:
```
  and  eax, 0xDF   ; or use ~0x20
```
And as nneonneo mentioned, the character case can be swapped using the XOR instruction:
```
  xor  eax, 0x20
```
That only works if eax is between 'a' and 'z' or 'A' and 'Z', so you'd have to compare and make sure you are in the range:
```
  cmp  eax, 'a'
  jl   .not_lower
  cmp  eax, 'z'
  jg   .not_lower
  and  eax, ~0x20    ; lowercase letter: clear bit 5 to make it uppercase
.not_lower:
```
I used nasm syntax. You may want to make sure the jl and jg are correct too...

If you need to transform any international character, then that's a lot more complicated, unless you can call a libc tolower() or toupper() function that accepts Unicode characters.

As a fair question: why would it work? (asked by kuhaku)

In ASCII (and also ISO-8859-1), the basic uppercase letters are defined between 0x41 and 0x5A and the lowercase letters between 0x61 and 0x7A.

To force the high hex digit from 4 to 6 and from 5 to 7, you force bit 5 (0x20) to be set.

To go to uppercase, you do the opposite: you clear bit 5 so it becomes zero.
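Written out in binary, the relationship looks like this. The fragment is only an illustration and uses al so that the 8-bit masks are easy to read.

```nasm
; 'A' = 0x41 = 0100 0001b      'a' = 0x61 = 0110 0001b
; 'Z' = 0x5A = 0101 1010b      'z' = 0x7A = 0111 1010b
; In each pair only bit 5 (value 0x20, third binary digit from the left) differs.
        mov     al, 'A'         ; al = 0x41
        or      al, 0x20        ; al = 0x61 = 'a'  (set bit 5)
        and     al, 0xDF        ; al = 0x41 = 'A'  (clear bit 5; 0xDF is ~0x20 in 8 bits)
        xor     al, 0x20        ; al = 0x61 = 'a'  (toggle bit 5)
```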