x86

How does CPU perform operation that manipulate data that's less than a word size

≯℡__Kan透↙ submitted on 2021-02-05 11:39:49
Question: I have read that when a CPU reads from memory, it reads a word-sized chunk at once (such as 4 or 8 bytes). How, then, can the CPU do something like mov BYTE PTR [rbp-20], al, which copies only one byte of data from al to the stack, given that the data bus is, say, 64 bits wide? It would be great if anyone could explain how this is implemented at the hardware level. Also, as we all know, when a CPU executes a program, it has a program counter or instruction pointer that points to the

What is the avx2 instruction to store 8 integers?

狂风中的少年 submitted on 2021-02-05 11:32:06
Question: I want to store the 8 integers from a __m256i variable to an array of 8 x 32-bit ints. I thought the instruction for that would be _mm256_store_epi32, but I get an error that this instruction doesn't even exist!
Answer 1: Have a look at the Intel Intrinsics Guide. Depending on whether your destination is aligned to 32 bytes, you need _mm256_store_si256 (aligned) or _mm256_storeu_si256 (unaligned).
Source: https://stackoverflow.com/questions/43304021/what-is-the-avx2-instruction-to-store-8-integers
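As a minimal sketch of the answer above, the following C function stores eight 32-bit lanes with the unaligned intrinsic. The function name store_eight_ints and the lane values 0 through 7 are assumptions made for illustration, and the target("avx") attribute assumes GCC or Clang on x86; with other compilers, AVX would need to be enabled through compiler flags instead.

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original post): builds a vector holding
   0..7 and stores its eight 32-bit lanes into out[0..7]. */
__attribute__((target("avx")))
void store_eight_ints(int32_t out[8])
{
    assert(out != NULL);

    /* setr places its first argument in the lowest lane, so lane i holds i. */
    __m256i v = _mm256_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7);

    /* Unaligned store: out may have any alignment.
       _mm256_store_si256 would require out to be 32-byte aligned. */
    _mm256_storeu_si256((__m256i *)out, v);
}
```

Calling store_eight_ints(arr) on an int32_t arr[8] should leave arr holding 0 through 7 on a CPU that supports AVX. If the destination is known to be 32-byte aligned, _mm256_store_si256 can be used in the same place.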

How to count matches using compare + je?

僤鯓⒐⒋嵵緔 submitted on 2021-02-05 09:23:09
Question: I am writing code that counts how many words are in a string. How can I increment a register using je? For example:
    cmp a[bx+1],00h
    je inc cx
Answer 1: je is a conditional jump. Unlike ARM, x86 can't directly predicate another single instruction based on an arbitrary condition. There's no single machine instruction that can do anything like je inc cx or ARM-style inceq cx. Instead you need to build the logic yourself by conditionally branching over other instruction(s). If you want to increase
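The "branch over the increment" pattern the answer describes can be sketched in C, with goto standing in for the conditional jump. The function name count_zero_bytes and the choice to count zero bytes in a buffer are assumptions for illustration, and a compiler is free to emit different instructions for this code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the zero bytes in buf[0..len).
   The loop body mirrors the assembly shape:
       cmp byte, 0
       jne skip        ; branch over the increment when not equal
       inc counter
   skip:                                                          */
size_t count_zero_bytes(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);

    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0)
            goto skip;   /* plays the role of jne skip */
        count++;         /* plays the role of inc */
skip:
        ;                /* a label must be followed by a statement */
    }
    return count;
}
```

The point is that the increment is an ordinary instruction, and the condition only decides whether execution jumps past it. The inverted test (jne rather than je) is what lets the increment sit directly after the compare.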

Why is scanf returning 0.000000 when it is supplied with a double?

久未见 submitted on 2021-02-05 09:12:59
Question: I have the following assembly code (written for NASM on Linux):
    ; This code has been generated by the 7Basic
    ; compiler <http://launchpad.net/7basic>
    extern printf
    extern scanf
    SECTION .data
    printf_f: db "%f",10,0
    scanf_f: db "%f",0
    SECTION .bss
    v_0 resb 8
    SECTION .text
    global main
    main:
    push ebp
    mov ebp,esp
    push v_0 ; load the address of the variable
    push scanf_f ; push the format string
    call scanf ; call scanf()
    add esp,8
    push dword [v_0+4] ; load the upper-half of the double
    push dword [v

What does it mean that “registers are preserved across function calls”?

自闭症网瘾萝莉.ら submitted on 2021-02-05 09:03:38
Question: According to the question "What registers are preserved through a linux x86-64 function call", the following registers are preserved across function calls: r12, r13, r14, r15, rbx, rsp, rbp. So I went ahead and ran a test with the following:
    .globl _start
    _start:
    mov $5, %r12
    mov $5, %r13
    mov $5, %r14
    mov $5, %r15
    call get_array_size
    mov $60, %eax
    syscall
    get_array_size:
    mov $0, %r12
    mov $0, %r13
    mov $0, %r14
    mov $0, %r15
    ret
And I was thinking that after the call get_array_size that my

MMX Register Speed vs Stack for Unsigned Integer Storage

时光怂恿深爱的人放手 submitted on 2021-02-05 08:56:49
Question: I am contemplating an implementation of SHA3 in pure assembly. SHA3 has an internal state of 17 64-bit unsigned integers, but because of the transformations it uses, the best case could be achieved if I had 44 such integers available in registers, plus possibly one scratch register. In that case, I would be able to do the entire transform in registers. But this is unrealistic, and optimisation is possible all the way down to even just a few registers. Still, more is potentially

x86 NASM Indirect Far Jump In Real Mode

拈花ヽ惹草 submitted on 2021-02-05 08:51:15
Question: I have been messing around with a multi-stage bootloader and I have got all of my code to work, except for the last part: the jump. I had this code working before, but I wanted to make it more modular by replacing this line:
    jmp 0x7E0:0
with this one:
    jmp far [Stage2Read + SectorReadParam.bufoff]
Instead of hard-coding where the code will load, I wanted to do an indirect jump to it. Here's the rest of my code:
    ; This is stage 1 of a multi-stage bootloader
    bits 16
    org 0x7C00