Question:
I wrote these two solutions for Project Euler Q14, in assembly and in C++. They implement the same brute-force approach for testing the Collatz conjecture. The assembly solution was assembled with:
nasm -felf64 p14.asm && gcc p14.o -o p14
The C++ was compiled with:
g++ p14.cpp -o p14
Assembly, p14.asm
section .data
    fmt db "%d", 10, 0
global main
extern printf
section .text
main:
    mov rcx, 1000000
    xor rdi, rdi ; max i
    xor rsi, rsi ; i
l1:
    dec rcx
    xor r10, r10 ; count
    mov rax, rcx
l2:
    test rax, 1
    jpe even
    mov rbx, 3
    mul rbx
    inc rax
    jmp c1
even:
    mov rbx, 2
    xor rdx, rdx
    div rbx
c1:
    inc r10
    cmp rax, 1
    jne l2
    cmp rdi, r10
    cmovl rdi, r10
    cmovl rsi, rcx
    cmp rcx, 2
    jne l1
    mov rdi, fmt
    xor rax, rax
    call printf
    ret
C++, p14.cpp
#include <iostream>
using namespace std;
int sequence(long n) {
    int count = 1;
    while (n != 1) {
        if (n % 2 == 0)
            n /= 2;
        else
            n = n*3 + 1;
        ++count;
    }
    return count;
}
int main() {
    int max = 0, maxi;
    for (int i = 999999; i > 0; --i) {
        int s = sequence(i);
        if (s > max) {
            max = s;
            maxi = i;
        }
    }
    cout << maxi << endl;
}
I know that compiler optimizations improve speed, but I don't see many ways to optimize my assembly solution further (speaking programmatically, not mathematically).
The C++ code computes a modulus for every term and a division for every even term, whereas the assembly performs only one division per even term.
But the assembly takes on average 1 second longer than the C++ solution. Why is this? I am asking mainly out of curiosity.
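The "Asm (shr)" entry in the timings presumably refers to replacing the `div` by 2 with a right shift. As a minimal sketch of that substitution at the C++ level, the two functions below compute the same chain length, one with `%` and `/` and one with bit operations. The names `sequence_div` and `sequence_shift` are hypothetical, and the sketch assumes an unsigned 64-bit `n` (the original uses a signed `long`, for which a compiler still avoids a real division but may need an extra sign fix-up).

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of the original p14.cpp.

// Chain length using modulus and division, mirroring the question's loop.
int sequence_div(std::uint64_t n) {
    int count = 1;
    while (n != 1) {
        if (n % 2 == 0)
            n /= 2;
        else
            n = n * 3 + 1;
        ++count;
    }
    return count;
}

// Same loop with the parity test and halving written as bit operations.
// For unsigned values, (n & 1) == 0 matches n % 2 == 0 and n >> 1 matches n / 2.
int sequence_shift(std::uint64_t n) {
    int count = 1;
    while (n != 1) {
        if ((n & 1) == 0)
            n >>= 1;
        else
            n = n * 3 + 1;
        ++count;
    }
    return count;
}
```

Because the two forms are equivalent for unsigned operands, an optimizing compiler is free to emit the shift form for either function, so counting `%` and `/` operators in the source does not predict the number of division instructions executed.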
Execution times
My system: 64-bit Linux on a 1.4 GHz Intel Celeron 2955U (Haswell microarchitecture).
g++ (unoptimized): avg 1272 ms
g++ -O3: avg 578 ms
original asm (div): avg 2650 ms
Asm (shr): avg 679 ms
@johnfound asm, assembled with nasm: avg 501 ms
@hidefromkgb asm: avg 200 ms
@hidefromkgb asm optimized by @Peter Cordes: avg 145 ms
@Veedrac C++: avg 81 ms with -O3, 305 ms with -O0
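The question does not say how these averages were measured. One possible way to obtain comparable in-process numbers, sketched here under that assumption, is to time several runs with `std::chrono::steady_clock` and average them. The helper names `average_ms` and `longest_start_below` are hypothetical, and `sequence` is restated with an unsigned argument so the block stands alone.

```cpp
#include <chrono>
#include <cstdint>

// Chain length for the Collatz sequence starting at n (counts n itself).
int sequence(std::uint64_t n) {
    int count = 1;
    while (n != 1) {
        if (n % 2 == 0)
            n /= 2;
        else
            n = n * 3 + 1;
        ++count;
    }
    return count;
}

// Hypothetical helper: starting value below `limit` with the longest chain.
// Scans downward with a strict comparison, like the loop in the question.
int longest_start_below(int limit) {
    int max = 0, maxi = 0;
    for (int i = limit - 1; i > 0; --i) {
        int s = sequence(static_cast<std::uint64_t>(i));
        if (s > max) {
            max = s;
            maxi = i;
        }
    }
    return maxi;
}

// Hypothetical helper: average wall-clock milliseconds over `runs` calls.
template <typename F>
double average_ms(F&& work, int runs) {
    using clock = std::chrono::steady_clock;
    double total = 0.0;
    for (int r = 0; r < runs; ++r) {
        auto start = clock::now();
        work();
        auto stop = clock::now();
        total += std::chrono::duration<double, std::milli>(stop - start).count();
    }
    return total / runs;
}
```

A call such as `average_ms([&] { result = longest_start_below(1000000); }, 5)` would then report the mean over five runs. Storing the result in a variable that is printed afterwards keeps an optimizing compiler from discarding the work. Numbers obtained this way exclude process startup, so they are not directly comparable to whole-program timings.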
Solution:
Reference 1: https://stackoom.com/question/2jKAk/用于测试Collat-z猜想的C-代码比手写汇编要快-为什么
Reference 2: https://oldbug.net/q/2jKAk/C-code-for-testing-the-Collatz-conjecture-faster-than-hand-written-assembly-why
Source: oschina
Link: https://my.oschina.net/u/3797416/blog/4326715