Why is Seq.iter 2x faster than a for loop when the target is x64?

一生所求 2021-02-20 17:00

Disclaimer: This is a micro-benchmark. Please do not comment with quotes such as "premature optimization is evil" if you feel unhappy about the topic.

Examples are release targ

2 answers

佛祖请我去吃肉 (OP)

2021-02-20 17:56
This isn't a complete answer, but I hope it helps you go further.

I can reproduce the behaviour using the same configuration. Here is a simpler example for profiling:
```
open System

let test1() =
    let ret = Array.zeroCreate 100
    let pool = {1 .. 1000000}    
    for x in pool do
        for _ in 1..50 do
            for y in 1..200 do
                ret.[2] <- x + y

let test2() =
    let ret = Array.zeroCreate 100
    let pool = {1 .. 1000000}    
    Seq.iter (fun x -> 
        for _ in 1..50 do
            for y in 1..200 do
                ret.[2] <- x + y) pool

let time f =
    let sw = new Diagnostics.Stopwatch()
    sw.Start()
    let result = f() 
    sw.Stop()
    Console.WriteLine(sw.Elapsed)
    result

[<EntryPoint>]
let main argv =
    time test1
    time test2
    0
```
In this example, Seq.iter and for x in pool are each executed only once, yet test1 still takes about twice as long as test2:
```
00:00:06.9264843
00:00:03.6834886
```
Their IL is very similar, so the difference does not come from the F# compiler's optimizations. It seems that the x64 jitter fails to optimize test1, although it manages to do so with test2. Interestingly, if I refactor the nested for loops in test1 into a separate function, JIT optimization succeeds again:
```
let body (ret: _ []) x =
    for _ in 1..50 do
        for y in 1..200 do
            ret.[2] <- x + y

let test3() =
    let ret = Array.zeroCreate 100
    let pool = {1..1000000}    
    for x in pool do
        body ret x

// 00:00:03.7012302
```
When I disable JIT optimization using the technique described here, the execution times of these functions are comparable.
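The technique linked above is not reproduced in this text. As a rough sketch of one other way to take JIT optimization out of the picture for a single method, `MethodImplOptions.NoOptimization` can be applied to the loop body. The names `bodyNoOpt`, `bodyOpt` and `run` are hypothetical and not part of the original tests. The attribute only affects the annotated method, so the sequence enumeration and any closures are still compiled normally.

```fsharp
open System
open System.Diagnostics
open System.Runtime.CompilerServices

// Same nested loops as `body`, but the JIT is asked not to optimize or inline this method.
[<MethodImpl(MethodImplOptions.NoOptimization ||| MethodImplOptions.NoInlining)>]
let bodyNoOpt (ret: int[]) (x: int) =
    for _ in 1..50 do
        for y in 1..200 do
            ret.[2] <- x + y

// Identical loops without the attribute, for comparison.
let bodyOpt (ret: int[]) (x: int) =
    for _ in 1..50 do
        for y in 1..200 do
            ret.[2] <- x + y

// Hypothetical driver: calls `f` for x = 1..n, then returns the last stored value and the elapsed time.
let run (f: int[] -> int -> unit) (n: int) =
    let ret : int[] = Array.zeroCreate 100
    let sw = Stopwatch.StartNew()
    for x in 1..n do
        f ret x
    sw.Stop()
    ret.[2], sw.Elapsed

// n is kept small here so the unoptimized variant finishes quickly.
let _, tNoOpt = run bodyNoOpt 10000
let _, tOpt = run bodyOpt 10000
printfn "NoOptimization: %O" tNoOpt
printfn "default:        %O" tOpt
```

The timings will vary by runtime and machine, so treat the printed numbers only as a relative comparison, and build in Release mode as in the original tests.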

I don't know why the x64 jitter fails in this particular example. You can disassemble the optimized jitted code and compare the ASM instructions line by line. Maybe someone with good ASM knowledge can find the differences.