Why does F# inline cause an 11x performance improvement?

忘掉有多难 2021-02-02 12:58

I am working on a heavily CPU-bound problem. I see a big performance improvement when I use the inline keyword. I create a dictionary from the standard .NET librar

2 answers
  • 2021-02-02 13:19

    Type specialization

    Without inline, the function is compiled once for all types, so it falls back to generic comparison, which is very inefficient. With inline, the body is specialized at the call site: the genericity is removed and int comparison is used directly.
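To make the specialization point concrete, here is a minimal sketch. The names sameGeneric and sameInline are hypothetical and are not taken from the question's code. Both functions return the same results; the difference is only in the code the compiler is expected to generate for the equality test.

```fsharp
// Hypothetical illustration; these names do not appear in the question.

// Not inlined: compiled once for every 'T, so `=` is routed through
// F#'s generic equality machinery.
let sameGeneric (x: 'T) (y: 'T) : bool = x = y

// Inlined: the body is copied into each call site, where 'T is known.
// At an int call site the compiler can emit a direct integer comparison.
let inline sameInline (x: 'T) (y: 'T) : bool = x = y

// Same answers either way; only the generated code differs.
printfn "%b %b" (sameGeneric 1 2) (sameInline 1 2)   // false false
printfn "%b %b" (sameGeneric 3 3) (sameInline 3 3)   // true true
```

Decompiling a call to each function, as the other answer does with ILSpy, is one way to check which comparison actually ends up in the compiled code.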

  • 2021-02-02 13:28

    I can reproduce the behavior on my machine, with a 3x performance boost after adding the inline keyword.

    Decompiling the two versions side by side in ILSpy gives almost identical C# code. The notable difference is in the two equality tests:

    // Version without inline
    bool IEqualityComparer<Program.Pair<a>>.System-Collections-Generic-IEqualityComparer(Program.Pair<a> x, Program.Pair<a> y)
    {
        a v@ = x.v@;
        a v@2 = y.v@;
        if (LanguagePrimitives.HashCompare.GenericEqualityIntrinsic<a>(v@, v@2))
        {
            a w@ = x.w@;
            a w@2 = y.w@;
            return LanguagePrimitives.HashCompare.GenericEqualityIntrinsic<a>(w@, w@2);
        }
        return false;
    }
    
    // Version with inline
    bool IEqualityComparer<Program.Pair<int>>.System-Collections-Generic-IEqualityComparer(Program.Pair<int> x, Program.Pair<int> y)
    {
        int v@ = x.v@;
        int v@2 = y.v@;
        if (v@ == v@2)
        {
            int w@ = x.w@;
            int w@2 = y.w@;
            return w@ == w@2;
        }
        return false;
    }
    

    The generic equality is much less efficient than the specialized version.

    Quoting the question: "I also noticed the huge difference in the amount of Gen 0 GC with the inlined code and non inlined code. Could someone explain why there is such a huge difference?"

    Taking a look at the GenericEqualityIntrinsic function in the F# source code:

    let rec GenericEqualityIntrinsic (x : 'T) (y : 'T) : bool = 
        fsEqualityComparer.Equals((box x), (box y))
    

    It boxes both arguments, which explains the significant amount of garbage in your first example. When the GC runs too often, it slows the computation down dramatically. The second example (using inline) produces almost no garbage when Pair is a struct.
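The effect of boxing on the collector can be observed with a rough sketch like the one below. The helpers boxedEquals and gen0During are hypothetical; boxedEquals only imitates the shape of GenericEqualityIntrinsic (box both sides, then call Equals) and is not the F# library implementation. The Gen 0 counts depend on the runtime, JIT, and GC settings, so no particular numbers are assumed.

```fsharp
open System

// Hypothetical helper: boxes both arguments before comparing,
// loosely imitating what GenericEqualityIntrinsic does.
let boxedEquals (x: 'T) (y: 'T) : bool = (box x).Equals(box y)

// Hypothetical helper: runs f and reports how many Gen 0 collections
// happened while it ran.
let gen0During (f: unit -> int) : int * int =
    let before = GC.CollectionCount(0)
    let result = f ()
    result, GC.CollectionCount(0) - before

let n = 10000000

// Counts multiples of 3 using the boxing comparison.
let countBoxed () =
    let mutable hits = 0
    for i in 0 .. n - 1 do
        if boxedEquals (i % 3) 0 then hits <- hits + 1
    hits

// Counts multiples of 3 using a direct int comparison.
let countDirect () =
    let mutable hits = 0
    for i in 0 .. n - 1 do
        if i % 3 = 0 then hits <- hits + 1
    hits

let boxedHits, boxedGen0 = gen0During countBoxed
let directHits, directGen0 = gen0During countDirect

printfn "boxed:  hits=%d gen0=%d" boxedHits boxedGen0
printfn "direct: hits=%d gen0=%d" directHits directGen0
```

Both loops compute the same count; if the boxed version reports more Gen 0 collections on your machine, that is the allocation cost described above.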

    That said, this is the expected behavior of the inline keyword: a specialized version of the function is generated at the call site. My suggestion is to always measure your code against the same benchmarks when you optimize it.

    You may be interested in a very similar thread, "Why is this F# code so slow?".
