What is the overhead for a method call in a loop?

Question

I’ve been working for a while on a C# maze generator that can generate mazes of around 128000x128000 pixels. Memory usage is already optimized, so I’m now looking at speeding up the generation.

A problem (well, more of a point of interest) I found is the following (just some example code to illustrate it):

This code runs in about 1.4 seconds on my machine when pixelChanged is null:

public void Go()
{
    for (int i = 0; i < bitjes.Length; i++)
    {
        BitArray curArray = bitjes[i];
        for (int y = 0; y < curArray.Length; y++)
        {
            curArray[y] = !curArray[y];
            GoDrawPixel(i, y, false);
        }
    }
}

public void GoDrawPixel(int i, int y, Boolean enabled)
{
    if (pixelChanged != null)
    {
        pixelChanged.Invoke(new PixelChangedEventArgs(i, y, enabled));
    }
}

The following code, however, actually runs about 0.4 seconds faster:

public void Go()
{
    for (int i = 0; i < bitjes.Length; i++)
    {
        BitArray curArray = bitjes[i];
        for (int y = 0; y < curArray.Length; y++)
        {
            curArray[y] = !curArray[y];
            if (pixelChanged != null)
            {
                pixelChanged.Invoke(new PixelChangedEventArgs(i, y, false));
            }
        }
    }
}

It seems that just calling an “empty” method takes about 20% of the CPU time this algorithm uses. Isn’t that strange? I’ve compiled the solution in both debug and release mode but haven’t found any noticeable difference.

This means that every method call in this loop slows my code down by about 0.4 seconds. Since the maze generator currently consists of many separate method calls that execute different actions, this adds up to a substantial amount.

I've also checked Google and other posts on Stack Overflow but haven't really found a solution yet.

Is it possible to optimize code like this automatically (maybe with something like Project Roslyn), or should I put everything together in one big method?

Edit: I'm also interested in maybe some analysis on the JIT/CLR code differences in these 2 cases. (So where this problem actually comes from)

Edit2: All code was compiled in release mode

Answer 1:

This is a real issue. The JIT has an inlining optimization for methods (the whole method body is injected directly into the calling code), but it only applies to methods that compile to 32 bytes of IL or less. I have no idea why the 32-byte limit exists, and I would also like to see an 'inline' keyword in C#, like in C/C++, for exactly these cases.
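C# has no 'inline' keyword, but since .NET Framework 4.5 a method can be marked with MethodImplOptions.AggressiveInlining to ask the JIT to inline it regardless of its size heuristics. The sketch below applies that hint to the question's GoDrawPixel. The class name InlineHintExample, its constructor, and the minimal PixelChangedEventArgs and PixelChangedEventHandler definitions are assumptions made to keep the example self-contained. The attribute is only a hint, so whether it recovers the 0.4 seconds would have to be measured.

```csharp
using System;
using System.Collections;
using System.Runtime.CompilerServices;

// Hypothetical stand-ins for the types used in the question.
public sealed class PixelChangedEventArgs : EventArgs
{
    public PixelChangedEventArgs(int x, int y, bool enabled)
    {
        X = x;
        Y = y;
        Enabled = enabled;
    }

    public int X { get; private set; }
    public int Y { get; private set; }
    public bool Enabled { get; private set; }
}

public delegate void PixelChangedEventHandler(PixelChangedEventArgs e);

public sealed class InlineHintExample
{
    private readonly BitArray[] bitjes;
    public PixelChangedEventHandler pixelChanged;

    // Hypothetical constructor: allocates rows x columns bits, all false.
    public InlineHintExample(int rows, int columns)
    {
        bitjes = new BitArray[rows];
        for (int i = 0; i < rows; i++)
        {
            bitjes[i] = new BitArray(columns);
        }
    }

    public void Go()
    {
        for (int i = 0; i < bitjes.Length; i++)
        {
            BitArray curArray = bitjes[i];
            for (int y = 0; y < curArray.Length; y++)
            {
                curArray[y] = !curArray[y];
                GoDrawPixel(i, y, false);
            }
        }
    }

    // Asks the JIT to inline this method into Go(); a hint, not a guarantee.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private void GoDrawPixel(int i, int y, bool enabled)
    {
        PixelChangedEventHandler handler = pixelChanged;
        if (handler != null)
        {
            handler(new PixelChangedEventArgs(i, y, enabled));
        }
    }
}
```

The behavior is the same as in the question's first version; only the inlining hint is new.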

Answer 2:

The first thing I would try would be making it static rather than instance:

public static void GoDrawPixel(PixelChangedEventHandler pixelChanged,
    int x, int y, bool enabled)
{
    if (pixelChanged != null)
    {
        pixelChanged.Invoke(new PixelChangedEventArgs(x, y, enabled));
    }
}

This changes a few things:

the stack semantics remain comparable (it loads a reference, 2 ints and a bool either way)
the callvirt becomes a call, which avoids some minor overhead
the ldarg.0/ldfld pair (this.pixelChanged) becomes a single ldarg.0

The next thing I would look at is PixelChangedEventArgs; passing it as a struct might be cheaper if that avoids a lot of allocations. Or perhaps just:

pixelChanged(x, y, enabled);

(raw arguments rather than a wrapper object - requires a change in signature)
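A minimal sketch of the raw-argument variant, combined with the static helper from above, might look like the following. The delegate name PixelChangedHandler, the class name RawArgumentsExample, and its constructor are assumptions for illustration and are not taken from the question's code.

```csharp
using System;
using System.Collections;

// Hypothetical delegate that takes the raw values instead of an event-args object.
public delegate void PixelChangedHandler(int x, int y, bool enabled);

public sealed class RawArgumentsExample
{
    private readonly BitArray[] bitjes;
    public PixelChangedHandler pixelChanged;

    // Hypothetical constructor: allocates rows x columns bits, all false.
    public RawArgumentsExample(int rows, int columns)
    {
        bitjes = new BitArray[rows];
        for (int i = 0; i < rows; i++)
        {
            bitjes[i] = new BitArray(columns);
        }
    }

    public void Go()
    {
        for (int i = 0; i < bitjes.Length; i++)
        {
            BitArray curArray = bitjes[i];
            for (int y = 0; y < curArray.Length; y++)
            {
                curArray[y] = !curArray[y];
                GoDrawPixel(pixelChanged, i, y, false);
            }
        }
    }

    // Static helper: the delegate is passed in as an argument,
    // and no PixelChangedEventArgs object is allocated per pixel.
    private static void GoDrawPixel(PixelChangedHandler handler, int x, int y, bool enabled)
    {
        if (handler != null)
        {
            handler(x, y, enabled);
        }
    }
}
```

Subscribers then receive the coordinates directly, for example `example.pixelChanged = (x, y, enabled) => { /* draw */ };`.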

Answer 3:

Is this in debug or release mode? Method calls are fairly expensive, but they may be inlined when you build/run it in release mode. It will not get any optimization from the compiler in debug mode.

Answer 4:

The main overhead is, as Marc said, making the virtual call and passing the arguments. Can the value of pixelChanged change during the execution of the method? If not, this might work. I'm not completely sure the JIT optimizes the empty action delegate into a no-op; you'll have to test that yourself. If it doesn't, I would disregard good practice here and write two loops, one that calls pixelChanged.Invoke and one that doesn't (both inlined), and run whichever applies. After all, sometimes you have to make code a little less elegant for it to be fast.

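public void Go()
{
  // Use the no-op delegate only when nobody is listening.
  if (pixelChanged == null)
     GoPixelGo((x, y, z) => { });
  else
     // Assumes pixelChanged takes raw (int, int, bool) arguments, as an earlier answer suggests.
     GoPixelGo((i, y, enabled) => pixelChanged.Invoke(i, y, enabled));
}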

public void GoPixelGo(Action<int, int, bool> action)
{
  for (int i = 0; i < bitjes.Length; i++)
  {
      BitArray curArray = bitjes[i];
      for (int y = 0; y < curArray.Length; y++)
      {
         curArray[y] = !curArray[y];
         action(i,y, false);
      }
  }
}
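The two-loop alternative mentioned in this answer could be sketched roughly as follows: the null check is done once, outside the loops, and the hot loop without a subscriber contains no call at all. This assumes pixelChanged does not change while Go() is running. The names HoistedCheckExample, FlipAll, FlipAllAndNotify, IsSet, and the raw-argument delegate PixelChangedHandler are hypothetical.

```csharp
using System;
using System.Collections;

// Hypothetical raw-argument delegate.
public delegate void PixelChangedHandler(int x, int y, bool enabled);

public sealed class HoistedCheckExample
{
    private readonly BitArray[] bitjes;
    public PixelChangedHandler pixelChanged;

    // Hypothetical constructor: allocates rows x columns bits, all false.
    public HoistedCheckExample(int rows, int columns)
    {
        bitjes = new BitArray[rows];
        for (int i = 0; i < rows; i++)
        {
            bitjes[i] = new BitArray(columns);
        }
    }

    // Hypothetical accessor, only here so the result can be inspected.
    public bool IsSet(int row, int column)
    {
        return bitjes[row][column];
    }

    public void Go()
    {
        // Read the delegate once; assumes it does not change during the run.
        PixelChangedHandler handler = pixelChanged;
        if (handler == null)
        {
            FlipAll();
        }
        else
        {
            FlipAllAndNotify(handler);
        }
    }

    // Loop without any notification call.
    private void FlipAll()
    {
        for (int i = 0; i < bitjes.Length; i++)
        {
            BitArray curArray = bitjes[i];
            for (int y = 0; y < curArray.Length; y++)
            {
                curArray[y] = !curArray[y];
            }
        }
    }

    // Same loop, with the delegate invoked directly and no per-pixel null check.
    private void FlipAllAndNotify(PixelChangedHandler handler)
    {
        for (int i = 0; i < bitjes.Length; i++)
        {
            BitArray curArray = bitjes[i];
            for (int y = 0; y < curArray.Length; y++)
            {
                curArray[y] = !curArray[y];
                handler(i, y, false);
            }
        }
    }
}
```

The cost is a duplicated loop body, which is the loss of elegance the answer refers to.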

Source: https://stackoverflow.com/questions/13135759/what-is-the-overhead-for-a-method-call-in-a-loop

Tags

performance

compiler-construction