Speed up Matrix Addition in C#

北荒 2021-02-05 22:48

I'd like to optimize this piece of code:

public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{            
        for (int x = 0;         


        
15 answers
  •  慢半拍i (original poster)
    2021-02-05 23:19

    System.Drawing.Color is a structure, which on current versions of .NET kills most optimizations. Since you're only interested in the blue component anyway, use a method that only gets the data you need.

    public byte GetPixelBlue(int x, int y)
    {
        int offsetFromOrigin = (y * this.stride) + (x * 3);
        unsafe
        {
            return this.imagePtr[offsetFromOrigin];
        }
    }
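
    As a rough illustration of the offset arithmetic, here is a safe, managed sketch. It assumes a 24-bit-per-pixel buffer with blue stored first in each pixel (the layout the `x * 3` above implies) and a stride, in bytes, that may include padding at the end of each scan line. The `PixelLayout` class and its `BlueOffset` and `GetBlue` methods are made up for this example; they are not part of `GenericImage`.

```csharp
using System;

// Hypothetical helper, not part of GenericImage: the same arithmetic as
// GetPixelBlue, but over a managed byte[] so it runs without unsafe code.
public static class PixelLayout
{
    // Assumes 3 bytes per pixel with blue at byte 0 of each pixel,
    // and stride = number of bytes per scan line (padding included).
    public static int BlueOffset(int x, int y, int stride)
    {
        return (y * stride) + (x * 3);
    }

    // Reads the blue byte of pixel (x, y) from a buffer laid out as above.
    public static byte GetBlue(byte[] buffer, int stride, int x, int y)
    {
        return buffer[BlueOffset(x, y, stride)];
    }
}
```

    For a 2-pixel-wide image whose stride is 8 bytes (6 bytes of pixel data plus 2 of padding), the blue byte of pixel (1, 1) sits at offset 1 * 8 + 1 * 3 = 11.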
    

    Now, exchange the order of iteration of x and y:

    public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
    {            
        for (int y = 0; y < Height; y++)
        {
            for (int x = 0; x < Width; x++)
            {
                Byte  pixelValue = image.GetPixelBlue(x, y);
                this.sumOfPixelValues[y, x] += pixelValue;
                this.sumOfPixelValuesSquared[y, x] += (ulong)pixelValue * pixelValue;
            }
        }
    }
    

    Now you're accessing all values within a scan line sequentially, which makes much better use of the CPU cache for all three matrices involved (image.imagePtr, sumOfPixelValues, and sumOfPixelValuesSquared). [Thanks to Jon for noticing that when I fixed access to image.imagePtr, I broke the other two. The output array indexing is now swapped to [y, x] so that it stays sequential as well.]
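
    To see why the loop order matters: a C# rectangular array such as `uint[,]` is stored row by row, so the last index is the one that moves through adjacent memory. The sketch below walks the same array both ways (the `TraversalOrder` class and its method names are invented for illustration). Both methods return the same total; only the memory access pattern differs. Timing is deliberately left out, since the size of the gap depends on the array size and the hardware.

```csharp
using System;

// Hypothetical demo class, not from the original code.
public static class TraversalOrder
{
    // Inner loop varies the last index: consecutive iterations
    // touch adjacent elements in memory.
    public static ulong SumRowByRow(uint[,] m)
    {
        int height = m.GetLength(0);
        int width = m.GetLength(1);
        ulong total = 0;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                total += m[y, x];
            }
        }
        return total;
    }

    // Inner loop varies the first index: each iteration jumps
    // a whole row (width elements) ahead in memory.
    public static ulong SumColumnByColumn(uint[,] m)
    {
        int height = m.GetLength(0);
        int width = m.GetLength(1);
        ulong total = 0;
        for (int x = 0; x < width; x++)
        {
            for (int y = 0; y < height; y++)
            {
                total += m[y, x];
            }
        }
        return total;
    }
}
```

    Wrapping each call in a `System.Diagnostics.Stopwatch` on an image-sized array is one way to check whether the difference is significant on your machine.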

    Next, get rid of the member references by copying them into locals. Another thread could theoretically set sumOfPixelValues to a different array midway through, which does horrible, horrible things to optimizations.

    public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
    {          
        uint [,] sums = this.sumOfPixelValues;
        ulong [,] squares = this.sumOfPixelValuesSquared;
        for (int y = 0; y < Height; y++)
        {
            for (int x = 0; x < Width; x++)
            {
                Byte  pixelValue = image.GetPixelBlue(x, y);
                sums[y, x] += pixelValue;
                // The product is an int; C# has no ulong + int operator, so widen first.
                squares[y, x] += (ulong)pixelValue * pixelValue;
            }
        }
    }
    

    Now the compiler can generate optimal code for moving through the two output arrays, and after inlining and optimization, the inner loop can step through the image.imagePtr array with a stride of 3 instead of recalculating the offset all the time. Now an unsafe version for good measure, doing the optimizations that I think .NET ought to be smart enough to do but probably isn't:

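    unsafe public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
    {          
        byte* scanline = image.imagePtr;
        // Assumes both arrays are exactly [Height, Width], so they can be walked linearly.
        fixed (uint* sumsBase = &this.sumOfPixelValues[0,0])
        fixed (ulong* squaresBase = &this.sumOfPixelValuesSquared[0,0])
        {
            // Pointers declared by fixed are read-only, so advance local copies instead.
            uint* sums = sumsBase;
            ulong* squares = squaresBase;
            for (int y = 0; y < Height; y++)
            {
                byte* blue = scanline;
                for (int x = 0; x < Width; x++)
                {
                    byte pixelValue = *blue;
                    *sums += pixelValue;
                    *squares += (ulong)pixelValue * pixelValue;
                    blue += 3;      // next pixel: 3 bytes per pixel
                    sums++;
                    squares++;
                }
                scanline += image.stride;   // next scan line, padding included
            }
        }
    }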
    
