Speed up Matrix Addition in C#

北荒 2021-02-05 22:48

I'd like to optimize this piece of code:

public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{            
        for (int x = 0;         


        
15 Answers
  • 2021-02-05 22:57

    I'm not sure if it's faster, but you could write something like this:

    public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
    {            
            Byte pixelValue;
            for (int x = 0; x < Width; x++)
            {
                for (int y = 0; y < Height; y++)
                {
                    pixelValue = image.GetPixel(x, y).B;
                    this.sumOfPixelValues[x, y] += pixelValue;
                    this.sumOfPixelValuesSquared[x, y] += pixelValue * pixelValue;
                }
            }
    }
    
  • 2021-02-05 23:02

    Although it's a micro-optimization and thus may not add much, you might want to study how likely you are to get a zero when you do

    Byte  pixelValue = image.GetPixel(x, y).B;
    

    Clearly, if pixelValue = 0 then there's no reason to do the summations so your routine might become

    public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
    {
        for (int x = 0; x < Width; x++)
        {
            for (int y = 0; y < Height; y++)
            {
                Byte pixelValue = image.GetPixel(x, y).B;

                if (pixelValue != 0)
                {
                    this.sumOfPixelValues[x, y] += pixelValue;
                    this.sumOfPixelValuesSquared[x, y] += pixelValue * pixelValue;
                }
            }
        }
    }
    

    However, the question is how often you're going to see pixelValue=0, and whether the saving on the compute-and-store will offset the cost of the test.
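
    One way to answer that is to measure it on a representative image before adding the branch. A minimal sketch, assuming the B-channel values have already been copied into a byte[] (ZeroStats and ZeroFraction are hypothetical names, not part of the original code):

```csharp
using System;

public static class ZeroStats
{
    // Returns the fraction (0.0 to 1.0) of entries in 'values' that are zero.
    // An empty or null input is reported as 0.0.
    public static double ZeroFraction(byte[] values)
    {
        if (values == null || values.Length == 0)
            return 0.0;

        int zeros = 0;
        foreach (byte v in values)
        {
            if (v == 0)
                zeros++;
        }
        return (double)zeros / values.Length;
    }
}
```

    If only a small fraction of the pixels are zero, the branch is unlikely to pay for itself; timing both versions on real data is the only reliable check.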

  • 2021-02-05 23:07

    Despite using unsafe code, GetPixel may well be the bottleneck here. Have you looked at ways of getting all the pixels in the image in one call rather than once per pixel? For instance, Bitmap.LockBits may be your friend...
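
    As a rough sketch of the LockBits route: this assumes the frames are available as System.Drawing.Bitmap objects (the question uses its own GenericImage type, so that is an assumption) and that a 24bpp copy is acceptable. BitmapBytes and CopyPixels are hypothetical names.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public static class BitmapBytes
{
    // Copies the whole bitmap into a managed byte[] with a single LockBits call.
    // Requests the 24bpp format: 3 bytes per pixel, stored in B, G, R order.
    // 'stride' receives the number of bytes between the starts of two rows.
    public static byte[] CopyPixels(Bitmap bitmap, out int stride)
    {
        var rect = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
        BitmapData data = bitmap.LockBits(rect, ImageLockMode.ReadOnly,
                                          PixelFormat.Format24bppRgb);
        try
        {
            // Assumption: top-down bitmap. A negative stride (bottom-up layout)
            // would need different copy logic, so it is rejected here.
            if (data.Stride < 0)
                throw new NotSupportedException("Bottom-up bitmaps are not handled.");

            stride = data.Stride;
            var buffer = new byte[stride * data.Height];
            Marshal.Copy(data.Scan0, buffer, 0, buffer.Length);
            return buffer;
        }
        finally
        {
            bitmap.UnlockBits(data);
        }
    }
}
```

    With this layout the B value of pixel (x, y) is buffer[y * stride + x * 3], so the inner loop never has to call GetPixel.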

    On my netbook, a very simple loop iterating 640 * 480 * 200 times only takes about 100 milliseconds - so if you're finding it's all going slowly, you should take another look at the bit inside the loop.

    Another optimisation you might want to look at: avoid multi-dimensional arrays. They're significantly slower than single-dimensional arrays.

    In particular, you can have a single-dimensional array of size Width * Height and just keep an index:

    int index = 0;
    for (int x = 0; x < Width; x++)
    {
        for (int y = 0; y < Height; y++)
        {
            Byte pixelValue = image.GetPixel(x, y).B;
            this.sumOfPixelValues[index] += pixelValue;
            this.sumOfPixelValuesSquared[index] += pixelValue * pixelValue;
            index++;
        }
    }
    

    Using the same simple test harness, adding a write to a 2-D rectangular array took the total time of looping over 200 * 640 * 480 up to around 850ms; using a single-dimensional array took it back down to around 340ms - so it's somewhat significant, and currently you've got two of those per loop iteration.
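
    For completeness, here is a self-contained sketch of the flat-array version together with the index mapping needed to read a given (x, y) back out. FlatPixelSums and the getBlue delegate are hypothetical stand-ins for the original class and for image.GetPixel(x, y).B; int accumulators are also an assumption, since the question does not show the element type.

```csharp
using System;

public sealed class FlatPixelSums
{
    public readonly int Width;
    public readonly int Height;
    public readonly int[] Sum;         // one running sum per pixel
    public readonly int[] SumSquared;  // one running sum of squares per pixel

    public FlatPixelSums(int width, int height)
    {
        Width = width;
        Height = height;
        Sum = new int[width * height];
        SumSquared = new int[width * height];
    }

    // Adds one image to the running sums, visiting pixels in the same
    // order as the loop above (x outer, y inner).
    public void Add(Func<int, int, byte> getBlue)
    {
        int index = 0;
        for (int x = 0; x < Width; x++)
        {
            for (int y = 0; y < Height; y++)
            {
                byte pixelValue = getBlue(x, y);
                Sum[index] += pixelValue;
                SumSquared[index] += pixelValue * pixelValue;
                index++;
            }
        }
    }

    // Because x is the outer loop, pixel (x, y) lives at x * Height + y.
    public int IndexOf(int x, int y)
    {
        return x * Height + y;
    }
}
```

    Reading a value back is then Sum[sums.IndexOf(x, y)]. If the loops are swapped so that y is the outer loop, the mapping becomes y * Width + x.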

  • 2021-02-05 23:08

    Matrix addition is of course an O(n^2) operation, but you can speed it up by using unsafe code, or at least by using jagged arrays instead of multidimensional ones.
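
    A minimal sketch of the jagged-array variant, assuming int elements (JaggedMatrix, Create and AddInPlace are hypothetical names). Each row is an ordinary single-dimensional array, so the inner loop indexes a plain int[]:

```csharp
using System;

public static class JaggedMatrix
{
    // Allocates a rows x cols matrix as an array of row arrays.
    public static int[][] Create(int rows, int cols)
    {
        var matrix = new int[rows][];
        for (int r = 0; r < rows; r++)
            matrix[r] = new int[cols];
        return matrix;
    }

    // Adds 'source' into 'target' element by element.
    // Both matrices are assumed to have the same shape.
    public static void AddInPlace(int[][] target, int[][] source)
    {
        for (int r = 0; r < target.Length; r++)
        {
            int[] targetRow = target[r];   // hoist the row lookup out of the inner loop
            int[] sourceRow = source[r];
            for (int c = 0; c < targetRow.Length; c++)
                targetRow[c] += sourceRow[c];
        }
    }
}
```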

  • Matrix addition has O(n^2) complexity, counted in number of additions.

    However, since there are no intermediate results, you can parallelize the additions using threads:

    1. it is easy to prove that the resulting algorithm is lock-free
    2. you can tune the optimal number of threads to use
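
    One way to sketch this uses Parallel.For from the Task Parallel Library, which schedules pooled tasks rather than hand-managed threads (a substitution on my part, not something the answer specifies). The matrices are assumed to be flat, row-major int arrays; ParallelMatrix, AddInPlace and the maxThreads parameter are hypothetical names. Each task writes only to its own row, so no locks are needed.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelMatrix
{
    // Adds 'source' into 'target', one row per parallel work item.
    // Rows occupy disjoint index ranges, so no two tasks touch the same element.
    public static void AddInPlace(int[] target, int[] source,
                                  int rows, int cols, int maxThreads)
    {
        if (target.Length != rows * cols || source.Length != rows * cols)
            throw new ArgumentException("Array length must equal rows * cols.");

        // maxThreads must be positive, or -1 for "no limit".
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.For(0, rows, options, r =>
        {
            int start = r * cols;
            int end = start + cols;
            for (int i = start; i < end; i++)
                target[i] += source[i];
        });
    }
}
```

    Whether this helps depends on the matrix size: for small images the scheduling overhead can outweigh the gain, so the thread count is worth measuring rather than guessing.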
  • 2021-02-05 23:11

    This is a classic case of micro-optimisation failing horribly. You're not going to get anything from looking at that loop. To get real speed benefits you need to start off by looking at the big picture:-

    • Can you asynchronously preload image[n+1] whilst processing image[n]?
    • Can you load just the B channel from the image? This will decrease memory bandwidth.
    • Can you load the B value and update the sumOfPixelValues(Squared) arrays directly, i.e. read the file and update instead of read file, store, read, update? Again, this decreases memory bandwidth.
    • Can you use one dimensional arrays instead of two dimensional? Maybe create your own array class that works either way.
    • Perhaps you could look into using Mono and the SIMD extensions?
    • Can you process the image in chunks and assign them to idle CPUs in a multi-cpu environment?
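
    To illustrate the SIMD point, here is a sketch that uses System.Numerics.Vector<T> instead of the Mono SIMD extensions mentioned above (a different API, chosen here only because it ships with current .NET). It assumes the sums are kept in flat int arrays; SimdMatrix and AddInPlace are hypothetical names.

```csharp
using System;
using System.Numerics;

public static class SimdMatrix
{
    // Adds 'source' into 'target', Vector<int>.Count elements at a time.
    public static void AddInPlace(int[] target, int[] source)
    {
        if (target.Length != source.Length)
            throw new ArgumentException("Arrays must have the same length.");

        int lanes = Vector<int>.Count;   // number of ints per hardware vector
        int i = 0;

        // Vectorised part: process full blocks of 'lanes' elements.
        for (; i <= target.Length - lanes; i += lanes)
        {
            var sum = new Vector<int>(target, i) + new Vector<int>(source, i);
            sum.CopyTo(target, i);
        }

        // Scalar tail: the leftover elements that do not fill a whole vector.
        for (; i < target.Length; i++)
            target[i] += source[i];
    }
}
```

    This only applies once the pixel values are already in an int array; converting bytes to ints is a separate step that this sketch does not cover.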

    EDIT:

    Try having specialised image accessors so you're not wasting memory bandwidth:

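    // Returns a single byte rather than a Color, so no Color is built per pixel.
    public byte GetBPixel (int x, int y)
    {
        int offsetFromOrigin = (y * this.stride) + (x * 3);
        unsafe
        {
            // The + 1 assumes the wanted byte is the second of each 3-byte pixel;
            // adjust it to match the image's channel order.
            return this.imagePtr [offsetFromOrigin + 1];
        }
    }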
    

    or, better still:

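    // Takes a precomputed byte offset and returns a single byte rather than a Color.
    public byte GetBPixel (int offset)
    {
        unsafe
        {
            // The + 1 assumes the wanted byte is the second of each 3-byte pixel.
            return this.imagePtr [offset + 1];
        }
    }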
    

    and use the above in a loop like:

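    // start_offset advances one row (stride bytes) per y; offset advances one pixel (3 bytes) per x
    for (int start_offset = 0, y = 0 ; y < Height ; start_offset += stride, ++y)
    {
       for (int x = 0, offset = start_offset ; x < Width ; offset += 3, ++x)
       {
          var pixel = GetBPixel (offset);
          // do stuff
       }
    }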
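
    Put together, the offset-walking loop can be sketched over a managed byte[] so that it runs without unsafe code. Everything here is an assumption for illustration: StrideWalk and Accumulate are hypothetical names, the buffer is taken to hold 3 bytes per pixel with rows stride bytes apart, and channelOffset is whichever byte within the pixel holds the wanted channel.

```csharp
using System;

public static class StrideWalk
{
    // Walks the raw buffer row by row and accumulates one channel into
    // flat, row-major sum arrays (index = y * width + x).
    public static void Accumulate(byte[] buffer, int width, int height,
                                  int stride, int channelOffset,
                                  int[] sum, int[] sumSquared)
    {
        int index = 0;
        for (int startOffset = 0, y = 0; y < height; startOffset += stride, ++y)
        {
            // Rows may be padded, so each row restarts from startOffset.
            for (int x = 0, offset = startOffset; x < width; offset += 3, ++x)
            {
                int value = buffer[offset + channelOffset];
                sum[index] += value;
                sumSquared[index] += value * value;
                index++;
            }
        }
    }
}
```

    The inner loop does one addition to advance the offset instead of recomputing y * stride + x * 3 for every pixel.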
    