Speed up Matrix Addition in C#

I'd like to optimize this piece of code:

public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{            
        for (int x = 0;


                      
15 Answers

我寻月下人不归
2021-02-05 23:16

Where are the images stored? If each one is on disk, part of your processing time may be spent fetching them from disk. Measure this first; if it turns out to be a bottleneck, rewrite the code to pre-fetch the image data so that the array-processing code never has to wait for it.

Also check whether the overall application logic allows parallelism: is each matrix addition independent, or does it depend on the output of a previous one? If they are independent, I'd look at executing them on separate threads, or in parallel.
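
To make the pre-fetch idea concrete, here is a minimal sketch. The names `PrefetchSketch`, `load` and `process` are hypothetical stand-ins (for reading one image and for accumulating it into the matrices); they are not part of the original code. While item i is being processed, item i + 1 is already being loaded on a background task.

```csharp
using System;
using System.Threading.Tasks;

public static class PrefetchSketch
{
    // Processes items 0..count-1 in order. 'load' and 'process' are
    // hypothetical callbacks: 'load' reads one image's bytes (e.g. from
    // disk) and 'process' accumulates them into the sum matrices.
    public static void Run(int count, Func<int, byte[]> load, Action<byte[]> process)
    {
        if (count <= 0) return;

        // Start loading the first item before the loop begins.
        Task<byte[]> next = Task.Run(() => load(0));

        for (int i = 0; i < count; i++)
        {
            byte[] current = next.Result;      // wait for the prefetched data

            if (i + 1 < count)
            {
                int nextIndex = i + 1;         // copy for the closure
                next = Task.Run(() => load(nextIndex));
            }

            process(current);                  // overlaps with the next load
        }
    }
}
```

Processing still happens on the calling thread and in the original order, so the accumulation code itself does not need any locking; only the loading overlaps with it.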
                                                                        
                                                        
            
            
              
                
慢半拍i
2021-02-05 23:19

System.Drawing.Color is a structure, which on current versions of .NET kills most optimizations.  Since you're only interested in the blue component anyway, use a method that only gets the data you need.

public byte GetPixelBlue(int x, int y)
{
    int offsetFromOrigin = (y * this.stride) + (x * 3);
    unsafe
    {
        return this.imagePtr[offsetFromOrigin];
    }
}


Now, exchange the order of iteration of x and y:

public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{            
    for (int y = 0; y < Height; y++)
    {
        for (int x = 0; x < Width; x++)
        {
            Byte  pixelValue = image.GetPixelBlue(x, y);
            this.sumOfPixelValues[y, x] += pixelValue;
            this.sumOfPixelValuesSquared[y, x] += pixelValue * pixelValue;
        }
    }
}


Now you're accessing all values within a scan line sequentially, which will make much better use of CPU cache for all three matrices involved (image.imagePtr, sumOfPixelValues, and sumOfPixelValuesSquared). [Thanks to Jon for noticing that when I fixed access to image.imagePtr, I broke the other two. The output array indexing is now swapped to keep it optimal.]
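
If you want to convince yourself why [y, x] with x in the inner loop is the sequential order, this small sketch copies the raw memory of a rectangular array into a flat one. The class and method names are made up for illustration; Buffer.BlockCopy is the standard .NET call.

```csharp
using System;

public static class RowMajorDemo
{
    // Returns the elements of a rectangular int array in the order
    // they sit in memory, by copying the raw bytes into a flat array.
    public static int[] MemoryOrder(int[,] m)
    {
        var flat = new int[m.Length];
        Buffer.BlockCopy(m, 0, flat, 0, m.Length * sizeof(int));
        return flat;
    }
}
```

For a 2 x 3 array, the three elements of row 0 come first, followed by the three elements of row 1. In other words the last index varies fastest, so looping over it innermost touches adjacent memory.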

Next, get rid of the member references.  Another thread could theoretically be setting sumOfPixelValues to another array midway through, which does horrible horrible things to optimizations.

public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{          
    uint [,] sums = this.sumOfPixelValues;
    ulong [,] squares = this.sumOfPixelValuesSquared;
    for (int y = 0; y < Height; y++)
    {
        for (int x = 0; x < Width; x++)
        {
            Byte  pixelValue = image.GetPixelBlue(x, y);
            sums[y, x] += pixelValue;
            squares[y, x] += (ulong)(pixelValue * pixelValue);  // cast needed: ulong += int does not compile
        }
    }
}


Now the compiler can generate optimal code for moving through the two output arrays, and after inlining and optimization, the inner loop can step through the image.imagePtr array with a stride of 3 instead of recalculating the offset all the time.  Now an unsafe version for good measure, doing the optimizations that I think .NET ought to be smart enough to do but probably isn't:

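unsafe public void PopulatePixelValueMatrices(GenericImage image,int Width, int Height)
{          
    byte* scanline = image.imagePtr;
    // Pin both output arrays; element types match the previous version.
    // Assumes both arrays are exactly Height x Width, so rows are contiguous.
    fixed (uint* sumsStart = &this.sumOfPixelValues[0,0])
    fixed (ulong* squaresStart = &this.sumOfPixelValuesSquared[0,0])
    {
        // A fixed pointer is read-only, so advance copies of it instead.
        uint* sums = sumsStart;
        ulong* squares = squaresStart;
        for (int y = 0; y < Height; y++)
        {
            byte* blue = scanline;
            for (int x = 0; x < Width; x++)
            {
                byte pixelValue = *blue;
                *sums += pixelValue;
                *squares += (ulong)(pixelValue * pixelValue);
                blue += 3;
                sums++;
                squares++;
            }
            scanline += image.stride;
        }
    }
}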

                                                                        
                                                        
            
            
              
                
余生分开走
2021-02-05 23:19

The only way I can think of to speed it up is to do some of the additions in parallel. With matrices of your size, the gain might outweigh the threading overhead.
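
As a rough sketch of what that could look like, the following splits the work by row with Parallel.For. It assumes the pixel data is available as a plain byte[] with 3 bytes per pixel and blue at offset 0, and that the output arrays are uint[,] and ulong[,]; the class and parameter names are hypothetical, and this is not the code from the question.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelRowsSketch
{
    // Adds the blue channel of one image into the running sums.
    // 'pixels' is assumed to hold 3 bytes per pixel (blue first),
    // with 'stride' bytes per scan line.
    public static void Accumulate(byte[] pixels, int stride, int width, int height,
                                  uint[,] sums, ulong[,] squares)
    {
        // Each row is written by exactly one iteration, so no two
        // threads touch the same output element and no locks are needed.
        Parallel.For(0, height, y =>
        {
            int offset = y * stride;
            for (int x = 0; x < width; x++)
            {
                byte v = pixels[offset + x * 3];
                sums[y, x] += v;
                squares[y, x] += (ulong)(v * v);
            }
        });
    }
}
```

Whether this actually wins depends on the image size and the number of cores, so it is worth timing against the single-threaded loop before committing to it.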
                                                                        
                                                        
            
            
              
                