Bash script that analyzes report files


I have the following bash script which I will use to analyze all report files in the current directory:

#!/bin/bash    


# methods
analyzeStructuralErrors()
{


                      
3 Answers

野的像风 (2021-01-22 09:10)

Bash has one-dimensional arrays that are indexed by integers, and Bash 4 adds associative arrays. That's it for data structures. AWK has one-dimensional associative arrays and fakes its way through two-dimensional arrays. If you need a data structure more advanced than that, you'll need to use another language, such as Python.
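
As a minimal sketch of the two Bash array types mentioned above: the variable names and sample values here are made up for illustration, and the associative array assumes Bash 4 or later.

```shell
#!/bin/bash

# Indexed array: keys are integers starting at 0.
issues=("1143-22-5" "2211-08-1")
issues+=("1143-30-2")            # append one element

# Associative array (Bash 4+): keys are arbitrary strings.
declare -A count
for issue in "${issues[@]}"
do
    id=${issue%%-*}              # text before the first "-"
    count[$id]=$(( ${count[$id]:-0} + 1 ))
done

echo "number of issues: ${#issues[@]}"
echo "1143 seen ${count[1143]} time(s)"
```

Anything beyond this, such as an array of arrays, has to be emulated with naming tricks or string keys, which is the limitation the answer is pointing at.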

That said, here's a rough outline of how you might parse the data you've shown.

#!/bin/bash

# methods
analyzeStructuralErrors()
{
    local f=$1
    local Xpat="Error Code for Issue X"
    local notXpat="Error Code for Issue [^X]"
    local flag=false    # true while inside an "Issue X" section
    local line issue field
    local -a columns=() issues=() array=()

    while IFS= read -r line
    do
        if [[ $line =~ $Xpat ]]
        then
            flag=true
        elif [[ $line =~ $notXpat ]]
        then
            flag=false
        elif $flag && [[ $line =~ , ]]
        then
            # columns will be overwritten if there is more than one X section
            IFS=, read -ra columns <<< "$line"
        elif $flag && [[ $line =~ - ]]
        then
            issues+=("$line")
        else
            echo "unrecognized data line"
            echo "$line"
        fi
    done < "$f"    # read from the report file, not from stdin

    for issue in "${issues[@]}"
    do
        IFS=- read -ra array <<< "$issue"
        # do something with ${array[0]}, ${array[1]}, etc.
        # or iterate
        for field in "${array[@]}"
        do
            : # do something with $field
        done
    done
}

# main
find . -name "*_report*.txt" | while IFS= read -r f
do
    echo "Processing $f"
    analyzeStructuralErrors "$f"
done

                                                                        
                                                        
            
            
              
                
情话喂你 (2021-01-22 09:12)
              
            
            
                                                                       
Below is a working awk implementation that uses its pseudo-multidimensional arrays. I've included sample output to show you how it looks. I took the liberty of adding a 'Count' column to show how many times a given "Issue" was hit for each Error Code.

#!/bin/bash

awk '
 /Error Code for Issue/ {
   errCode[currCode=$5]=$5
 }
 /^ +[0-9-]+$/ {
   split($0, tmpArr, "-")
   error[errCode[currCode],tmpArr[1]]++
 }
 END {
   for (code in errCode) {
     printf("Error Code: %s\n", code)
     for (item in error) {
       split(item, subscr, SUBSEP)
       if (subscr[1] == code) {
         printf("\tIssue: %s\tCount: %s\n", subscr[2], error[item])
       }
     }
   }
 }
' *_report*.txt


Output

$ ./report.awk
Error Code: B
        Issue:    1212  Count: 3
Error Code: X
        Issue:    2211  Count: 1
        Issue:    1143  Count: 2
Error Code: Y
        Issue:    2961  Count: 1
        Issue:    6666  Count: 1
        Issue:    5555  Count: 2
        Issue:    5911  Count: 1
        Issue:    4949  Count: 1
Error Code: Z
        Issue:    2222  Count: 1
        Issue:    1111  Count: 1
        Issue:    2323  Count: 2
        Issue:    3333  Count: 1
        Issue:    1212  Count: 1
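
To make the "pseudo-multidimensional" part concrete: awk joins the subscripts of arr[a,b] into a single string key separated by SUBSEP, which is why the END block above has to split each key to recover the code and the issue. A small sketch, using made-up "code issue" input lines rather than the real report format (count_pairs is a hypothetical helper name):

```shell
#!/bin/bash

# count_pairs reads "code issue" pairs on stdin and counts each pair,
# using awk's comma subscript, which builds the key: code SUBSEP issue.
count_pairs() {
    awk '
    { seen[$1, $2]++ }
    END {
        for (key in seen) {
            split(key, part, SUBSEP)   # part[1] = code, part[2] = issue
            printf("%s %s %d\n", part[1], part[2], seen[key])
        }
    }' | sort                          # for-in order is unspecified in awk
}

printf '%s\n' "X 1143" "X 2211" "X 1143" "Y 5555" | count_pairs
```

The pipe through sort is there because awk does not guarantee any particular order when looping over array keys, so the grouping in the sample output above should not be relied on without it.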

                                                                        
                                                        
            
            
              
                
深忆病人 (2021-01-22 09:17)
              
            
            
                                                                       
As suggested by Dave Jarvis, awk:

- handles this better than bash
- is fairly easy to learn
- is likely available wherever bash is available


I've never had to look farther than The AWK Manual. 

It would make things easier if you used a consistent field separator for both the list of column names and the data. Perhaps you could do some pre-processing in a bash script using sed before feeding the data to awk. Anyway, take a look at the sections on multi-dimensional arrays and reading multiple lines in the manual.
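
As a rough sketch of that pre-processing idea, the following uses sed to turn dash-separated data lines into comma-separated ones so that awk can use a single field separator. The sample lines, the normalize helper name, and the assumption that data lines consist only of digits and dashes are illustrative, not taken from the actual report files.

```shell
#!/bin/bash

# normalize turns "1143-22-5" style data lines into "1143,22,5" and
# leaves every other line (headers, column names) untouched.
normalize() {
    sed -E '/^ *[0-9]+(-[0-9]+)+ *$/ s/-/,/g'
}

# With one separator, awk can split column names and data the same way.
printf '%s\n' "id,severity,count" "1143-22-5" "2211-08-1" |
    normalize |
    awk -F, 'NR == 1 { next } { total += $3 } END { print "total count:", total }'
```

Restricting the substitution to lines that match the data pattern keeps dashes in header lines intact.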
                                                                        
                                                        
            
            
              
                