Bash script that analyzes report files

伪装坚强ぢ 2021-01-22 08:54

I have the following bash script which I will use to analyze all report files in the current directory:

#!/bin/bash    


# methods
analyzeStructuralErrors()
{ 
         


        
3 Answers
  • 2021-01-22 09:10

    Bash has one-dimensional arrays indexed by integers, and Bash 4 adds associative arrays. That's it for data structures. AWK has one-dimensional associative arrays and simulates multidimensional ones. If you need a data structure more advanced than that, you'll need another language, such as Python.
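
To make the two Bash array kinds concrete, here is a minimal sketch. The record strings and the variable names (`issues`, `hits`, `issue_count`) are made up for illustration and are not taken from the asker's report files; the associative array needs Bash 4 or later.

```shell
#!/usr/bin/env bash
# Indexed array: integer subscripts starting at 0.
issues=("1143-1-2" "2211-3-4" "1143-5-6")   # hypothetical sample records
issues+=("5555-7-8")                         # append one element
issue_count=${#issues[@]}

# Associative array (Bash 4+): string subscripts.
declare -A hits
for record in "${issues[@]}"; do
    id=${record%%-*}                         # text before the first hyphen
    hits[$id]=$(( ${hits[$id]:-0} + 1 ))     # count occurrences per id
done

echo "records: $issue_count"
echo "1143 seen ${hits[1143]} time(s)"
```

Anything nested (an array of arrays, for instance) has to be emulated with naming tricks or composite keys, which is usually the point where another language becomes easier.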

    That said, here's a rough outline of how you might parse the data you've shown.

    #!/bin/bash    
    
    # methods
    analyzeStructuralErrors()
    { 
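        local f=$1
        local flag=false    # becomes true while inside an "Issue X" section
        local -a columns=() issues=() array=()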
        local Xpat="Error Code for Issue X"
        local notXpat="Error Code for Issue [^X]"
        while read -r line
        do
            if [[ $line =~ $Xpat ]]
            then
                flag=true
            elif [[ $line =~ $notXpat ]]
            then
                flag=false
            elif $flag && [[ $line =~ , ]]
            then
                # columns could be overwritten if there are more than one X section
                IFS=, read -ra columns <<< "$line"
            elif $flag && [[ $line =~ - ]]
            then
                issues+=("$line")
            else
                echo "unrecognized data line"
                echo "$line"
            fi
        done < "$f"
    
        for issue in "${issues[@]}"
        do
            IFS=- read -ra array <<< "$issue"
            # do something with ${array[0]}, ${array[1]}, etc.
            # or iterate
            for field in "${array[@]}"
            do
                # do something with $field
                :   # no-op placeholder; a loop body needs at least one command
            done
        done
    }
    
    # main
    find . -name "*_report*.txt" | while IFS= read -r f
    do
        echo "Processing $f"
        analyzeStructuralErrors "$f"
    done
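
The splitting step in the outline above relies on setting `IFS` for a single `read` call. A small sketch of that idiom on its own, assuming a hyphen-separated data line and a comma-separated column line (both strings are invented here, not taken from a real report):

```shell
#!/usr/bin/env bash
record="1143-12-7"                  # hypothetical data line
IFS=- read -ra fields <<< "$record" # IFS change applies to this read only
echo "${#fields[@]} fields, first is ${fields[0]}"

header="id,line,column"             # hypothetical column-name line
IFS=, read -ra columns <<< "$header"

# Pair each column name with the field at the same index.
for i in "${!fields[@]}"; do
    echo "${columns[i]}=${fields[i]}"
done
```

Because the assignment is a prefix to `read`, the shell's global `IFS` is left untouched for the rest of the script.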
    
  • 2021-01-22 09:12

    Below is a working awk implementation that uses its pseudo-multidimensional arrays. I've included sample output to show how it looks. I took the liberty of adding a 'Count' column that shows how many times a given "Issue" was hit for each Error Code.

    #!/bin/bash
    
    awk '
     /Error Code for Issue/ {
       errCode[currCode=$5]=$5
     }
     /^ +[0-9-]+$/ {
       split($0, tmpArr, "-")
       error[errCode[currCode],tmpArr[1]]++
     }
     END {
       for (code in errCode) {
         printf("Error Code: %s\n", code)
         for (item in error) {
           split(item, subscr, SUBSEP)
           if (subscr[1] == code) {
             printf("\tIssue: %s\tCount: %s\n", subscr[2], error[item])
           }
         }
       }
     }
    ' *_report*.txt
    

    Output

    $ ./report.awk
    Error Code: B
            Issue:    1212  Count: 3
    Error Code: X
            Issue:    2211  Count: 1
            Issue:    1143  Count: 2
    Error Code: Y
            Issue:    2961  Count: 1
            Issue:    6666  Count: 1
            Issue:    5555  Count: 2
            Issue:    5911  Count: 1
            Issue:    4949  Count: 1
    Error Code: Z
            Issue:    2222  Count: 1
            Issue:    1111  Count: 1
            Issue:    2323  Count: 2
            Issue:    3333  Count: 1
            Issue:    1212  Count: 1
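
For readers unfamiliar with how awk fakes two dimensions, the following reduced sketch may help. It assumes a simplified, hypothetical input of "code issue" pairs rather than the real report layout: a subscript written as `count[a, b]` is stored under the single string key `a SUBSEP b`, and `split(key, part, SUBSEP)` recovers the two parts.

```shell
#!/usr/bin/env bash
# Hypothetical input: "<error code> <issue id>" pairs, one per line.
summary=$(printf '%s\n' 'X 1143' 'X 2211' 'X 1143' 'Y 5555' |
awk '
    { count[$1, $2]++ }               # key is really $1 SUBSEP $2
    END {
        for (key in count) {
            split(key, part, SUBSEP)  # recover the two subscripts
            printf "%s %s %d\n", part[1], part[2], count[key]
        }
    }
' | sort)                             # for-in order is unspecified, so sort
echo "$summary"
```

The `sort` is there only because awk does not guarantee any iteration order for `for (key in array)`, which is also why the order of error codes and issues in a report like the one above can differ between awk implementations.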
    
  • 2021-01-22 09:17

    As suggested by Dave Jarvis, awk:

    • will handle this better than bash
    • is fairly easy to learn
    • is likely available wherever bash is available

    I've never had to look farther than The AWK Manual.

    It would make things easier if you used a consistent field separator for both the list of column names and the data. Perhaps you could do some pre-processing in a bash script using sed before feeding to awk. Anyway, take a look at multi-dimensional arrays and reading multiple lines in the manual.
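
As a rough sketch of that pre-processing idea, assuming the column names are comma-separated and the data lines are hyphen-separated (the sample text below is invented, not the asker's actual file), sed can rewrite the data lines so awk sees one separator throughout:

```shell
#!/usr/bin/env bash
# Hypothetical report fragment: comma-separated header, hyphen-separated data.
sample='Error Code for Issue X
id,line,column
1143-12-7
2211-40-3'

# Turn hyphens into commas, but only on lines made up of digits and hyphens.
normalized=$(printf '%s\n' "$sample" | sed '/^[0-9-]*$/ s/-/,/g')

# With one separator, awk can treat header and data rows alike.
second_fields=$(printf '%s\n' "$normalized" | awk -F, 'NF == 3 { print $2 }')
echo "$second_fields"
```

If the real data lines carry leading spaces or other characters, the sed address would need adjusting to match them.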
