awk with two regex conditions: restructuring a convoluted transaction list (CSV)

北恋 2021-01-23 22:34

My original input file is a booking transaction list. I am interested in the lines in two sections: a) transactions and b) refunds. These are always at the bottom

1 Answer
  • 2021-01-23 22:53
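    awk '
      # Header line: drop the first field and append a "type" column.
      NR == 1 {
        $1 = ""
        print $0, "type"
        type = "transaction"
        next
      }
      # Start of the refunds section: the dash stands in for the missing fee.
      $1 == "refunds" {
        print ""
        type = "- refund"
      }
      # Data rows are indented and have at least four fields.
      /^ / && NF > 3 {
        print $0, type
      }' input.txt | column -t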
    

    Outputs:

    date        via         Details  payment  fee   type
    28-02-2015  invoice     txn1     44.1     0.19  transaction
    28-02-2015  invoice     txn2     27.7     0.19  transaction
    07-03-2015  invoice     txn3     43.1     0.19  transaction
    09-03-2015  invoice     txn4     36.8     0.19  transaction
    12-03-2015  invoice     txn5     26       0.19  transaction
    13-03-2015  invoice     txn6     43.7     0.19  transaction
    13-03-2015  invoice     txn7     25.6     0.19  transaction
    15-03-2015  creditcard  txn8     70.8     0.19  transaction
    18-12-2014  invoice     txn0     16       -     refund
    

    I'm running this through column -t to line up the columns, though that removes the blank line added before the refund. Another difference is the dash in the refund's "fee" column, which column -t needs in order to align that row correctly.
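
To see why the dash matters: column -t splits each line on whitespace, so a refund row without a placeholder has one field fewer than the header, and its type would slide into the fee column. A small check using awk's NF on a made-up refund row (the row is an assumption modelled on the output above, not taken from the original input):

```shell
# Hypothetical refund row, with and without the "-" fee placeholder.
with_dash=$(printf ' 18-12-2014 invoice txn0 16 - refund\n' | awk '{ print NF }')
without_dash=$(printf ' 18-12-2014 invoice txn0 16 refund\n' | awk '{ print NF }')

# The header "date via Details payment fee type" has 6 fields.
echo "with dash: $with_dash fields, without dash: $without_dash fields"
```

Only the row with the dash matches the six header fields.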

    In the awk code, if the record number (NR, which here is the line number) is 1, we blank out the first field, print the rest plus "type", set the type to "transaction", and skip to the next line. If the first field of a line is "refunds", we print a blank line and change the type to "- refund" (there's no fee, so the dash stands in for it). Finally, if the line has a leading space and the number of fields (NF) is 4 or more, we print the line plus the type.
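
One detail worth seeing in isolation is what `$1 = ""` does: awk rebuilds the record using the output field separator, so the emptied first field leaves a leading space behind. A minimal sketch, assuming a header line like the one below (the original input is not shown, so this line is hypothetical):

```shell
# Hypothetical header line; the real input file may differ.
header=$(printf 'transactions date via Details payment fee\n' |
  awk '{ $1 = ""; print $0, "type" }')

# Brackets make the leading space visible.
printf '[%s]\n' "$header"
```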

    The awk code can all go on one line if you separate the statements inside each action with semicolons.
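
As a rough illustration, the same program written on one line might look like this. The sample file is an assumption made up for the sketch (a header whose first word is the section name, indented data rows, and a bare "refunds" line), since the real input.txt is not shown:

```shell
# Hypothetical sample input; the real file layout may differ.
sample=$(mktemp)
cat > "$sample" <<'EOF'
transactions date via Details payment fee
 28-02-2015 invoice txn1 44.1 0.19
 15-03-2015 creditcard txn8 70.8 0.19
refunds
 18-12-2014 invoice txn0 16
EOF

# Same three rules as the multi-line version, statements joined with semicolons.
one_line=$(awk 'NR == 1 { $1 = ""; print $0, "type"; type = "transaction"; next } $1 == "refunds" { print ""; type = "- refund" } /^ / && NF > 3 { print $0, type }' "$sample")

printf '%s\n' "$one_line"
rm -f "$sample"
```

Piping the result through column -t, as in the answer, lines the columns up.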
