Question
My original input file is a booking transaction list. I am interested in the lines in two sections: a) transactions and b) refunds. These sections are always at the bottom of the CSV files and are structured.
I can skip all lines above the transaction section with a regex condition such as /transaction/ {print}.
I would like to add a column containing the string "transaction" or "refund", depending on the section of the CSV, so that I know whether a row is a transaction or a refund. Something like:
if ($2 == "transaction") { $7 = "transaction" }
if ($2 == "refunds") { $7 = "refunds" }
I share the CSV and script.awk on my gdrive and hope this is acceptable: convoluted transaction list to be structured
transaction  date        via         Details  payment  fee
             28-02-2015  invoice     txn1     44.1     0.19
             28-02-2015  invoice     txn2     27.7     0.19
             07-03-2015  invoice     txn3     43.1     0.19
             09-03-2015  invoice     txn4     36.8     0.19
             12-03-2015  invoice     txn5     26       0.19
             13-03-2015  invoice     txn6     43.7     0.19
             13-03-2015  invoice     txn7     25.6     0.19
             15-03-2015  creditcard  txn8     70.8     0.19
             Sum                              317.8    1.52
refunds      Datum       via         Details  payment  1.52
             18-12-2014  invoice     txn0     16
             Sum                              16
My intended outcome is this:
date        via         Details  payment  fee   type
28-02-2015  invoice     txn1     44.1     0.19  transaction
28-02-2015  invoice     txn2     27.7     0.19  transaction
07-03-2015  invoice     txn3     43.1     0.19  transaction
09-03-2015  invoice     txn4     36.8     0.19  transaction
12-03-2015  invoice     txn5     26       0.19  transaction
13-03-2015  invoice     txn6     43.7     0.19  transaction
13-03-2015  invoice     txn7     25.6     0.19  transaction
15-03-2015  creditcard  txn8     70.8     0.19  transaction
18-12-2014  invoice     txn0     16             refund
My snippet at the moment:
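BEGIN {
    OFS = FS = ";"
    # header row for the restructured output
    print "date", "payment option", "details", "payment", "fee", "type"
}
# from the section header down to the next empty line
/^transaction/,/^$/ {
    if ($3 == "via") { next }   # skip the column-title row
    if ($6 == "Sum") { next }   # skip the totals row
    print $2 FS $3 FS $4 FS $5 FS $6 FS $7
}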
Answer 1:
awk '
NR == 1 {
    $1 = ""
    print $0, "type"
    type = "transaction"
    next
}
$1 == "refunds" {
    print ""
    type = "- refund"
}
/^ / && NF > 3 {
    print $0, type
}' input.txt | column -t
Outputs:
date        via         Details  payment  fee   type
28-02-2015  invoice     txn1     44.1     0.19  transaction
28-02-2015  invoice     txn2     27.7     0.19  transaction
07-03-2015  invoice     txn3     43.1     0.19  transaction
09-03-2015  invoice     txn4     36.8     0.19  transaction
12-03-2015  invoice     txn5     26       0.19  transaction
13-03-2015  invoice     txn6     43.7     0.19  transaction
13-03-2015  invoice     txn7     25.6     0.19  transaction
15-03-2015  creditcard  txn8     70.8     0.19  transaction
18-12-2014  invoice     txn0     16       -     refund
I'm running this through column -t in order to line up the columns, though that removes the blank line added before the refund. Another difference from your intended output is the dash in the refund's "fee" column, which is necessary for column -t to work correctly.
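If the blank line and an empty fee column should both survive, one option is to let awk do the alignment itself with printf instead of piping through column -t. What follows is only a sketch under assumptions: the column widths (10, 10, 7, 7, 4) are picked to fit the sample data, aligned.txt is a hypothetical output file name, and input.txt is recreated from the question's sample with the data rows indented by two spaces, since the /^ / test relies on leading whitespace.

```shell
# Recreate the sample from the question. The two-space indentation of the
# data rows is an assumption based on the /^ / test in the awk program.
cat > input.txt <<'EOF'
transaction date via Details payment fee
  28-02-2015 invoice txn1 44.1 0.19
  28-02-2015 invoice txn2 27.7 0.19
  07-03-2015 invoice txn3 43.1 0.19
  09-03-2015 invoice txn4 36.8 0.19
  12-03-2015 invoice txn5 26 0.19
  13-03-2015 invoice txn6 43.7 0.19
  13-03-2015 invoice txn7 25.6 0.19
  15-03-2015 creditcard txn8 70.8 0.19
  Sum 317.8 1.52
refunds Datum via Details payment 1.52
  18-12-2014 invoice txn0 16
  Sum 16
EOF

awk '
# One shared format string; the widths are assumptions fitted to the sample.
BEGIN { fmt = "%-10s  %-10s  %-7s  %-7s  %-4s  %s\n" }
# Header line: drop the section name in $1 and append the new column title.
NR == 1 { printf fmt, $2, $3, $4, $5, $6, "type"; type = "transaction"; next }
# Start of the refunds section: emit a blank line and switch the label.
$1 == "refunds" { print ""; type = "refund" }
# Indented data rows; a missing fee ($5) is printed as padding only.
/^ / && NF > 3 { printf fmt, $1, $2, $3, $4, $5, type }
' input.txt > aligned.txt

cat aligned.txt
```

Because printf pads each field itself, no placeholder dash is needed for the refund row, at the cost of hard-coding the widths.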
In the awk code, if the record number (line number, NR) is 1, we blank out the first field, print the rest of the line plus "type", set the type to "transaction", and move on to the next line. If a line's first field is "refunds", we print a blank line and change the type to "- refund" (since there's no fee, we indicate that with a dash). Finally, if a line has leading spaces and the number of fields (NF) is more than 3, we print the line plus the type.
The awk code can be all on one line if you use semicolons between commands inside the actions.
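For illustration, here is the same program collapsed onto a single line, with semicolons separating the statements inside each action. This is a sketch: input.txt is recreated from the question's sample (the two-space indentation of the data rows is an assumption), oneline.txt is a hypothetical output file name, and the pipe through column -t is left out so that the raw awk output, including the blank line before the refund, stays visible.

```shell
# Recreate the sample from the question. The two-space indentation of the
# data rows is an assumption based on the /^ / test in the awk program.
cat > input.txt <<'EOF'
transaction date via Details payment fee
  28-02-2015 invoice txn1 44.1 0.19
  28-02-2015 invoice txn2 27.7 0.19
  07-03-2015 invoice txn3 43.1 0.19
  09-03-2015 invoice txn4 36.8 0.19
  12-03-2015 invoice txn5 26 0.19
  13-03-2015 invoice txn6 43.7 0.19
  13-03-2015 invoice txn7 25.6 0.19
  15-03-2015 creditcard txn8 70.8 0.19
  Sum 317.8 1.52
refunds Datum via Details payment 1.52
  18-12-2014 invoice txn0 16
  Sum 16
EOF

# Same three rules as the multi-line version; newlines inside each action
# are replaced by semicolons.
awk 'NR == 1 { $1 = ""; print $0, "type"; type = "transaction"; next } $1 == "refunds" { print ""; type = "- refund" } /^ / && NF > 3 { print $0, type }' input.txt > oneline.txt

cat oneline.txt
```

Appending | column -t to the awk command lines the columns up as in the multi-line version.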
Source: https://stackoverflow.com/questions/42977022/awk-two-regex-conditions-structure-convoluted-complex-transactions-list-csv