gawk

How to handle 3 files with awk?

夙愿已清 submitted on 2019-12-03 08:46:18

OK, so after spending two days I am not able to solve it, and I am almost out of time now. It might be a very silly question, so please bear with me. My awk script does something like this:

    BEGIN { n = 50; i = n }
    FNR == NR {
        # Read file-1, which has just 1 column
        ids[$1] = int(i++ / n)
        next
    }
    {
        # Read file-2, which has 4 columns
        # Do something
        next
    }
    END { ... }

It works fine. But now I want to extend it to read 3 files. Let's say that instead of hard-coding the value of "n", I need to read a properties file and set the value of "n" from that. I found this question and have tried something like this:

    BEGIN { n = 0; i = 0;

How to use sed, awk, or gawk to print only what is matched?

大憨熊 submitted on 2019-12-02 14:46:36

I see lots of examples and man pages on how to do things like search-and-replace using sed, awk, or gawk. But in my case I have a regular expression that I want to run against a text file to extract a specific value; I don't want to do search-and-replace. This is being called from bash. Let's use an example.

Example regular expression:

    .*abc([0-9]+)xyz.*

Example input file:

    a
    b
    c
    abc12345xyz
    a
    b
    c

As simple as this sounds, I cannot figure out how to call sed/awk/gawk correctly. What I was hoping to do from within my bash script is:

    myvalue=$( sed <...something...> input.txt )

Things I
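One way to print only the captured group is sed's classic trick: substitute the whole matching line with the backreference and print only on a successful substitution. A sketch using the example above (the file name is an assumption):

```shell
printf 'a\nb\nc\nabc12345xyz\na\nb\nc\n' > input.txt

# -n suppresses default output; the trailing "p" prints only substituted lines.
myvalue=$(sed -n 's/.*abc\([0-9][0-9]*\)xyz.*/\1/p' input.txt)
echo "$myvalue"

# gawk alternative: match() with a capture array (a gawk extension):
# gawk 'match($0, /abc([0-9]+)xyz/, m) { print m[1] }' input.txt
```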

Calculate date difference between $2,$3 from file in awk

倖福魔咒の submitted on 2019-12-02 11:16:12

Question: I would need your help. A file containing only dates, file.txt:

    P1,2013/jul/9,2013/jul/14
    P2,2013/jul/14,2013/jul/6
    P3,2013/jul/7,2013/jul/5

Display the output like this:

    P1,2013/jul/9,2013/jul/14,5days
    P2,2013/jul/14,2013/jul/6,8days
    P3,2013/jul/7,2013/jul/5,2days

Answer 1:

    awk '
    BEGIN {
        months = "jan feb mar apr may jun jul aug sep oct nov dec"
        OFS = FS = ","
    }
    function date2time(date,    a, mon) {
        split(date, a, "/")
        mon = 1 + (index(months, a[2]) - 1) / 4
        return mktime(a[1] " " mon " " a[3] " 0 0 0")
    }
    function abs
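The answer above is cut off at its abs() helper, and its mktime() call is a gawk extension. A self-contained sketch of the same idea that works in any awk, using a Julian-day-style count instead (the constant offsets cancel when taking differences; the helper names are mine):

```shell
printf 'P1,2013/jul/9,2013/jul/14\nP2,2013/jul/14,2013/jul/6\nP3,2013/jul/7,2013/jul/5\n' > file.txt

awk '
BEGIN {
    months = "jan feb mar apr may jun jul aug sep oct nov dec"
    OFS = FS = ","
}
# Day count since a fixed (arbitrary) origin; valid for date differences.
function days(date,    a, y, m, d) {
    split(date, a, "/")
    y = a[1]; m = 1 + (index(months, a[2]) - 1) / 4; d = a[3]
    if (m <= 2) { y--; m += 12 }              # fold Jan/Feb into the previous year
    return int(365.25 * y) + int(30.6001 * (m + 1)) + d
}
function abs(x) { return x < 0 ? -x : x }
{ print $0, abs(days($2) - days($3)) "days" }
' file.txt
```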

AWK - Is it possible to Breakdown a log file by a distinct field && by hour

≯℡__Kan透↙ submitted on 2019-12-02 07:41:37

Question: I am trying to find out if it is possible, with awk alone, to pass in a log file and have awk output each distinct message with a breakdown by hour (00-23) and a count for that particular hour versus distinct message. Example output requested:

    Message1
    00 13
    01 30
    ...
    23 6
    Message2
    00 50
    01 10
    ...
    23 120

etc., etc. The input file would look a little something like the following:

    blah,blah 2016-06-24 00:30:54 blah Message1 7 rand rand2
    2016-06-24 00:40:12 blah Message2 35 rand rand2
    2016-06-24 00:42:15 blah Message2 12 rand rand2
    2016-06-24 00:58:01 blah Message1 5 rand rand2
    2016
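A sketch of the usual approach: accumulate a two-dimensional count keyed by message and hour, then print it out at the end. The field positions below (timestamp in $1-$2, message name in $4) are assumptions about the log layout:

```shell
cat > app.log <<'EOF'
2016-06-24 00:40:12 blah Message2 35 rand rand2
2016-06-24 00:42:15 blah Message2 12 rand rand2
2016-06-24 00:58:01 blah Message1 5 rand rand2
2016-06-24 01:03:44 blah Message1 9 rand rand2
EOF

awk '
{
    hour = substr($2, 1, 2)               # "00:40:12" -> "00"
    count[$4 SUBSEP hour]++
}
END {
    for (key in count) {
        split(key, k, SUBSEP)
        print k[1], k[2], count[key]      # message, hour, count
    }
}
' app.log | sort
```

The final `sort` only makes the for-in iteration order deterministic; producing the grouped "Message / hour count" layout from the example takes one extra loop over the collected message names.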

awk group by multiple columns and print max value with non-primary key

北战南征 submitted on 2019-12-02 02:25:49

Question: I'm new to this site and trying to learn awk. I'm trying to find the maximum value of field 3, grouping by field 1, and print all the fields of the row holding that maximum. Field 2 contains a time, which means that for each item1 there are 96 values of field 2, field 3 and field 4. Input file (comma separated):

    item1,00:15,10,30
    item2,00:45,20,45
    item2,12:15,30,45
    item1,00:30,20,56
    item3,23:00,40,44
    item1,12:45,50,55
    item3,11:15,30,45

Desired output:

    item1,12:45,50,55
    item2,12:15,30,45
    item3,11:15,30,45

What I tried so

awk group by multiple columns and print max value with non-primary key

半腔热情 submitted on 2019-12-02 00:06:51

I'm new to this site and trying to learn awk. I'm trying to find the maximum value of field 3, grouping by field 1, and print all the fields of the row holding that maximum. Field 2 contains a time, which means that for each item1 there are 96 values of field 2, field 3 and field 4. Input file (comma separated):

    item1,00:15,10,30
    item2,00:45,20,45
    item2,12:15,30,45
    item1,00:30,20,56
    item3,23:00,40,44
    item1,12:45,50,55
    item3,11:15,30,45

Desired output:

    item1,12:45,50,55
    item2,12:15,30,45
    item3,11:15,30,45

What I tried so far:

    BEGIN { FS = OFS = "," }
    {
        if (a[$1] < $3) {
            a[$1] = $3
        }
    }
    END {
        for (i in a) {
            print i, a[i]
        }
    }

but this only
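The attempt above keeps only the maximum $3, so the other fields are gone by the time END runs. A common fix is to store the whole record whenever a new maximum is seen. A sketch (note that for item3 this prints the 23:00 row, since its $3 of 40 is the larger one; the desired output above lists the other row):

```shell
cat > items.csv <<'EOF'
item1,00:15,10,30
item2,00:45,20,45
item2,12:15,30,45
item1,00:30,20,56
item3,23:00,40,44
item1,12:45,50,55
item3,11:15,30,45
EOF

awk -F, '
$3 > max[$1] { max[$1] = $3; row[$1] = $0 }   # remember the best row per group
END { for (i in row) print row[i] }
' items.csv | sort
```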

Quantifiers in a regular expression used with awk behave unexpectedly

﹥>﹥吖頭↗ submitted on 2019-12-01 23:34:34

I want to process this list (of course this is just an excerpt):

    1    S3 -> PC-8-Set
    2    S3 -> PC-850-Set
    3    S3 -> ANSI-Set
    4    S3 -> 7-Bit-NRC
    5    PC-8-Set -> S3
    6    PC-850-Set -> S3
    7    ANSI-Set -> S3

This is what I did:

    awk -F '[[:blank:]]+' '{printf ("%s ", $2)}' list

This is what I got:

    1 2 3 4 5 6 7

Now I thought the quantifier + was equivalent to {1,}, but when I changed the line to

    awk -F '[[:blank:]]{1,}' '{printf ("%s ", $2)}' list

I got just blanks, and the whole line was read into $1. Can someone explain this behaviour, please? I'm thankful for every answer!

Jotne: Try awk --re-interval -F '[[:blank

Bash: Parse CSV with quotes, commas and newlines

笑着哭i submitted on 2019-12-01 16:39:56

Question: Say I have the following CSV file:

    id,message,time
    123,"Sorry, This message has commas and newlines",2016-03-28T20:26:39
    456,"It makes the problem non-trivial",2016-03-28T20:26:41

I want to write a bash command that will return only the time column, i.e.:

    time
    2016-03-28T20:26:39
    2016-03-28T20:26:41

What is the most straightforward way to do this? You can assume the availability of standard unix utils such as awk, gawk, cut, grep, etc. Note the presence of "", which escapes , and newline

AWK: go through the file twice, doing different tasks

怎甘沉沦 submitted on 2019-12-01 16:02:00

I am processing a fairly big collection of Tweets, and I'd like to obtain, for each tweet, its mentions (other users' names, prefixed with an @), if the mentioned user is also in the file:

    users = new Dictionary()
    for each line in file:
        username = get_username(line)
        userid = get_userid(line)
        users.add(key = userid, value = username)

    for each line in file:
        mentioned_names = get_mentioned_names(line)
        mentioned_ids = mentioned_names.map(x => if x in users: users[x] else null)
        print "$line | $mentioned_ids"

I was already processing the file with GAWK, so instead of processing it again in Python or
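The standard awk idiom for two passes over one file is to name the file twice on the command line and use FNR==NR to detect the first pass. A minimal sketch of the pseudocode above (the file layout and the @-mention format are assumptions):

```shell
cat > tweets.txt <<'EOF'
alice 100 hello @bob
bob 200 hi @alice @carol
EOF

# Pass 1 (FNR==NR): record the known usernames.
# Pass 2: append each mention that belongs to a known user.
awk '
FNR == NR { known[$1] = 1; next }
{
    out = $0 " |"
    for (i = 1; i <= NF; i++)
        if (sub(/^@/, "", $i) && ($i in known))
            out = out " " $i
    print out
}
' tweets.txt tweets.txt
```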

Is there a way to completely delete fields in awk, so that extra delimiters do not print?

非 Y 不嫁゛ submitted on 2019-12-01 15:14:26

Consider the following command:

    gawk -F"\t" "BEGIN{OFS=\"\t\"}{$2=$3=\"\"; print $0}" Input.tsv

When I set $2 = $3 = "", the intended effect is the same as writing:

    print $1,$4,$5,...,$NF

However, what actually happens is that I get two empty fields, with the extra field delimiters still printing. Is it possible to actually delete $2 and $3?

Note: if this were on Linux in bash, the statement above would be written as follows, but Windows does not handle single quotes well in cmd.exe:

    gawk -F'\t' 'BEGIN{OFS="\t"}{$2=$3=""; print $0}' Input.tsv

This is an oldie but goodie. As
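One common fix (a sketch, not necessarily the truncated answer) is to shift the remaining fields left and shrink NF; decrementing NF rebuilds the record without the deleted columns:

```shell
printf 'a\tb\tc\td\te\n' > Input.tsv

awk -F'\t' '
BEGIN { OFS = "\t" }
{
    # Move $4..$NF down two slots, then drop the last two fields.
    for (i = 2; i <= NF - 2; i++) $i = $(i + 2)
    NF -= 2
    print
}
' Input.tsv
```

For the sample line `a b c d e` this prints `a d e` with single tabs between the surviving fields.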