gawk

How to only add unique elements to an array in awk from several input text files

Submitted by 巧了我就是萌 on 2021-01-29 07:32:43
Question: As the topic suggests, how do I read in information from multiple text files and add each element to an array only once, regardless of how many times it occurs across the different text files? I have started with this script, which reads in and prints out all elements in the order that they occur in the different documents. For example, take a look at these 3 different text files containing the following data. File 1: 2011-01-22 22:12 test1 22 1312 75 13.55 1399 2011-01-23 22:13 test4 22 1112 72 12.55

GNU awk, FPAT and trouble with a duplicating FS

Submitted by 妖精的绣舞 on 2021-01-28 04:45:35
Question: I have a file: $ cat file 1,,"3.1,3.2",4,5 and because of the quotes I'm using FPAT = "([^,]*)|(\"[^\"]+\")" instead of just FS="," . I'm trying to replace a field, let's say $4 , with another value: $ gawk 'BEGIN{FPAT="([^,]*)|(\"[^\"]+\")"; OFS=","}{$4="new"; print}' file 1,,"3.1,3.2",new,,5 $ # right here ^ but I get a duplicated , ( OFS ) after the replaced field. It gets duplicated when modifying any field except the last field or empty fields. Are you guys seeing this or is it just me in

awk - concatenate two string variables and assign to a third

Submitted by 孤街醉人 on 2020-11-25 05:53:39
Question: In awk, I have 2 fields: $1 and $2. They are both strings that I want to concatenate and assign to a variable. Answer 1: Just use var = var1 var2 and it will automatically concatenate the vars var1 and var2 : awk '{new_var=$1$2; print new_var}' file You can put a space in between with: awk '{new_var=$1" "$2; print new_var}' file Which in fact is the same as using FS , because it defaults to the space: awk '{new_var=$1 FS $2; print new_var}' file Test $ cat file hello how are you i am fine $ awk '
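The test at the end of the answer is cut off; a complete runnable version of the same idea, with an explicit joiner character chosen purely for illustration:

```shell
# Adjacent expressions concatenate in awk; any string literal
# placed between them becomes the joiner.
printf 'hello how are you\n' | awk '{ new_var = $1 "-" $2; print new_var }'
# → hello-how
```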

GAWK: Inverse of strftime() - Convert date string to seconds since epoch timestamp using format pattern

Submitted by 喜欢而已 on 2020-08-24 19:24:13
Question: GNU awk provides the built-in function strftime(), which can convert a timestamp like 1359210984 into Sat 26. Jan 15:36:24 CET 2013 . I couldn't find a function that would do the reverse: seconds = timefromdate("Sat 26. Jan 15:36:24 CET 2013", "%a %d. %b %H:%M:%S CET %Y") or seconds = timefromdate("2013-01-26 15:36:24", "%Y-%m-%d %H:%M:%S") where seconds is then 1359210984 . So the date string should be convertible by a format pattern. I'd like to do this in gawk only. Edit 1: I'd like to convert

Min-Max Normalization using AWK

Submitted by 雨燕双飞 on 2020-08-05 20:01:10
Question: I don't know why I am unable to loop through all the records; currently it only runs for the last record and prints the normalization for it. Normalization formula: New_Value = (value - min[i]) / (max[i] - min[i]) Program: { for(i = 1; i <= NF; i++) { if (min[i]==""){ min[i]=$i;} #initialise min if (max[i]==""){ max[i]=$i;} #initialise max if ($i<min[i]) { min[i]=$i;} #new min if ($i>max[i]) { max[i]=$i;} #new max } } END { for(j = 1; j <= NF; j++) { normalized_value[j] = ($j - min[j])/(max[j] - min[j])

Using pipe character as a field separator

Submitted by 馋奶兔 on 2020-07-14 12:47:11
Question: I'm trying different commands to process a csv file where the separator is the pipe | character. While those commands do work when the comma is the separator, they throw an error when I replace it with the pipe: awk -F[|] "NR==FNR{a[$2]=$0;next}$2 in a{ print a[$2] [|] $4 [|] $5 }" OFS=[|] file1.csv file2.csv awk "{print NR "|" $0}" file1.csv I tried "|" , [|] , /| to no avail. I'm using Gawk on windows. What am I missing? Answer 1: You tried "|" , [|] and /| . /| does not work because the escape
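In awk itself the pipe is an ordinary character for FS and OFS; the errors come from the surrounding shell seeing an unquoted | as its own pipe operator, and from [|] being a regex bracket expression rather than a string. In a POSIX shell a quoted separator just works, as this sketch shows; on Windows cmd (the asker's environment) the whole program must instead be wrapped in double quotes, with any inner quotes escaped as \".

```shell
# FS splits input on |, OFS re-joins modified records with |.
printf 'a|b|c\n' | awk 'BEGIN { FS = OFS = "|" } { $2 = "X"; print }'
# → a|X|c
```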

gawk regex to find any record having characters other than those specified by a character class in the regex pattern

Submitted by 隐身守侯 on 2020-06-02 23:34:49
Question: I have a list of email addresses in a text file. I have a pattern with character classes that specifies which characters are allowed in the email addresses. Now, from that input file, I want to find only the email addresses that contain characters other than the allowed ones. I am trying to write a gawk command for this, but I am not able to get it to work properly. Here is the gawk that I am trying: gawk -F "," ' $2!~/[[:alnum:]@\.]]/ { print "has invalid chars" }' emails.csv The problem I am facing