I'm trying to compare column 1 from file1 and column 3 from file 2, if they match then print the first column from file1 and the two first columns from file2.
You can use this awk command:
awk -F '[| ]+' -v OFS='\t' 'NR==FNR{a[$4]=$1 OFS $2; next}
$1 in a{print $1, a[$1]}' file2 file1
Cre01.g000100 chromosome_1 99034
Cre01.g000500 chromosome_1 71569
Cre01.g000650 chromosome_1 93952
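To see how the two-pass NR==FNR idiom in this command behaves, here is a small sketch on made-up data. The contents of file1 and file2 below are assumptions, since the real files are not shown here. In particular, the layout of file2, with the ID landing in the fourth field once each line is split on runs of spaces and pipes, is a guess that happens to be consistent with the command.

```shell
# Hypothetical sample data; the real file1/file2 layouts are assumed, not known.
tmp=$(mktemp -d)

# file1: one ID per line (the last ID has no counterpart in file2).
printf '%s\n' 'Cre01.g000100' 'Cre01.g000500' 'Cre01.g009999' > "$tmp/file1"

# file2: chromosome, number, then pipe-separated fields with the ID in field 4
# when splitting on runs of spaces and pipes.
printf '%s\n' \
  'chromosome_1 99034|x|Cre01.g000100|foo' \
  'chromosome_1 71569|x|Cre01.g000500|bar' \
  'chromosome_2 12345|x|Cre02.g000001|baz' > "$tmp/file2"

# Pass 1 (file2, while NR==FNR): store "col1<TAB>col2" keyed by the ID in $4.
# Pass 2 (file1): print the ID plus the stored value when the ID was seen.
out_tab=$(awk -F '[| ]+' -v OFS='\t' 'NR==FNR{a[$4]=$1 OFS $2; next}
$1 in a{print $1, a[$1]}' "$tmp/file2" "$tmp/file1")
printf '%s\n' "$out_tab"

rm -rf "$tmp"
```

Because file1 is read second, the output follows the order of file1, and IDs that never appear in file2 (the hypothetical Cre01.g009999 here) are silently dropped.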
Your middle attempt of the three is closest, but:
1. You don't specify that the field separator is |.
2. You don't assign to a[$1].
3. Your sample output is inconsistent with your desired output (the sample output shows column 1 from file 1 and column 1 from file 2; the desired output is reputedly column 1 from file 1 and columns 1 and 2 from file 2, though this reading depends on taking $3 in file 2 to be the name between two pipe symbols).
Citing the question at the time this answer was created:
… compare column 1 from file1 and column 3 from file 2, if they match then print the first column from file1 and the two first columns from file2.
Desired output:
Cre01.g000100 chromosome_1 99034
Cre01.g000500 chromosome_1 71569
Cre01.g000650 chromosome_1 93952
We can observe that if $3 in file 2 is equal to a value from file 1, then it is as easy to print $3 as a saved value.
So, fixing this up:
awk -F'|' 'NR==FNR { a[$1]=1; next } ($3 in a) { print $3, $1 }' file1 file2
The key change is the assignment to a[$1] (and the -F'|'); the rest is cosmetic and can be tweaked to suit your requirements (since the question is self-inconsistent, it is hard to give a better answer).
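A quick way to check the corrected command is to run it on a couple of throwaway files. The data below is hypothetical: it assumes file1 holds one ID per line and that the ID is the third pipe-separated field in file2, which is one reading of the question rather than something confirmed by it.

```shell
# Hypothetical sample data; the real file1/file2 layouts are assumed, not known.
tmp=$(mktemp -d)

# file1: one ID per line.
printf '%s\n' 'Cre01.g000100' 'Cre01.g000500' 'Cre01.g009999' > "$tmp/file1"

# file2: the ID is the third field when splitting on '|' only.
printf '%s\n' \
  'chromosome_1 99034|x|Cre01.g000100|foo' \
  'chromosome_1 71569|x|Cre01.g000500|bar' \
  'chromosome_2 12345|x|Cre02.g000001|baz' > "$tmp/file2"

# Pass 1 (file1, while NR==FNR): record each ID as an array key.
# Pass 2 (file2): print the ID and the text before the first pipe when $3 is a known ID.
result=$(awk -F'|' 'NR==FNR { a[$1]=1; next } ($3 in a) { print $3, $1 }' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Here file1 is read first, so the output follows the order of file2. If the text before the first pipe ends with a space in the real file2, that space will show up at the end of each output line, because with -F'|' it is part of $1.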