Hi, I have 2 CSVs in the following format (basically a list of email addresses and the number of times we have been emailed by each sender):
file1.csv
Email,Val
Pretty straightforward with Awk!
awk 'BEGIN{FS=OFS=","; printf "Name,Value1,Value2\n"}NR >1 && FNR==NR{map[$1]=$2; next}$1 in map{$(NF+1)=map[$1]; print}' file2 file1
produces
Name,Value1,Value2
email1@email.com,2,3
email2@email.com,4,6
email3@email.com,1,8
email4@email.com,6,2
Set the input and output field separators to , in the BEGIN clause, which is executed before any input lines are processed, and print the final header there as well. The condition FNR==NR is true only while the first file on the command line is being read (file2 in this case); for those lines, build a hash map with the key set to $1 and the value set to $2. The NR>1 check skips that file's header row. Then, on file1, for those lines whose $1 is a key in the map, create a new field $(NF+1), meaning the last field + 1, set it to the stored value, and print the resulting line.
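The same steps can be laid out as a commented, multi-line script, which may be easier to follow than the one-liner. This is only a sketch: the wrapper name merge_counts is made up for illustration, and it assumes each file has exactly one header row followed by email,value lines.

```shell
# merge_counts FILE1 FILE2
# Hypothetical helper name; prints the lines of FILE1 with the matching
# value from FILE2 appended as a third column.
merge_counts() {
    awk '
    BEGIN {
        FS = OFS = ","                # comma-separated input and output
        print "Name,Value1,Value2"    # header of the merged result
    }
    # First file on the command line (FILE2): store value by email.
    # NR > 1 skips the header row of that file.
    NR > 1 && FNR == NR { map[$1] = $2; next }
    # Second file (FILE1): append the stored value as a new last field.
    $1 in map { $(NF + 1) = map[$1]; print }
    ' "$2" "$1"
}
```

Called as merge_counts file1 file2, it should behave like the one-liner; emails that appear in only one of the files are dropped, because only keys found in the map are printed.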
if you want to keep the order
awk to the rescue!
$ awk 'BEGIN {FS=OFS=","}
NR==FNR {a[$1]=$2; next}
FNR==1 {print $1,$2"1",a[$1]"2"; next}
{print $1,$2,a[$1]}' file2 file1
Email,Value1,Value2
email1@email.com,2,3
email2@email.com,4,6
email3@email.com,1,8
email4@email.com,6,2
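Note the order of the files: file2 is read first to fill the lookup array, then file1 is printed line by line with the looked-up value appended.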
Build a loop running through each line of the first file.
In that loop, build another loop comparing each line of the second file to the current line of the first file.
Write the matches to your new file.
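A minimal sketch of that nested-loop idea in plain shell, assuming both files have a one-line header and two comma-separated columns. The function name nested_merge is hypothetical. It rereads the second file once for every line of the first, so it is only reasonable for small files; the awk lookup approach scales much better.

```shell
# nested_merge FILE1 FILE2  (hypothetical helper name)
nested_merge() {
    # Outer loop: every data line of the first file (tail skips the header).
    tail -n +2 "$1" | while IFS=, read -r em1 val1; do
        # Inner loop: rescan the second file, looking for the same email.
        tail -n +2 "$2" | while IFS=, read -r em2 val2; do
            if [ "$em1" = "$em2" ]; then
                printf '%s,%s,%s\n' "$em1" "$val1" "$val2"
            fi
        done
    done
}
```

To write the matches to a new file, redirect the output, for example nested_merge file1.csv file2.csv > merged.csv (merged.csv is just an example name).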
Using the join program:
join -t, -o0,1.2,2.2 -a1 -a2 <(sort <file1.csv) <(sort <file2.csv)
Otherwise, if the files are already sorted and contain the same entries, with bash builtins:
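while
    # Read one line from each file in lockstep (fd 3 = file1.csv, fd 4 = file2.csv).
    IFS=, read -r -u3 em1 val1
    IFS=, read -r -u4 em2 val2
    # Keep looping only while both reads produced an email.
    [[ -n $em1 ]] && [[ -n $em2 ]]
do
    # Quote the right-hand side so it is compared literally, not as a pattern.
    if [[ $em1 = "$em2" ]]; then
        echo "$em1,$val1,$val2"
    else
        echo "ERROR: $em1 <> $em2"
    fi
done 3<file1.csv 4<file2.csv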