Question
I always appreciate the help I get from this website. I would like to rearrange strings based on the index numbers in an index file.
The index numbers are in the first column of the index file (index.txt), and I would like to relocate each "path" according to its index number. Paths with the same index number go on the same row. For example, two paths have index 0, so path_sparc_ifu_dec_in_3826 is placed on the first row and path_sparc_ifu_dec_in_4349 is placed on the same row, next to path_sparc_ifu_dec_in_3826.
index.txt:
0 path_sparc_ifu_dec_in_3826 str DR - -
0 path_sparc_ifu_dec_in_4349 stf DR - -
1 path_sparc_ifu_dec_in_2374 stf DR - -
1 path_sparc_ifu_dec_in_4011 stf DR - -
2 path_sparc_ifu_dec_in_3078 stf DR - -
However, the strings themselves are stored in another file (source.txt), where each "path" has four lines of strings.
source.txt:
path_sparc_ifu_dec_in_3826
dtu_inst_d[14]
dec_fcl_rdsr_sel_pc_d
0.8664
path_sparc_ifu_dec_in_4349
dtu_inst_d[18]
dec_swl_rdsr_sel_thr_d
0.795429
path_sparc_ifu_dec_in_2374
dtu_inst_d[13]
dec_dcl_cctype_d[2]
0.938914
path_sparc_ifu_dec_in_4011
dtu_inst_d[13]
ifu_exu_useimm_d
0.843643
path_sparc_ifu_dec_in_3078
dtu_inst_d[12]
ifu_exu_shiftop_d[2]
0.915818
The desired output is:
path_sparc_ifu_dec_in_3826 path_sparc_ifu_dec_in_4349
dtu_inst_d[14] dtu_inst_d[18]
dec_fcl_rdsr_sel_pc_d dec_swl_rdsr_sel_thr_d
0.8664 0.795429
path_sparc_ifu_dec_in_2374 path_sparc_ifu_dec_in_4011
dtu_inst_d[13] dtu_inst_d[13]
dec_dcl_cctype_d[2] ifu_exu_useimm_d
0.938914 0.843643
path_sparc_ifu_dec_in_3078
dtu_inst_d[12]
ifu_exu_shiftop_d[2]
0.915818
My idea is to (1) combine the two files first and then (2) relocate the path info using the index number, but I don't know how to do this. sed or awk is probably an appropriate tool.
Any help is appreciated.
Best,
Jaeyoung
Answer 1:
A one-line awk solution could be:
awk -F'\t' 'FNR==NR{ind[$2]=$1;next} { if($1 in ind) { l=4*ind[$1]} else {l=l+1}; text[l]=text[l]"\t"$1 } END { for (i = 0; i < length(text); i++) {print substr(text[i],2)} }' index.txt source.txt
Explanation:
-F'\t'
Use tab as the field separator (this assumes the columns of index.txt are separated by tabs).
FNR==NR
True only while the first file (index.txt) is being read, so the two files can be handled differently.
{ind[$2]=$1;next}
Use the first file to build a lookup from path name to index number.
if($1 in ind) { l=4*ind[$1]} else {l=l+1}
"l" is the line number in the output. If the string is a path name found in the index, the line number is index*4. Otherwise it is the previous line number + 1.
text[l]=text[l]"\t"$1
Append the current string to the correct output line.
END { for (i = 0; i < length(text); i++) {print substr(text[i],2)} }
At the end, print everything. The substr is only there to drop the useless leading tab (the first character) of each line.
My output from your data:
path_sparc_ifu_dec_in_3826 path_sparc_ifu_dec_in_4349
dtu_inst_d[14] dtu_inst_d[18]
dec_fcl_rdsr_sel_pc_d dec_swl_rdsr_sel_thr_d
0.8664 0.795429
path_sparc_ifu_dec_in_2374 path_sparc_ifu_dec_in_4011
dtu_inst_d[13] dtu_inst_d[13]
dec_dcl_cctype_d[2] ifu_exu_useimm_d
0.938914 0.843643
path_sparc_ifu_dec_in_3078
dtu_inst_d[12]
ifu_exu_shiftop_d[2]
0.915818
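For anyone whose awk does not accept length() on an array (it is a common extension, but not guaranteed everywhere), here is a rough sketch of the same idea that tracks the largest index instead. It is only a sketch under some assumptions: index.txt is split on any whitespace rather than strictly on tabs, the index numbers are non-negative integers, and the function name relocate_by_index is made up for this example.

```shell
# Hypothetical helper: relocate_by_index INDEX_FILE SOURCE_FILE
relocate_by_index() {
    awk '
    # First file: map each path name to its index and track the largest index.
    FNR == NR {
        ind[$2] = $1
        if ($1 + 0 > max) max = $1 + 0
        next
    }
    # Second file: a path name jumps to output row 4*index;
    # every other line goes on the row right after the previous one.
    {
        if ($1 in ind) { row = 4 * ind[$1] } else { row++ }
        if (row in text) { text[row] = text[row] "\t" $1 } else { text[row] = $1 }
    }
    # Print rows 0 .. 4*(max+1)-1, skipping rows that were never filled.
    END {
        for (i = 0; i < 4 * (max + 1); i++)
            if (i in text) print text[i]
    }
    ' "$1" "$2"
}
```

It would be called as relocate_by_index index.txt source.txt. Because each row is built without a leading tab, no substr is needed when printing.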
Answer 2:
Here is another script that works for me.
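awk '
NR==FNR {T[$2] = $1        # index.txt: map path name -> index number
         MX = $1           # last index seen (assumes the largest index comes last)
         next
        }
$1 in T {IX = T[$1]        # source.txt: a path name selects the current index
        }
        {P[IX, (FNR+3)%4] = P[IX, (FNR+3)%4] "\t" $0   # append to slot 0..3 of that index
        }
END     {for (i=0; i<=MX; i++) for (j=0; j<4; j++) print P[i, j]   # each line starts with a tab
        }
' index.txt source.txt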
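The script above comes without an explanation, so here is a small sketch of the one non-obvious part: (FNR+3)%4 turns a line number in source.txt into a slot 0..3 within its four-line record (it gives the same result as (FNR-1)%4). The function name show_slots is just something made up for this illustration.

```shell
# Hypothetical helper: print line numbers 1..N next to their slot (line+3)%4
show_slots() {
    awk -v n="$1" 'BEGIN { for (i = 1; i <= n; i++) print i, (i + 3) % 4 }'
}
```

Running show_slots 8 lists lines 1 to 4 with slots 0 to 3, and lines 5 to 8 with slots 0 to 3 again. That is why every path name lands in slot 0 and the three lines after it land in slots 1, 2 and 3.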
Source: https://stackoverflow.com/questions/37141953/relocation-strings-using-awk-sed-from-a-index-file