Swap two columns - awk, sed, python, perl

I've got data in a large file (280 columns wide, 7 million lines long!) and I need to swap the first two columns. I think I could do this with some kind of awk for loop, to

9 Answers

情歌与酒

2020-12-01 01:39
This might work for you (GNU sed):
```
sed -i 's/^\([^\t]*\t\)\([^\t]*\t\)/\2\1/' file
```
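
To see what that substitution does, here is a minimal Python sketch of the same pattern applied to a single line. The function name `swap_first_two` and the sample data are made up for illustration. Like the sed expression, it assumes tab-separated input with at least a third column, because the pattern requires a tab after the second field.

```python
import re

# Same idea as the sed expression: capture "field1<TAB>" and "field2<TAB>"
# at the start of the line and emit the two groups in reverse order.
_FIRST_TWO = re.compile(r"^([^\t]*\t)([^\t]*\t)")


def swap_first_two(line: str) -> str:
    """Swap the first two tab-separated fields and leave the rest untouched."""
    return _FIRST_TWO.sub(r"\2\1", line, count=1)


print(repr(swap_first_two("a\tb\tc\td\n")))  # prints 'b\ta\tc\td\n'
```

A line with only two fields has no tab after the second one, so the pattern does not match and the line passes through unchanged.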

甜味超标

2020-12-01 01:39

You could even use "inlined" Python, that is, a Python script embedded in a shell script, but only if you want to do more scripting with Bash before or afterwards. Otherwise it is unnecessarily complex.

Content of script file process.sh:

#!/bin/bash

# inline Python script (the quoted delimiter keeps the shell from expanding $ or \ inside it)
read -r -d '' PYSCR << 'EOSCR'
from __future__ import print_function
import codecs
import sys

encoding = "utf-8"
fn_in = sys.argv[1]
fn_out = sys.argv[2]

# print("Input:", fn_in)
# print("Output:", fn_out)

with codecs.open(fn_in, "r", encoding) as fp_in, \
        codecs.open(fn_out, "w", encoding) as fp_out:
    for line in fp_in:
        # split into two columns and rest
        col1, col2, rest = line.split("\t", 2)
        # swap columns in output
        fp_out.write("{}\t{}\t{}".format(col2, col1, rest))
EOSCR

# ---------------------
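# do setup work?
# e. g. list files for processing
# here the file names are taken from the script's arguments (an assumption;
# the original leaves this step open)
inputfile="$1"
outputfile="$2"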

# call python script with params
python3 -c "$PYSCR" "$inputfile" "$outputfile"

# do some more processing
# e. g. rename outputfile to inputfile, ...

If you only need to swap the columns for a single file, then you can also just create a single Python script and statically define the filenames. Or just use an answer above.
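
A minimal sketch of that standalone variant, in case it helps. The names `swap_columns` and `swap_file` and the file names are hypothetical, and UTF-8, tab-separated input is assumed. Unlike the inline script, it passes lines with fewer than two fields through unchanged instead of raising an error. The demo at the bottom runs on a throwaway file; for real use you would call `swap_file` with your own statically defined file names.

```python
import os
import tempfile


def swap_columns(src, dst):
    """Copy lines from src to dst with the first two tab-separated fields swapped."""
    for line in src:
        # Split off the line ending so it is not mistaken for part of a field.
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        parts = body.split("\t", 2)  # first field, second field, everything else
        if len(parts) >= 2:
            parts[0], parts[1] = parts[1], parts[0]
        dst.write("\t".join(parts) + ending)


def swap_file(fn_in, fn_out, encoding="utf-8"):
    """Stream fn_in to fn_out line by line, so memory use stays small."""
    with open(fn_in, encoding=encoding, newline="") as fp_in, \
            open(fn_out, "w", encoding=encoding, newline="") as fp_out:
        swap_columns(fp_in, fp_out)


# Demo on a temporary file (hypothetical sample data).
with tempfile.TemporaryDirectory() as tmp:
    fn_in = os.path.join(tmp, "input.tsv")
    fn_out = os.path.join(tmp, "output.tsv")
    with open(fn_in, "w", encoding="utf-8", newline="") as fp:
        fp.write("a\tb\tc\td\nonly_one_field\n")
    swap_file(fn_in, fn_out)
    with open(fn_out, encoding="utf-8", newline="") as fp:
        result = fp.read()

print(repr(result))  # prints 'b\ta\tc\td\nonly_one_field\n'
```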


名媛妹妹

2020-12-01 01:45
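Have you tried using the cut command? Note that cut always writes the selected fields in their original order, no matter how you order the list, so it cannot swap columns on its own. Combined with paste it can (this uses bash process substitution and assumes tab-separated input):
```
paste <(cut -f2 myhugefile) <(cut -f1 myhugefile) <(cut -f3- myhugefile) > myrearrangedhugefile
```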
孤城傲影

2020-12-01 01:45
This is also easy in perl:
```
perl -pe 's/^(\S+)\t(\S+)/$2\t$1/;' file > outputfile
```

别那么骄傲

2020-12-01 01:50

No need to call anything else but your shell:

bash> while read -r col1 col2 rest; do
        echo "$col2 $col1 $rest"
      done <input_file

Test:

bash> echo "first second a b c d e f g" |
      while read -r col1 col2 rest; do
        echo "$col2 $col1 $rest"
      done
second first a b c d e f g
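
One caveat with the loop above: by default `read` splits on any run of whitespace, so a field that itself contains a space is broken apart. Here is a small Python sketch of the difference, using made-up sample data. `str.split` with no separator is only a rough stand-in for the shell's word splitting, while splitting on `"\t"` keeps tab-separated fields intact.

```python
line = "New York\tUSA\t8.4\n"

# Whitespace splitting (roughly what a plain `read col1 col2 rest` does):
# the space inside "New York" is treated as a field boundary.
by_whitespace = line.split(None, 2)

# Tab splitting keeps "New York" together as the first field.
by_tab = line.rstrip("\n").split("\t", 2)

print(by_whitespace[:2])  # prints ['New', 'York']
print(by_tab)             # prints ['New York', 'USA', '8.4']
```

If your columns can contain spaces, a tab-aware approach is the safer choice.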


傲寒

2020-12-01 01:56
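Try this, which is closer to what you asked for. It swaps the first two tab-separated columns and keeps all the others:
```
awk 'BEGIN { FS = OFS = "\t" } { tmp = $1; $1 = $2; $2 = tmp; print }' inputfile
```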
