Extracting columns from text file with different delimiters in Linux

前端未结


I have very large genotype files that are basically impossible to open in R, so I am trying to extract the rows and columns of interest using linux command line. Rows are st


2 Answers

渐次进展

2020-12-05 02:55
You can use cut with a delimiter like this:

With a space as the delimiter:
```
cut -d " " -f1-100,1000-1005 infile.csv > outfile.csv
```
With a tab as the delimiter:
```
cut -d$'\t' -f1-100,1000-1005 infile.csv > outfile.csv
```
The -f option accepts a comma-separated list of ranges, so a single call can extract several intervals of columns (here fields 1-100 and 1000-1005).
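To see how a list of ranges behaves before running it on a huge file, here is a minimal sketch on a throwaway sample. The file names `sample.txt` and `subset.txt` and the column labels are made up for illustration only.

```shell
# Build a tiny space-delimited sample: one row with fields c1 .. c12.
# (sample.txt and subset.txt are hypothetical names for this sketch.)
printf 'c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11 c12\n' > sample.txt

# Keep fields 1-3 and 10-11. cut emits the selected fields in file order,
# joined by the same delimiter that was given with -d.
cut -d ' ' -f1-3,10-11 sample.txt > subset.txt

cat subset.txt
```

Note that cut always outputs fields in the order they appear in the input, so writing `-f10-11,1-3` would give the same result.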

Hope it helps!
情深已故

2020-12-05 02:59
If the command should work with both tabs and spaces as the delimiter, I would use awk, which by default splits fields on any run of spaces or tabs:
```
awk '{print $100,$101,$102,$103,$104,$105}' myfile > outfile
```
As long as you only need a handful of fields (six here), it is in my opinion fine to just type them out. For longer ranges you can use a for loop:
```
awk '{for(i=100;i<=105;i++) printf "%s%s", $i, (i<105 ? OFS : ORS)}' myfile > outfile
```
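As a small, hedged illustration of the loop approach, the sketch below passes the range bounds in as awk variables and runs on a throwaway file with mixed tabs and spaces. The names `first`, `last`, `mixed.txt` and `fields.txt` are assumptions for this example, not part of the original answer.

```shell
# Sample rows whose fields are separated by a mix of single spaces,
# tabs, and double spaces (mixed.txt is a hypothetical file name).
printf 'a b\tc  d e\n1 2\t3  4 5\n' > mixed.txt

# first/last are illustrative variable names set with -v.
# Fields are printed on one line per input row, separated by OFS
# (a single space by default) and terminated by ORS (a newline).
awk -v first=2 -v last=4 '{
    for (i = first; i <= last; i++)
        printf "%s%s", $i, (i < last ? OFS : ORS)
}' mixed.txt > fields.txt

cat fields.txt
```

Because awk's default field splitting collapses runs of whitespace, the output is separated by single spaces regardless of how the input was delimited.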
If you want to use cut, you need to use the -f option:
```
cut -f100-105 myfile > outfile
```
cut splits on TAB by default. If the field delimiter is anything else, you need to specify it using -d:
```
cut -d' ' -f100-105 myfile > outfile
```
Check the man page for more info on the cut command.
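One difference from awk worth checking on your own data: cut treats every single delimiter character as a field boundary, so repeated spaces produce empty fields. The sketch below shows this on a made-up one-line file and one possible workaround, squeezing repeated spaces with `tr -s` first. The file names are hypothetical.

```shell
# One row with two spaces between "a" and "b" (spaced.txt is a made-up name).
printf 'a  b c\n' > spaced.txt

# With cut, the second field is the empty string between the two spaces.
cut -d' ' -f2 spaced.txt > raw.txt

# Squeezing runs of spaces into one space first makes field 2 the value "b".
tr -s ' ' < spaced.txt | cut -d' ' -f2 > squeezed.txt

cat raw.txt
cat squeezed.txt
```

Whether squeezing is appropriate depends on the file: if empty fields are meaningful in your data, collapsing delimiters would shift the columns.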