Question
I have a matrix (5800 rows and 350 columns) of numbers. Each cell is one of:
0 / 0
1 / 1
2 / 2
What is the fastest way to remove all spaces in each cell, to have:
0/0
1/1
2/2
Sed, R, anything that will do it fastest.
Answer 1:
If you are going for efficiency, you should probably use coreutils tr for such a simple task:
tr -d ' ' < infile
I compared the posted answers against a 300K file, using GNU awk, GNU sed, perl v5.14.2 and GNU coreutils v8.13. Each test was run 30 times; these are the averages:
awk - 1.52s user 0.01s system 99% cpu 1.529 total
sed - 0.89s user 0.00s system 99% cpu 0.900 total
perl - 0.59s user 0.00s system 98% cpu 0.600 total
tr - 0.02s user 0.00s system 90% cpu 0.020 total
All tests were run as above (cmd < infile) and with the output directed to /dev/null.
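To try the tr approach on a small sample first, something like the sketch below should do. It assumes the columns are tab-separated, which the question does not actually state; if the cells were separated by spaces, tr -d ' ' would delete those separators as well. The file names come from mktemp and are only placeholders.

```shell
# Build a small two-row sample. Columns are ASSUMED to be tab-separated;
# the question does not say which delimiter the matrix uses.
sample=$(mktemp)
cleaned=$(mktemp)
printf '0 / 0\t1 / 1\t2 / 2\n2 / 2\t0 / 0\t1 / 1\n' > "$sample"

# Delete every space character; tabs and newlines are left untouched.
tr -d ' ' < "$sample" > "$cleaned"

# Show the cleaned matrix.
cat "$cleaned"
```

Because tr only deletes the single character it is given, the tab delimiters survive and the matrix keeps its shape.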
Answer 2:
Using sed:
sed "s/ \/ /\//g" input.txt
It means: replace the string " / " (/ \/ /) with a single slash (/\/) and do it globally (/g).
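The same substitution can be written with a different delimiter so the slash does not need escaping. Here is a minimal sketch on a made-up line whose cells are separated by single spaces (that layout is an assumption for illustration, not something the question states):

```shell
# Hypothetical one-line sample whose cells are separated by single spaces.
line='0 / 0 1 / 1 2 / 2'

# Same substitution as the command above, using | as the s-command delimiter
# so the literal slash needs no backslash.
result=$(printf '%s\n' "$line" | sed 's| / |/|g')

printf '%s\n' "$result"
```

Since only the exact sequence space-slash-space is replaced, the spaces between cells are kept, which matters if the columns are space-delimited.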
Answer 3:
Here's an awk alternative that does exactly the same thing:
awk '{gsub(" ",""); print}' input.txt > output.txt
Explanations:
awk '{...}': invoke awk, then for each line do the stuff enclosed by braces.
gsub(" ","");: replace all space chars (single or multiple in a row) with the empty string.
print: print the entire line.
input.txt: specifies your input file as an argument to awk.
> output.txt: redirect output to a file.
Answer 4:
A perl solution could look like this:
perl -pwe 'tr/ //d' input.txt > output.txt
You can add the -i switch to edit the file in place.
Source: https://stackoverflow.com/questions/13634494/remove-spaces-from-cells-in-matrix