Question
I have a file (filename.txt) with the following structure:
>line1
ABC
>line2
DEF
>line3
GHI
>line4
JKL
I would like to shuffle the characters in the strings that do not start with >. The output would (for example) look like the following:
>line1
BCA
>line2
DFE
>line3
IHG
>line4
KLJ
This is what I tried in order to shuffle the characters in a string: sed 's/./&\n/' | shuf | tr -d "\n". It looks like it works, but it does not take newlines into account. Moreover, it runs on all of the data, not only on the lines that do not start with >.
Answer 1:
With perl and ruby:
$ # split// to get individual characters
$ # join "" to join characters with empty string
$ # if !/^>/ to apply the change only for lines not starting with >
$ # alternate: perl -MList::Util=shuffle -lne 'print /^>/ ? $_ : shuffle split//'
$ perl -MList::Util=shuffle -lpe '$_=join "", shuffle split// if !/^>/' ip.txt
>line1
CBA
>line2
FED
>line3
IHG
>line4
JKL
$ # $_.chars to get individual characters
$ # * "" to join array elements with empty string
$ ruby -lpe '$_ = $_.chars.shuffle * "" if !/^>/' ip.txt
>line1
BAC
>line2
EDF
>line3
GHI
>line4
JKL
Answer 2:
awk + coreutils approach:
awk '/^[^>]/{ system("echo "$1"| fold -w1 | shuf | tr -d \047\n\047"); print ""; next }1' file
Sample output:
>line1
BAC
>line2
EDF
>line3
HGI
>line4
KLJ
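The same fold/shuf/tr pipeline can also be driven from a plain shell loop, which keeps the line's contents out of any generated command string. This is only a minimal sketch, assuming GNU coreutils (fold, shuf) and single-byte characters; the function name shuffle_lines is made up for illustration and is not part of the answer above.

```shell
# Hypothetical helper: shuffle the characters of every line
# that does not start with '>'. Reads stdin, writes stdout.
shuffle_lines() {
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            '>'*) printf '%s\n' "$line" ;;   # header line: print unchanged
            '')   printf '\n' ;;             # empty line: nothing to shuffle
            *)    # one character per line, shuffle, then rejoin
                  printf '%s\n' "$line" | fold -w1 | shuf | tr -d '\n'
                  printf '\n' ;;             # restore the newline removed by tr
        esac
    done
}

# Usage (assumed file name from the question):
#   shuffle_lines < filename.txt
```

Passing the line to printf as a quoted argument means characters such as spaces, quotes or $ in the data are never interpreted by a shell.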
Answer 3:
For GNU sed:
$ cat filename.txt
>line1
ABC
>line2
DEF
>line3
GHI
>line4
JKL
$ sed -r "/^[^>]/s/.*/grep -o . <<< & |sort -R |tr -d '\n'/e" filename.txt
>line1
ABC
>line2
FDE
>line3
HGI
>line4
LKJ
$ sed -r "/^[^>]/s/.*/grep -o . <<< & |shuf |tr -d '\n'/e" filename.txt
>line1
BCA
>line2
FDE
>line3
HIG
>line4
JKL
Edit: sed itself behaves the same on every system running (GNU sed) 4.2.2. To see the raw command string that sed generates, remove the e modifier:
sed -r '/^[^>]/s/.*/grep -o . <<< & |shuf |tr -d "\n"/' filename.txt
>line1
grep -o . <<< ABC |shuf |tr -d "
"
>line2
grep -o . <<< DEF |shuf |tr -d "
"
>line3
grep -o . <<< GHI |shuf |tr -d "
"
>line4
grep -o . <<< JKL |shuf |tr -d "
"
Then the e modifier of sed's s command calls sh to execute it. On CentOS, sh is a symbolic link to bash, but on Ubuntu it is a symbolic link to dash, and dash does not support <<< (here-strings).
# on Ubuntu, enter into sh terminal:
$ grep -o . <<< JKL |shuf |tr -d '\n'
sh: 2: Syntax error: redirection unexpected
$ echo JKL |grep -o . |shuf |tr -d '\n'
KLJ
So I need to modify my answer to work with both bash and dash:
$ sed -r '/^[^>]/s/.*/echo -n & |grep -o . |shuf |tr -d "\n"/e' filename.txt
>line1
ACB
>line2
DFE
>line3
IHG
>line4
LJK
Simple explanations:
- /^[^>]/ : restricts sed to the lines that start (^) with a character other than > ([^>]).
- s/.*/echo -n & |grep -o . |shuf |tr -d "\n"/ : .* matches the whole line and & refers to that match in the replacement, so the substitution generates the plain command string echo -n ORIGINAL_LINE |grep -o . |shuf |tr -d "\n", which shuffles one line.
- Finally, the e modifier of the s command executes the command string generated above.
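Because the generated command is run by sh, it can help to try the per-line pipeline under sh directly before wiring it into sed. The following is a minimal sketch, assuming GNU grep and shuf are installed; shuffle_chars is a hypothetical name, and printf '%s' is swapped in for echo -n because POSIX leaves the behavior of echo -n implementation-defined.

```shell
# Hypothetical helper: shuffle the characters of one string using plain sh.
# The string is passed as a positional argument ($1 inside sh -c), so it is
# not pasted into the command text.
shuffle_chars() {
    sh -c 'printf "%s" "$1" | grep -o . | shuf | tr -d "\n"' sh "$1"
    printf '\n'   # add back a trailing newline
}

# Usage:
#   shuffle_chars JKL
```

The pipeline uses no here-string, so it should behave the same whether sh is bash or dash.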
Answer 4:
Here is one in GNU awk:
$ awk -v seed=$RANDOM '                      # get some randomness from the shell
function cmp_randomize(i1, v1, i2, v2) {     # random comparison for array traversal,
    return (2 - 4 * rand())                  # from 12.2.1 Controlling Array Traversal
}                                            # of the GNU awk docs
BEGIN {
    srand(seed)                              # use the seed, Luke
    PROCINFO["sorted_in"]="cmp_randomize"    # use the function defined above
}
/^[^>]/ {                                    # if the line starts with anything but >
    split($0,a,"")                           # split into array a, one character per element
    for(i in a)                              # iterate over a in random order
        printf "%s", a[i]                    # output
    print ""                                 # newline
    next                                     # next record
}1' file                                     # output the records starting with >
>line1
CAB
>line2
DFE
>line3
GIH
>line4
LKJ
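A sort driven by a random comparison function is not guaranteed to produce every permutation with equal probability. If that matters, an explicit Fisher-Yates shuffle is one alternative. The sketch below is not the answer's method: shuffle_awk is a hypothetical name, the seed argument is an assumption added for repeatability, $RANDOM assumes bash, and split with an empty separator assumes an awk that splits into single characters (gawk does).

```shell
# Hypothetical helper: Fisher-Yates shuffle of each non-header line in awk.
# Optional first argument is the seed; defaults to bash's $RANDOM.
shuffle_awk() {
    awk -v seed="${1:-$RANDOM}" '
    BEGIN { srand(seed) }
    /^>/ { print; next }                  # header lines pass through unchanged
    {
        n = split($0, a, "")              # one character per array element
        for (i = n; i > 1; i--) {         # walk from the last element down
            j = int(rand() * i) + 1       # random index in 1..i
            t = a[i]; a[i] = a[j]; a[j] = t
        }
        s = ""
        for (i = 1; i <= n; i++) s = s a[i]
        print s
    }'
}

# Usage (assumed file name from the question):
#   shuffle_awk < filename.txt
```

Unlike the PROCINFO["sorted_in"] approach, this does not rely on a gawk-only feature for the ordering itself.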
Answer 5:
This might work for you (GNU sed):
sed '/^>/b;s/./&\n/g;s/.$//;s/.*/echo "&"|shuf/e;s/\n//g' file
Lines beginning with > are printed without further processing. Otherwise, a newline is inserted after each character of the current line and the last, unwanted newline is removed. The result is then echoed and piped through the shuf command (sort -R may be substituted if necessary). Finally, the newlines in the shuffled output are deleted to rejoin the characters, and the line is printed.
N.B. This solution uses the GNU-specific e flag on the substitution command. Alternatively, the generated commands can be passed to a shell, like so:
sed '/^>/s/.*/echo "&"/;t;s/./&\n/g;s/.$//;s/.*/echo "&"|shuf|tr -d "\\n";echo/' file | sh
Source: https://stackoverflow.com/questions/49555930/bash-shuffle-characters-in-strings-from-file