BASH - Shuffle characters in strings from file

Submitted by 只谈情不闲聊 on 2020-01-04 04:11:11

Question


I have a file (filename.txt) with the following structure:

>line1
ABC
>line2
DEF
>line3
GHI
>line4
JKL

I would like to shuffle the characters in the strings that do not start with >. The output would (for example) look like the following:

>line1
BCA
>line2
DFE
>line3
IHG
>line4
KLJ

This is what I tried to shuffle the characters in a string: sed 's/./&\n/' | shuf | tr -d "\n" . It appears to work, but it does not preserve the newlines between lines. Moreover, it runs on all of the data, not only on the lines that do not start with >.
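
Both problems can be reproduced on a small sample. The sketch below wraps the attempted pipeline in a function with the made-up name naive_shuffle and adds the g flag, since without it sed splits off only the first character of each line. It assumes GNU sed (for \n in the replacement) and shuf from GNU coreutils.

```shell
# naive_shuffle is a hypothetical name wrapping the attempted pipeline.
# Assumes GNU sed (for \n in the replacement) and shuf from GNU coreutils.
naive_shuffle() {
    # g added: without it only the first character of each line is split off
    sed 's/./&\n/g' | shuf | tr -d '\n'
}

# Two records in, one long line out: header and sequence characters are
# mixed together, and tr has deleted every newline, including the ones
# that separated the records.
printf '>line1\nABC\n>line2\nDEF\n' | naive_shuffle
echo    # add back a trailing newline for readability
```

The output is a single 18-character line in random order, which shows why the shuffle has to be applied per line and only to lines that do not start with >.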


Answer 1:


With perl and ruby

$ # split// to get individual characters
$ # join "" to join characters with empty string
$ # if !/^>/ to apply the change only for lines not starting with >
$ # alternate: perl -MList::Util=shuffle -lne 'print /^>/ ? $_ : shuffle split//'
$ perl -MList::Util=shuffle -lpe '$_=join "", shuffle split// if !/^>/' ip.txt 
>line1
CBA
>line2
FED
>line3
IHG
>line4
JKL

$ # $_.chars to get individual characters
$ # * "" to join array elements with empty string
$ ruby -lpe '$_ = $_.chars.shuffle * "" if !/^>/' ip.txt 
>line1
BAC
>line2
EDF
>line3
GHI
>line4
JKL



Answer 2:


awk + coreutils approach:

awk '/^[^>]/{ system("echo "$1"| fold -w1 | shuf | tr -d \047\n\047"); print ""; next }1' file

Sample output:

>line1
BAC
>line2
EDF
>line3
HGI
>line4
KLJ
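
The command above splices $1 directly into a shell command line, so a line containing whitespace or shell metacharacters would be truncated or interpreted by the shell. One possible way around that, sketched here on the assumption that fold, shuf, and tr are available, is to send the line to the pipeline through a pipe instead. The function name shuffle_awk is made up for illustration.

```shell
# shuffle_awk is a hypothetical wrapper name; it reads the named files or stdin.
shuffle_awk() {
    awk '
    /^>/ { print; next }                         # header lines pass through
    {
        cmd = "fold -w1 | shuf | tr -d \"\\n\""  # the line never becomes part of this text
        fflush()                                 # emit buffered headers before the child writes
        print $0 | cmd                           # hand the line to the pipeline on stdin
        close(cmd)                               # wait for the pipeline to finish
        print ""                                 # tr deleted every newline, so add one back
    }' "$@"
}

# Usage: shuffle_awk filename.txt
```

The fflush() call matters when the output is redirected: awk buffers its own output, while the child pipeline writes directly, so the headers could otherwise appear out of order.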



Answer 3:


For GNU sed:

$ cat filename.txt
>line1
ABC
>line2
DEF
>line3
GHI
>line4
JKL
$ sed -r "/^[^>]/s/.*/grep -o . <<< & |sort -R |tr -d '\n'/e" filename.txt
>line1
ABC
>line2
FDE
>line3
HGI
>line4
LKJ
$ sed -r "/^[^>]/s/.*/grep -o . <<< & |shuf |tr -d '\n'/e" filename.txt
>line1
BCA
>line2
FDE
>line3
HIG
>line4
JKL

Edit: sed itself behaves the same on every system (GNU sed 4.2.2 here). To see the raw command string that sed generates, remove the e modifier:

$ sed -r '/^[^>]/s/.*/grep -o . <<< & |shuf |tr -d "\n"/' filename.txt
>line1
grep -o . <<< ABC |shuf |tr -d "
"
>line2
grep -o . <<< DEF |shuf |tr -d "
"
>line3
grep -o . <<< GHI |shuf |tr -d "
"
>line4
grep -o . <<< JKL |shuf |tr -d "
"

The e modifier of sed's s command then calls sh to execute that string. On CentOS, sh is a symbolic link to bash, but on Ubuntu it is a symbolic link to dash, and dash does not support <<< (here-strings).

# On Ubuntu, after starting sh:
$ grep -o . <<< JKL |shuf |tr -d '\n'
sh: 2: Syntax error: redirection unexpected
$ echo JKL |grep -o . |shuf |tr -d '\n'
KLJ
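
Whether a given machine is affected can be checked without leaving the current shell. The sketch below is one way to do it, with the assumed helper names sh_has_herestring and portable_shuffle: the probe runs in a child sh so a syntax error only fails the child, and the shuffle uses printf instead of a here-string.

```shell
# Both function names are hypothetical helpers, not standard commands.

# Succeeds only if the system sh can parse a here-string.
sh_has_herestring() {
    sh -c 'cat <<< probe' >/dev/null 2>&1
}

# Shuffle the characters of $1 using only syntax that bash and dash share.
# Assumes shuf from GNU coreutils is installed.
portable_shuffle() {
    sh -c 'printf "%s" "$1" | grep -o . | shuf | tr -d "\n"; echo' sh "$1"
}

if sh_has_herestring; then
    echo "sh supports here-strings"
else
    echo "sh lacks here-strings; use a pipe instead"
fi

portable_shuffle JKL
```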

So I modified my answer to work with both bash and dash:

$ sed -r '/^[^>]/s/.*/echo -n & |grep -o . |shuf |tr -d "\n"/e' filename.txt
>line1
ACB
>line2
DFE
>line3
IHG
>line4
LJK

Simple explanations:

  1. /^[^>]/: restricts sed to lines whose first character (^) is not a > ([^>]).
  2. s/.*/echo -n & |grep -o . |shuf |tr -d "\n"/: .* matches the whole line, and & in the replacement stands for that matched text. The result is a plain command string, echo -n ORIGINAL_LINE |grep -o . |shuf |tr -d "\n", which shuffles the characters of the line.
  3. Finally, the e modifier of the s command executes the generated command string and replaces the line with its output.



Answer 4:


Here is one in GNU awk:

$ awk -v seed=$RANDOM '                   # get some randomness from shell
function cmp_randomize(i1, v1, i2, v2) {  # random for traversal function
    return (2 - 4 * rand())               # from 12.2.1 Controlling Array Traversal
}                                         # of Gnu awk docs
BEGIN {
    srand(seed)                           # use the seed, Luke
    PROCINFO["sorted_in"]="cmp_randomize" # use above defined function
}
/^[^>]/ {                                 # if starts with anything but >
    split($0,a,"")                        # split to hash a
    for(i in a)                           # iterate a in random order
        printf "%s", a[i]                 # output
    print ""                              # newline
    next                                  # next record
}1' file                                  # output > starting records
>line1
CAB
>line2
DFE
>line3
GIH
>line4
LKJ
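
Sorting with a random comparison function may not produce every ordering with equal probability. Where that matters, a Fisher-Yates shuffle is one alternative. The sketch below is not the answer's method: it uses substr so that it does not depend on GNU awk, the function name fisher_yates_shuffle is made up, and the seed falls back to the shell's process ID where $RANDOM is unavailable.

```shell
# fisher_yates_shuffle is a hypothetical name; it reads the named files or stdin.
fisher_yates_shuffle() {
    awk -v seed="${RANDOM:-$$}" '
    BEGIN { srand(seed) }                  # seed from the shell, as in the answer above
    /^>/ { print; next }                   # header lines pass through unchanged
    {
        n = length($0)
        for (i = 1; i <= n; i++)           # copy the characters into c[1..n]
            c[i] = substr($0, i, 1)
        for (i = n; i > 1; i--) {          # swap c[i] with a random element of c[1..i]
            j = int(rand() * i) + 1
            t = c[i]; c[i] = c[j]; c[j] = t
        }
        out = ""
        for (i = 1; i <= n; i++)           # join the shuffled characters
            out = out c[i]
        print out
    }' "$@"
}

# Usage: fisher_yates_shuffle filename.txt
```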



Answer 5:


This might work for you (GNU sed):

sed '/^>/b;s/./&\n/g;s/.$//;s/.*/echo "&"|shuf|tr -d "\n"/e' file

Lines beginning with > are printed without being processed. For any other line, a newline is inserted after each character and the last, unwanted newline is removed. The result is then echoed and piped through the shuf command (sort -R may be substituted if necessary), tr -d "\n" joins the shuffled characters back into a single line, and the result is printed.

N.B. This solution uses the GNU-specific e flag on the substitution command. Alternatively, the generated commands can be passed to a shell, like so:

sed '/^>/s/.*/echo "&"/;t;s/./&\n/g;s/.$//;s/.*/echo "&"|shuf|tr -d "\n";echo/' file | sh


Source: https://stackoverflow.com/questions/49555930/bash-shuffle-characters-in-strings-from-file
