How can I randomize the lines in a file using standard tools on Red Hat Linux?

醉酒成梦 2020-11-28 05:25

How can I randomize the lines in a file using standard tools on Red Hat Linux?

I don't have the shuf command, so I am looking for something like a

11 answers
  • 2020-11-28 05:27

    A one-liner for Python (sys.stdout.write is used so that it runs under both Python 2 and Python 3):

    python -c "import random, sys; lines = open(sys.argv[1]).readlines(); random.shuffle(lines); sys.stdout.write(''.join(lines))" myFile
    

    And for printing just a single random line:

    python -c "import random, sys; sys.stdout.write(random.choice(open(sys.argv[1]).readlines()))" myFile
    

    But see this post for the drawbacks of Python's random.shuffle(). With more than 2080 elements there are more possible orderings than the period of the underlying generator, so some permutations can never be produced.
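
    If that limitation matters, one possible workaround is to draw randomness from the operating system instead of the default Mersenne Twister generator. The following is only a sketch, assuming Python 3; the names shuffle_lines and max_len_within_period are made up for illustration, and random.SystemRandom is my substitution, not something the linked post prescribes. The second function shows where the 2080 figure comes from, by comparing n! with the generator's period of 2**19937 - 1.

```python
import random


def shuffle_lines(lines):
    """Return a shuffled copy of lines using OS-provided randomness."""
    shuffled = list(lines)
    # SystemRandom draws from os.urandom(), so it is not bound by the
    # period of the default Mersenne Twister generator.
    random.SystemRandom().shuffle(shuffled)
    return shuffled


def max_len_within_period(period_bits=19937):
    """Largest n such that n! still fits below 2**period_bits."""
    n, factorial = 1, 1
    # Keep growing n while (n + 1)! is still smaller than the period bound.
    while factorial * (n + 1) < 2 ** period_bits:
        n += 1
        factorial *= n
    return n
```

    Usage could look like sys.stdout.writelines(shuffle_lines(open("myFile").readlines())). SystemRandom is slower than the default generator, so this trades speed for coverage of all orderings.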

  • 2020-11-28 05:28

    On OS X, grabbing the latest release from http://ftp.gnu.org/gnu/coreutils/ and running something like

    ./configure && make && sudo make install

    ...should give you /usr/local/bin/sort --random-sort without messing up /usr/bin/sort.

  • 2020-11-28 05:32

    shuf is the best way.

    sort -R is painfully slow. I just tried it on a 5GB file and gave up after 2.5 hours. Then shuf shuffled the same file in a minute.

  • 2020-11-28 05:32
    cat yourfile.txt | while IFS= read -r f; do printf "%05d %s\n" "$RANDOM" "$f"; done | sort -n | cut -c7-
    

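    Read the file, prefix every line with a random number, sort the file on those random prefixes, and cut the prefixes off afterwards. It is a one-liner that should work in any shell that provides $RANDOM (bash, ksh, zsh). Note that $RANDOM only yields values from 0 to 32767, so in large files many lines share a prefix, and sort then orders those lines by their content.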

    EDIT: incorporated Richard Hansen's remarks.
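
    The same decorate, sort, undecorate idea can be sketched outside the shell. This is only an illustration, assuming Python 3; the function name shuffle_by_random_keys is hypothetical, and it uses float keys instead of $RANDOM so that ties are far less likely:

```python
import random


def shuffle_by_random_keys(lines, rng=random):
    """Shuffle lines by sorting them on randomly drawn keys."""
    # Decorate: pair every line with a random float key. Floats make key
    # collisions far less likely than the 32768 values of $RANDOM.
    decorated = [(rng.random(), index, line)
                 for index, line in enumerate(lines)]
    # Sort on the key; the index only breaks the (unlikely) ties, so the
    # line text itself never influences the order.
    decorated.sort()
    # Undecorate: drop the keys again, like `cut -c7-` in the pipeline.
    return [line for _key, _index, line in decorated]
```

    Sorting costs O(n log n), which is more work than a direct shuffle, but it mirrors what the pipeline does step by step.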

  • 2020-11-28 05:33

    Mac OS X with DarwinPorts:

    sudo port install unsort
    cat "$file" | unsort | ...
    
  • 2020-11-28 05:37

    And a Perl one-liner you get!

    perl -MList::Util -e 'print List::Util::shuffle <>'
    

    It uses a module, but the module is part of the Perl core distribution. If that's not good enough, you may consider rolling your own.
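
    For reference, rolling your own usually means a Fisher-Yates shuffle. Below is a minimal sketch, written in Python 3 rather than Perl; the function name is made up, and I am not claiming this is how List::Util implements it:

```python
import random


def fisher_yates_shuffle(items, rng=random):
    """Return a shuffled copy of items using the Fisher-Yates algorithm."""
    result = list(items)
    # Walk from the last position down to the second one, swapping each
    # element with a randomly chosen element at or before it.
    for i in range(len(result) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        result[i], result[j] = result[j], result[i]
    return result
```

    Each position is swapped exactly once, so the whole shuffle takes linear time.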

    I tried using this with the -i flag ("edit-in-place") to have it edit the file. The documentation suggests it should work, but it doesn't. It still displays the shuffled file to stdout, but this time it deletes the original. I suggest you don't use it.

    Consider a shell script:

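    #!/bin/sh
    
    # Shuffle the lines of each file given on the command line, in place.
    if [ "$#" -eq 0 ]
    then
      echo "Usage: $0 [file ...]" >&2
      exit 1
    fi
    
    for i in "$@"
    do
      perl -MList::Util -e 'print List::Util::shuffle <>' "$i" > "$i.new"
      # Compare byte counts to check that nothing was lost. Reading from
      # stdin keeps wc from printing the file name next to the count.
      if [ "$(wc -c < "$i")" -eq "$(wc -c < "$i.new")" ]
      then
        mv "$i.new" "$i"
      else
        echo "Error for file $i!" >&2
      fi
    done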
    

    Untested, but hopefully works.
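
    The same write-to-a-new-file, check, then rename approach can also be sketched in Python 3. This is an illustration under my own assumptions, not part of the script above: the function name shuffle_file_in_place is hypothetical, and the temporary file is created with tempfile.mkstemp, so the original file's permissions are not carried over.

```python
import os
import random
import tempfile


def shuffle_file_in_place(path, rng=random):
    """Shuffle the lines of the file at path, replacing it only on success."""
    with open(path, "rb") as source:
        lines = source.readlines()
    rng.shuffle(lines)

    # Write to a temporary file in the same directory so that the final
    # rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as target:
            target.writelines(lines)
        # Same sanity check as the shell script: byte counts must match.
        if os.path.getsize(tmp_path) != os.path.getsize(path):
            raise IOError("size mismatch while shuffling %s" % path)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

    The original is only replaced after the shuffled copy has been written and checked, so a failure part-way through leaves the input untouched.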
