for loop in bash simply prints n times the command instead of reiterating

孤街浪徒 2021-01-26 12:50

I have an input.txt file with over 6000 lines.

If a line has over 10 words, then I want it to be split, not at the 10th word but where the first comma character appe

2 Answers
  • 2021-01-26 12:57

    I think I see what you're after. There are a few problems with your approach:

    • awk doesn't edit files in place. Your sub() changes the record in memory and the 1 pattern prints it to stdout, but your input file never changes.
    • sub() doesn't insert a new record into the input stream that awk is processing. It merely adds a newline inside the current record, so the text after that newline is not examined again during the same run.
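
    Both points can be checked with a small experiment. This is only a sketch: the sample line and the temporary file are made up for the demonstration and are not part of the original question.

```shell
#!/usr/bin/env bash
# Hypothetical one-line sample file, created only for this demonstration.
demo=$(mktemp)
printf 'one two, three four\n' > "$demo"

# sub() changes the record awk holds in memory; the "1" pattern prints it.
printed=$(awk '{ sub(/, /, "," ORS) } 1' "$demo")

# The file on disk still holds the original text.
on_disk=$(cat "$demo")

# awk read a single record, even though its output spans two lines.
records=$(awk '{ sub(/, /, "," ORS) } END { print NR }' "$demo")

printf 'stdout:\n%s\n' "$printed"
printf 'file on disk: %s\n' "$on_disk"
printf 'records read: %s\n' "$records"

rm -f "$demo"
```

    The second half of a split line only becomes a record of its own when the output is read again, which is why a single awk pass is not enough.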

    Given these, you could get away with processing the input multiple times, as you've suggested. But rather than arbitrarily assuming that you'll have a maximum of seven 10-word phrases on a line, it might be better to actually detect whether you need to continue. Something like this:

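    #!/usr/bin/env bash
    
    input=input.txt
    # Temporary file next to the input; removed when the script exits.
    temp=$(mktemp "${input}.XXXX")
    trap 'rm -f "$temp"' EXIT
    
    # Repeat until a pass makes no substitution (awk then exits non-zero).
    while awk '
      BEGIN { retval=1 }
      # At least 10 fields and still a ", " on the line: split at the first one.
      NF >= 10 && /, / {
        sub(/, /, ","ORS)
        retval=0
      }
      1
      END { exit retval }
    ' "$input" > "$temp"; do
      mv -v "$temp" "$input"
    done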
    

    This uses an exit value from awk to determine whether we need to run another iteration of the bash loop. If awk detects that no substitutions were required, then the loop stops.
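
    To see the exit value in isolation, here is a minimal sketch that runs the same BEGIN/END logic on two made-up inputs and records awk's exit status. The variable names prog, with_comma and without_comma are only illustrative.

```shell
#!/usr/bin/env bash
# awk program: exit 0 if any line contained ", ", otherwise exit 1.
prog='BEGIN { retval = 1 } /, / { retval = 0 } END { exit retval }'

# Input with a ", ": awk should report success (0).
with_comma=0
printf 'a, b\n' | awk "$prog" || with_comma=$?

# Input without a ", ": awk should report failure (1).
without_comma=0
printf 'a b\n' | awk "$prog" || without_comma=$?

echo "with comma: $with_comma, without comma: $without_comma"
```

    A bash while loop keeps iterating only as long as its condition command exits with 0, so the first pass that finds nothing to substitute ends the loop.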

  • 2021-01-26 13:03

    OK, so here is how I solved this problem. It's ugly, but it works. Plus I can keep piping more sed commands to add more conditions (like my comment above @ghoti).

    sed -r '/((\w)+[., ]+){10}/s/\./\.\n/' input.txt |
      sed -r '/((\w)+[., ]+){10}/s/\./\.\n/' |
      sed -r '/((\w)+[., ]+){10}/s/\./\.\n/' |
      sed -r '/((\w)+[., ]+){10}/s/\./\.\n/' |
      sed -r '/((\w)+[., ]+){10}/s/\./\.\n/' |
      sed -r '/((\w)+[., ]+){10}/s/\./\.\n/' |
      sed -r '/((\w)+[., ]+){10}/s/\./\.\n/' |
      tr -s '[:space:]' > output.txt
    

    Basically, I just piped the same sed command 7 times (in the above sample I'm replacing periods instead of commas, but it works all the same). Based on what I read online, I'm surprised this command does not allow some form of recursion or reiteration. If someone knows a way, please feel free to edit.
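
    On the reiteration point: sed's P and D commands can restart the cycle on the remainder of a line without reading new input, which may replace the repeated pipes. The following is a sketch rather than a drop-in replacement. It assumes GNU sed (for \w and for \n in the replacement), it splits at a period followed by a space rather than at any period, and the function name split_long_lines and the sample text are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split long lines at each ". " until the remainder
# no longer matches the 10-word pattern. Assumes GNU sed.
split_long_lines() {
  sed -E '/((\w)+[., ]+){10}/ {
    # Turn the first ". " into ".<newline>".
    s/\. /.\n/
    # Print the part up to that newline...
    P
    # ...then delete it and rerun the script on what is left.
    D
  }' "$@"
}

# Made-up sample: one long line with two sentence breaks, one short line.
result=$(printf '%s\n' \
  'a b c. d e f. g h i j k l m n o p' \
  'short one. stays' | split_long_lines)

printf '%s\n' "$result"
```

    If the pattern space has no newline after the substitution, D simply starts the next cycle, so a long line without a ". " is printed once and left alone.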
