Shell script to find, search and replace array of strings in a file

逝去的感伤 2021-01-22 13:41

This is linked to another question/code golf I asked on Code Golf: "Color highlighting" of repeated text

I've got a file 'sample1.txt' with the following co

2 answers
  • 2021-01-22 13:42

    Straightforward with Perl:

    #! /usr/bin/perl
    
    use warnings;
    use strict;
    
    my @words = qw/
      LoremIpsum
      LoremIpsu
      dummytext
      oremIpsum
      LoremIps
      dummytex
      industry
      oremIpsu
      remIpsum
      ummytext
      LoremIp
      dummyte
      emIpsum
      industr
      mmytext
    /;
    
    my $to_replace = qr/@{[ join "|" =>
                            sort { length $b <=> length $a }
                            @words
                         ]}/;
    
    my $i = 0;
    while (<>) {
      s|($to_replace)|++$i; "<T$i>$1</T$i>"|eg;
      print;
    }
    

    Sample run (wrapped to prevent horizontal scrolling):

    $ ./tag-words sample.txt
    <T1>LoremIpsum</T1>issimply<T2>dummytext</T2>oftheprintingandtypesetting<T3>indus
    try</T3>.<T4>LoremIpsum</T4>hasbeenthe<T5>industry</T5>'sstandard<T6>dummytext</T
    6>eversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatyp
    especimenbook.

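    You may object that all the qr// and @{[ ... ]} business is on the baroque side: qr// compiles the pattern once, and @{[ ... ]} interpolates the result of the join into it. One could get the same effect from a plain string and the /o regular-expression switch, as in: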

    # plain scalar rather than a compiled pattern
    my $to_replace = join "|" =>
                     sort { length $b <=> length $a }
                     @words;
    
    my $i = 0;
    while (<>) {
      # o at the end for "compile (o)nce"
      s|($to_replace)|++$i; "<T$i>$1</T$i>"|ego;
      print;
    }
    
  • 2021-01-22 13:49

    Pure Bash (no externals)

    At the Bash command line:

    $ sample="LoremIpsumissimplydummytextoftheprintingandtypesettingindustry.LoremIpsumhasbeentheindustry'sstandarddummytexteversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook."
    $ # or: sample=$(<sample1.txt)
    $ array=(
    LoremIpsum
    LoremIpsu
    dummytext
    ...
    )
    $ tag=0
    $ for entry in "${array[@]}"; do
    >   # skip entries that already sit inside a tag added earlier
    >   test="<[^>/]*>[^>]*${entry}[^<]*</"
    >   if [[ ! $sample =~ $test ]]; then
    >     ((tag++))
    >     sample=${sample//"$entry"/<T$tag>$entry</T$tag>}
    >   fi
    > done
    $ echo "Output:"; echo "$sample"
    Output:
    <T1>LoremIpsum</T1>issimply<T2>dummytext</T2>oftheprintingandtypesetting<T3>industry</T3>.<T1>LoremIpsum</T1>hasbeenthe<T3>industry</T3>'sstandard<T2>dummytext</T2>eversincethe1500s,whenanunknownprintertookagalleyoftypeandscrambledittomakeatypespecimenbook.
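
The loop above relies on the array being listed longest first, so that a word is tagged before any of its substrings is tried. Below is a minimal sketch of the same idea as a reusable function that sorts the words itself. The name `tag_words` and its argument order are hypothetical and not part of the original answer. It assumes the words are plain alphanumerics (no regex or glob metacharacters). Unlike the one-liner, it also skips words that do not occur in the text, so the tag numbers have no gaps.

```shell
#!/usr/bin/env bash
# tag_words TEXT WORD...  (hypothetical helper, for illustration only)
# Wraps each WORD found in TEXT in <Tn>...</Tn>, one tag number per word.
tag_words() {
  local text=$1; shift
  local -a sorted=()
  local w i placed tag=0 test

  # Insertion sort by length, longest first, so a word comes before its substrings.
  for w in "$@"; do
    placed=0
    for i in "${!sorted[@]}"; do
      if (( ${#w} > ${#sorted[i]} )); then
        sorted=("${sorted[@]:0:i}" "$w" "${sorted[@]:i}")
        placed=1
        break
      fi
    done
    (( placed )) || sorted+=("$w")
  done

  for w in "${sorted[@]}"; do
    # Words absent from the text do not consume a tag number
    # (an addition; the one-liner does not check this).
    [[ $text == *"$w"* ]] || continue
    # Same test as the one-liner: skip the word if an occurrence
    # already sits inside a tag pair added earlier.
    test="<[^>/]*>[^>]*${w}[^<]*</"
    if [[ ! $text =~ $test ]]; then
      tag=$((tag + 1))
      text=${text//"$w"/<T$tag>$w</T$tag>}
    fi
  done
  printf '%s\n' "$text"
}
```

Called as `tag_words "$sample" "${array[@]}"`, it prints the tagged text on standard output, and the words may be passed in any order.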
    