How to use sed to replace only the first occurrence in a file?

别跟我提以往 2020-11-22 04:27

I would like to update a large number of C++ source files with an extra include directive before any existing #includes. For this sort of task, I normally use a small bash s

23 answers
  • 2020-11-22 04:45
    sed '0,/pattern/s/pattern/replacement/' filename
    

    This worked for me.

    Example:

    sed '0,/<Menu>/s/<Menu>/<Menu><Menu>Sub menu<\/Menu>/' try.txt > abc.txt
    

    Editor's note: both work with GNU sed only.
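
Applied to the question's use case, a minimal sketch (GNU sed assumed; the sample file contents and the header name newfile.h are made up for illustration) could put a new include directive ahead of the first existing one:

```shell
# Hypothetical sample input standing in for a C++ source file.
input='// demo.cpp
#include <vector>
#include <string>
int main() { return 0; }'

# GNU sed only: 0,/re/ ends the range at the first line matching re,
# even when that line is line 1.
# The empty regex // reuses ^#include, & is the matched text,
# and \n in the replacement is a newline (GNU extension).
result=$(printf '%s\n' "$input" | sed '0,/^#include/ s//#include "newfile.h"\n&/')
printf '%s\n' "$result"
```

Only the first #include line gains the extra directive in front of it; later #include lines fall outside the range and are left alone.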

  • 2020-11-22 04:45

    I finally got this to work in a Bash script used to insert a unique timestamp in each item in an RSS feed:

            sed "1,/====RSSpermalink====/s/====RSSpermalink====/${nowms}/" \
                production-feed2.xml.tmp2 > production-feed2.xml.tmp.$counter
    

    It changes the first occurrence only.

    ${nowms} is the time in milliseconds set by a Perl script, $counter is a counter used for loop control within the script, \ allows the command to be continued on the next line.

    The file is read in and stdout is redirected to a work file.

    The way I understand it, 1,/====RSSpermalink====/ tells sed when to stop by setting a range limitation, and then s/====RSSpermalink====/${nowms}/ is the familiar sed command to replace the first string with the second.

    In my case I put the command in double quotation marks because I am using it in a Bash script with variables.
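
A small self-contained sketch of the same idea follows. The feed contents and the value of stamp are invented stand-ins for the real feed and ${nowms}. It assumes the value contains no / or & characters, which would need escaping in the replacement, and that the placeholder is not on line 1, since a 1,/re/ range only looks for its end from line 2 onward.

```shell
stamp=1606019220123   # hypothetical value standing in for ${nowms}

# Hypothetical two-item feed; the placeholder first appears on line 2.
feed='<item>
<guid>====RSSpermalink====</guid>
</item>
<item>
<guid>====RSSpermalink====</guid>
</item>'

# Double quotes let the shell expand ${stamp} before sed sees the script.
result=$(printf '%s\n' "$feed" |
  sed "1,/====RSSpermalink====/s/====RSSpermalink====/${stamp}/")
printf '%s\n' "$result"
```

The range covers lines 1 to 2 here, so only the first placeholder is replaced and the second item keeps its placeholder for a later pass.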

  • 2020-11-22 04:45

    The following command removes the first occurrence of a string within a file. It also strips trailing spaces and deletes every empty line, including the one left behind by the removal. It is demonstrated on an XML file, but it works with any text file.

    Useful if you work with xml files and you want to remove a tag. In this example it removes the first occurrence of the "isTag" tag.

    Command:

    sed -e '0,/<isTag>false<\/isTag>/{s/<isTag>false<\/isTag>//}' -e 's/ *$//' -e '/^$/d' source.txt > output.txt
    

    Source file (source.txt)

    <xml>
        <testdata>
            <canUseUpdate>true</canUseUpdate>
            <isTag>false</isTag>
            <moduleLocations>
                <module>esa_jee6</module>
                <isTag>false</isTag>
            </moduleLocations>
            <node>
                <isTag>false</isTag>
            </node>
        </testdata>
    </xml>
    

    Result file (output.txt)

    <xml>
        <testdata>
            <canUseUpdate>true</canUseUpdate>
            <moduleLocations>
                <module>esa_jee6</module>
                <isTag>false</isTag>
            </moduleLocations>
            <node>
                <isTag>false</isTag>
            </node>
        </testdata>
    </xml>
    

    P.S. It didn't work for me on Solaris (SunOS 5.10, quite old), but it works on Linux 2.6 with sed version 4.1.5.
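
If blank lines elsewhere in the file should survive, one possible variation (GNU sed assumed) is to delete the first matching line outright instead of emptying it and then removing all empty lines. The sketch below uses invented sample data and assumes the tag sits alone on its line:

```shell
# Hypothetical input; each tag is assumed to occupy a whole line,
# and line 3 is an intentional blank line that should be kept.
input='<a>
<isTag>false</isTag>

<b>
<isTag>false</isTag>
</b>
</a>'

# GNU sed only (0,/re/): within the range, delete the one line that matches.
result=$(printf '%s\n' "$input" |
  sed '0,/<isTag>false<\/isTag>/{/<isTag>false<\/isTag>/d;}')
printf '%s\n' "$result"
```

The first tag line disappears, while the blank line and the second tag remain.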

  • 2020-11-22 04:45

    Nothing new, but perhaps a slightly more concrete answer: sed -rn '0,/foo(bar).*/ s%%\1%p'

    Example: xwininfo -name unity-launcher produces output like:

    xwininfo: Window id: 0x2200003 "unity-launcher"
    
      Absolute upper-left X:  -2980
      Absolute upper-left Y:  -198
      Relative upper-left X:  0
      Relative upper-left Y:  0
      Width: 2880
      Height: 98
      Depth: 24
      Visual: 0x21
      Visual Class: TrueColor
      Border width: 0
      Class: InputOutput
      Colormap: 0x20 (installed)
      Bit Gravity State: ForgetGravity
      Window Gravity State: NorthWestGravity
      Backing Store State: NotUseful
      Save Under State: no
      Map State: IsViewable
      Override Redirect State: no
      Corners:  +-2980+-198  -2980+-198  -2980-1900  +-2980-1900
      -geometry 2880x98+-2980+-198
    

    Extracting window ID with xwininfo -name unity-launcher|sed -rn '0,/^xwininfo: Window id: (0x[0-9a-fA-F]+).*/ s%%\1%p' produces:

    0x2200003
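
The same pattern can be tried without an X session. The sketch below (GNU sed assumed, sample data invented for illustration) extracts only the first version number from input that contains two candidate lines:

```shell
# Hypothetical input with two lines that match the pattern.
input='name: demo
version: 1.2.3 stable
version: 2.0.0 beta'

# -n suppresses automatic printing and -r enables extended regexes.
# 0,/re/ limits the work to the first matching line; s%%\1%p reuses that
# regex (empty pattern, % as delimiter), keeps capture group 1, and prints.
first_version=$(printf '%s\n' "$input" |
  sed -rn '0,/^version: ([0-9.]+).*/ s%%\1%p')
printf '%s\n' "$first_version"
```

Without the 0,/re/ range, the substitution would also fire on the second version line and print both numbers.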
    
  • 2020-11-22 04:46

    An overview of the many helpful existing answers, complemented with explanations:

    The examples here use a simplified use case: replace the word 'foo' with 'bar' in the first matching line only.
    Due to use of ANSI C-quoted strings ($'...') to provide the sample input lines, bash, ksh, or zsh is assumed as the shell.


    GNU sed only:

    Ben Hoffstein's answer shows us that GNU provides an extension to the POSIX specification for sed that allows the following 2-address form: 0,/re/ (re represents an arbitrary regular expression here).

    0,/re/ allows the regex to match on the very first line also. In other words: such an address will create a range from the 1st line up to and including the line that matches re - whether re occurs on the 1st line or on any subsequent line.

    • Contrast this with the POSIX-compliant form 1,/re/, which creates a range that matches from the 1st line up to and including the line that matches re on subsequent lines; in other words: this will not detect the first occurrence of an re match if it happens to occur on the 1st line and also prevents the use of shorthand // for reuse of the most recently used regex (see next point). [1]

    If you combine a 0,/re/ address with an s/.../.../ (substitution) call that uses the same regular expression, your command will effectively only perform the substitution on the first line that matches re.
    sed provides a convenient shortcut for reusing the most recently applied regular expression: an empty delimiter pair, //.

    $ sed '0,/foo/ s//bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo' 
    1st bar         # only 1st match of 'foo' replaced
    Unrelated
    2nd foo
    3rd foo
    

    A POSIX-features-only sed such as BSD (macOS) sed (will also work with GNU sed):

    Since 0,/re/ cannot be used and the form 1,/re/ will not detect re if it happens to occur on the very first line (see above), special handling for the 1st line is required.

    MikhailVS's answer mentions the technique, put into a concrete example here:

    $ sed -e '1 s/foo/bar/; t' -e '1,// s//bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
    1st bar         # only 1st match of 'foo' replaced
    Unrelated
    2nd foo
    3rd foo
    

    Note:

    • The empty regex // shortcut is employed twice here: once for the endpoint of the range, and once in the s call; in both cases, regex foo is implicitly reused, so it does not have to be duplicated, which makes for both shorter and more maintainable code.

    • POSIX sed needs actual newlines after certain functions, such as after the name of a label or even its omission, as is the case with t here; strategically splitting the script into multiple -e options is an alternative to using actual newlines: end each -e script chunk where a newline would normally need to go.

    1 s/foo/bar/ replaces foo on the 1st line only, if found there. If so, t branches to the end of the script (skips remaining commands on the line). (The t function branches to a label only if the most recent s call performed an actual substitution; in the absence of a label, as is the case here, the end of the script is branched to).

    When that happens, range address 1,//, which normally finds the first occurrence starting from line 2, will not match, and the range will not be processed, because the address is evaluated when the current line is already 2.

    Conversely, if there's no match on the 1st line, 1,// will be entered, and will find the true first match.

    The net effect is the same as with GNU sed's 0,/re/: only the first occurrence is replaced, whether it occurs on the 1st line or any other.


    NON-range approaches

    potong's answer demonstrates loop techniques that bypass the need for a range; since he uses GNU sed syntax, here are the POSIX-compliant equivalents:

    Loop technique 1: On first match, perform the substitution, then enter a loop that simply prints the remaining lines as-is:

    $ sed -e '/foo/ {s//bar/; ' -e ':a' -e '$!{n;ba' -e '};}' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
    1st bar
    Unrelated
    2nd foo
    3rd foo
    

    Loop technique 2, for smallish files only: read the entire input into memory, then perform a single substitution on it.

    $ sed -e ':a' -e '$!{N;ba' -e '}; s/foo/bar/' <<<$'1st foo\nUnrelated\n2nd foo\n3rd foo'
    1st bar
    Unrelated
    2nd foo
    3rd foo
    

    [1] 1.61803 provides examples of what happens with 1,/re/, with and without a subsequent s//:

    • sed '1,/foo/ s/foo/bar/' <<<$'1foo\n2foo' yields $'1bar\n2bar'; i.e., both lines were updated, because line number 1 matches the 1st line, and regex /foo/ - the end of the range - is then only looked for starting on the next line. Therefore, both lines are selected in this case, and the s/foo/bar/ substitution is performed on both of them.
    • sed '1,/foo/ s//bar/' <<<$'1foo\n2foo\n3foo' fails with sed: first RE may not be empty (BSD/macOS) or sed: -e expression #1, char 0: no previous regular expression (GNU), because at the time the 1st line is being processed (due to line number 1 starting the range), no regex has been applied yet, so // doesn't refer to anything.
      With the exception of GNU sed's special 0,/re/ syntax, any range that starts with a line number effectively precludes use of //.
  • 2020-11-22 04:50

    A possible solution:

        /#include/!{p;d;}
        i\
        #include "newfile.h"
        :a
        n
        ba
    

    Explanation:

    • Until the first #include is found, print each line and start a new cycle.
    • Insert the new include line before the first #include.
    • Enter a loop that just reads lines (by default sed also prints them); the first part of the script is never reached again from here.
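
The script above is shown without an invocation. One way to try it, sketched here with invented sample input, is to pass it to sed as a single multi-line -e argument. Saving it to a file and using sed -f should work equally well.

```shell
# The script from above, stored in a shell variable; the newline after i\
# is required, so the string spans several lines.
script='/#include/!{p;d;}
i\
#include "newfile.h"
:a
n
ba'

# Hypothetical input standing in for a C++ source file.
result=$(printf '%s\n' '// demo.cpp' '#include <vector>' '#include <string>' |
  sed -e "$script")
printf '%s\n' "$result"
```

The new directive appears once, directly above the first existing #include, and every other line passes through unchanged.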