How to search & replace arbitrary literal strings in sed and awk (and perl)

Posted by 放肆的年华 on 2019-12-20 03:48:06

Question


Say we have some arbitrary literals in a file that we need to replace with some other literal.

Normally, we'd just reach for sed(1) or awk(1) and code something like:

sed "s/$target/$replacement/g" file.txt

But what if $target and/or $replacement could contain characters that are special to sed(1), such as regular-expression metacharacters? You could escape them, but suppose you don't know in advance what they are; they are arbitrary. You'd need to code up something to escape all possible special characters, including the '/' separator, e.g.

t=$( echo "$target" | sed 's/\./\\./g; s/\*/\\*/g; s/\[/\\[/g; ...' ) # arghhh!

That's pretty awkward for such a simple problem.
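
For comparison, the escaping step can be condensed into two small helpers. This is only a sketch: the function names are made up for illustration, it assumes single-line strings and sed's basic regular expression syntax, and it has not been checked against every sed implementation.

```shell
# Hypothetical helper names; assumes single-line strings and sed basic regex syntax.

# Backslash-escape BRE metacharacters and the "/" delimiter in a search string.
sed_escape_pattern() {
    printf '%s' "$1" | sed 's|[][\.*^$/]|\\&|g'
}

# Backslash-escape "\", "&" and "/" in a replacement string.
sed_escape_replacement() {
    printf '%s' "$1" | sed 's|[\&/]|\\&|g'
}

# Usage: literal_sed TARGET REPLACEMENT < input
literal_sed() (
    t=$(sed_escape_pattern "$1")
    r=$(sed_escape_replacement "$2")
    sed "s/$t/$r/g"
)
```

It would be called as literal_sed "$target" "$replacement" < file.txt. The helpers use | as their own delimiter so that '/' can sit inside the bracket expression without further escaping.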

perl(1) has \Q ... \E quoting, but when the shell interpolates $target into the program text, even that can't cope with a '/' separator in $target:

perl -pe "s/\Q$target\E/$replacement/g" file.txt

I just posted an answer!! So my real question is, "is there a better way to do literal replacements in sed/awk/perl?"

If not, I'll leave this here in case it comes in useful.


Answer 1:


quotemeta, which implements \Q, does exactly what you ask for:

all ASCII characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash

Since this is presumably in a shell script, the real problem is how and when the shell variables get interpolated, and what the Perl program ends up seeing.

The best way is to avoid working out that interpolation mess and instead pass those shell variables to the Perl one-liner properly. This can be done in several ways; see this post for details.

Either pass the shell variables simply as arguments

#!/bin/bash

# define $target and $replacement

perl -pe'BEGIN { $patt = shift; $repl = shift }; s{\Q$patt}{$repl}g' "$target" "$replacement" file.txt

where the program is single-quoted, so the shell leaves $patt and $repl for Perl to interpret. The needed arguments are removed from @ARGV in a BEGIN block, so before runtime; then file.txt gets processed. There is no need for \E in the regex here.

Or, use the -s switch, which enables command-line switches for the program

# define $target and $replacement

perl -s -pe's{\Q$patt}{$repl}g' -- -patt="$target" -repl="$replacement" file.txt

The -- is needed to mark the start of arguments, and switches must come before filenames.

Finally, you can also export the shell variables, which can then be used in the Perl script via %ENV; but in general I'd rather recommend either of the above two approaches.


A full example

#!/bin/bash
# Last modified: 2019 Jan 06 (22:15)

target="/{"
replacement="&"

echo "Replace $target with $replacement"

perl -wE'
    BEGIN { $p = shift; $r = shift }; 
    $_=q(ah/{yes); s/\Q$p/$r/; say
' "$target" "$replacement"

This prints

Replace /{ with &
ah&yes

where I've used characters mentioned in a comment.

The other way

#!/bin/bash
# Last modified: 2019 Jan 06 (22:05)

target="/{"
replacement="&"

echo "Replace $target with $replacement"

perl -s -wE'$_ = q(ah/{yes); s/\Q$patt/$repl/; say' \
    -- -patt="$target" -repl="$replacement"

where the code is broken over two lines for readability (hence the trailing \). The output is the same.




Answer 2:


With awk you could do it like this:

awk -v t="$target" -v r="$replacement" '{gsub(t,r)} 1' file

The above treats t as a regular expression; to treat it as a literal string instead, you can use

awk -v t="$target" -v r="$replacement" '{while(i=index($0,t)){$0 = substr($0,1,i-1) r substr($0,i+length(t))} print}' file

Inspired by this post

Note that this loops forever if the replacement string contains the target, because every pass rescans the whole line from the start. The above link has solutions for that too.
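
One possible way around that limitation, sketched here rather than taken from the linked post, is to scan each line left to right and never rescan text that has already been replaced. The sketch also passes the strings through the environment (ENVIRON) instead of -v, since awk processes backslash escape sequences in -v values. The function name and the variable names T and R are illustrative.

```shell
# Hypothetical helper; T and R are illustrative environment variable names.
# Usage: literal_awk TARGET REPLACEMENT < input
literal_awk() {
    T="$1" R="$2" awk '
        BEGIN { t = ENVIRON["T"]; r = ENVIRON["R"]; n = length(t) }
        {
            out = ""       # text already processed, never rescanned
            rest = $0      # text still to be searched
            while (n > 0 && (i = index(rest, t)) > 0) {
                out = out substr(rest, 1, i - 1) r
                rest = substr(rest, i + n)
            }
            print out rest
        }'
}
```

It would be called as literal_awk "$target" "$replacement" < file. Like the loop above, it works one line at a time, so a target containing a newline will not match.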




Answer 3:


Me again!

Here's a simpler way using xxd(1):

t=$( printf '%s' "$target" | xxd -p | tr -d '\n' )
r=$( printf '%s' "$replacement" | xxd -p | tr -d '\n' )
xxd -p file.txt | tr -d '\n' | sed "s/$t/$r/g" | xxd -p -r

... so we're hex-encoding the original text with xxd(1) and doing search-replacement using hex-encoded search strings. Finally we hex-decode the result.

EDIT: I forgot to remove \n from the xxd output (| tr -d '\n') so that patterns can span the 60-column output of xxd. Of course, this relies on GNU sed's ability to operate on very long lines (limited only by memory).

EDIT: this also works on multi-line targets eg

target=$'foo\nbar' replacement=$'bar\nfoo'
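
One caveat with matching on a plain hex stream is that a match can start in the middle of a byte: the bytes 0x16 0x10 encode to 1610, which contains 61, the encoding of "a". A possible refinement, sketched below with made-up function names, puts a comma after every encoded byte so that matches can only begin on a byte boundary. It assumes xxd(1) is installed, that the target is non-empty, and that sed accepts very long lines.

```shell
# Hypothetical helper names; assumes xxd(1) and a sed that accepts very long lines.

# Hex-encode stdin as one line, with a comma after every byte: "ab" -> "61,62,"
hex_bytes() {
    xxd -p | tr -d '\n' | sed 's/../&,/g'
}

# Usage: literal_hex TARGET REPLACEMENT < input   (TARGET must be non-empty)
literal_hex() (
    t=$(printf '%s' "$1" | hex_bytes)
    r=$(printf '%s' "$2" | hex_bytes)
    # Substitute on the comma-delimited stream, then drop the commas and decode.
    hex_bytes | sed "s/$t/$r/g" | tr -d ',' | xxd -r -p
)
```

It would be called as literal_hex "$target" "$replacement" < file.txt, and multi-line targets still work because newlines are just another encoded byte.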



Source: https://stackoverflow.com/questions/54059656/how-to-search-replace-arbitrary-literal-strings-in-sed-and-awk-and-perl
