I have a string that I want to remove punctuation from.
I started with
sed 's/[[:punct:]]/ /g'
But I had problems on HP-UX n
You need to place the brackets early in the expression:
sed 's/[][=+...-]/ /g'
By placing the ']' as the first character immediately after the opening bracket, it is interpreted as a member of the character set rather than a closing bracket. Placing a '[' anywhere inside the brackets makes it a member of the set.
For this particular character set, you also need to deal with '-' specially, since you are not trying to build a range of characters between '[' and '='. So put the '-' at the end of the class.
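As a minimal sketch of these placement rules, the snippet below uses a small hypothetical set (']', '[', '+', '=' and '-') and a made-up input string. The '+' is placed right after '[' on purpose, so that the set never contains the sequence '[=', which POSIX reserves for equivalence classes.

```shell
# Hypothetical sample input, chosen only to exercise each character in the set.
input='a[b]c=d+e-f'

# ']' comes first and '-' comes last, so both are literal members of the set.
# LC_ALL=C is an assumption made here to keep the behaviour locale-independent.
result=$(printf '%s\n' "$input" | LC_ALL=C sed 's/[][+=-]/ /g')

printf '%s\n' "$result"   # prints: a b c d e f
```

Moving the '-' anywhere between two other characters would turn it into a range operator instead of a literal.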
You can do it manually:
sed 's/[][\/$*.^|@#{}~&()_:;%+"='\'',`><?!-]/ /g'
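This removes all 32 ASCII punctuation characters. The order of some characters is important:
'-' should be at the end, like this: -]
'[' and ']' should come first, like this: [][other characters]
A single quote (') should be escaped like this: '\''
'^' must not come first, as in [^
The set must not contain the sequences [. [= [: or their closing counterparts .] =] :] since these delimit collating symbols, equivalence classes and character classes
The set should not end with $]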
The reasons for all of this are explained here: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_03
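One way to check the hand-written set is to feed it every ASCII punctuation character and compare the result with the POSIX class. This is only a sketch: the variable names are made up, LC_ALL=C is an assumption, and it presumes a sed whose [[:punct:]] behaves correctly and which keeps the backslash in '\/' literal inside a bracket expression (GNU sed does).

```shell
# All 32 ASCII punctuation characters; the single quote is written as '\''.
punct='!"#$%&'\''()*+,-./:;<=>?@[\]^_`{|}~'

# The hand-written set from the command above.
manual=$(printf '%s\n' "$punct" |
  LC_ALL=C sed 's/[][\/$*.^|@#{}~&()_:;%+"='\'',`><?!-]/ /g')

# The POSIX class, for comparison on systems where it works as expected.
class=$(printf '%s\n' "$punct" | LC_ALL=C sed 's/[[:punct:]]/ /g')

if [ "$manual" = "$class" ]; then
  echo "same result"
else
  echo "results differ"
fi
```

If the two results differ on a given system, printing both variables between visible delimiters shows which character was missed.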
You can also specify the characters you want to keep, by inverting the set:
sed 's/[^a-zA-Z0-9]/ /g'
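Note that an inverted set replaces everything that is not listed, not just punctuation. The sketch below, using a hypothetical input string, shows that '_' is replaced as well unless it is added to the set; whether you want to keep it is an assumption about your data.

```shell
# Hypothetical sample input.
string='narrowPeak_SP1[FLAG]'

# Keep only ASCII letters and digits; '_' is not listed, so it is replaced too.
strict=$(printf '%s\n' "$string" | LC_ALL=C sed 's/[^a-zA-Z0-9]/ /g')

# Variant: also keep '_' by adding it to the inverted set.
keep_us=$(printf '%s\n' "$string" | LC_ALL=C sed 's/[^a-zA-Z0-9_]/ /g')

# Square brackets in the output make the trailing space visible.
printf '[%s]\n' "$strict"    # prints: [narrowPeak SP1 FLAG ]
printf '[%s]\n' "$keep_us"   # prints: [narrowPeak_SP1 FLAG ]
```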
This can also be handled with a regex capture group, for example:
echo "narrowPeak_SP1[FLAG]" | sed -e 's/\[\([a-zA-Z0-9]*\)\]/_\1/g'
> narrowPeak_SP1_FLAG
\[ : matches a literal opening square bracket (unescaped, [ would start a bracket expression)
\] : matches a literal closing square bracket
\(...\) : capture group
\1 : back-reference to the captured group, i.e. the text between the square brackets
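As a further sketch of the same substitution, the hypothetical input below contains several bracketed tags. Because of the g flag every alphanumeric tag is rewritten, while a tag containing any other character (here '-') does not match and is left untouched.

```shell
# Hypothetical input: two alphanumeric tags and one tag containing '-'.
name='peak[A]_rep[B2]_[x-y]'

# Same substitution as above: '[tag]' becomes '_tag' when tag is alphanumeric.
renamed=$(printf '%s\n' "$name" |
  LC_ALL=C sed -e 's/\[\([a-zA-Z0-9]*\)\]/_\1/g')

printf '%s\n' "$renamed"   # prints: peak_A_rep_B2_[x-y]
```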
Here is the final code I ended up with:
`echo "$string" | sed 's/[^a-zA-Z0-9]/ /g'`
I had to put '=' and '-' at the very end.