How to remove leading and trailing whitespace?


I'm using `awk '{gsub(/^[ \t]+|[ \t]+$/,""); print;}' in.txt > out.txt` to remove both leading and trailing whitespace.

The problem is the output f

4 answers

爱一瞬间的悲伤

2021-02-07 13:45
Your code works for me.
The input may contain characters other than spaces and tabs.
`hexdump -C` can help you check what is actually there:
```
awk '{gsub(/^[ \t]+|[ \t]+$/,""); print;}' in.txt | hexdump -C | less
```
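Where `hexdump` is not installed, `od -c` gives a comparable byte-by-byte view. The following is only a sketch: the sample line (text followed by a space and a DC4, octal 024) and the `sample` function name are made up for illustration. It shows how a stray control character shows up after the original trim has run.

```shell
# Hypothetical sample line: text followed by a space and DC4 (octal 024).
sample() { printf 'TEXT \024\n'; }

# Run the original trim, then dump each byte. Anything other than \n left
# after "TEXT" is a character that the [ \t] class did not match.
dump=$(sample | awk '{gsub(/^[ \t]+|[ \t]+$/,""); print;}' | od -An -c)
echo "$dump"
```

The space survives here because it is followed by the DC4, so it is not at the end of the line as far as the `[ \t]+$` pattern is concerned.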
UPDATE:

OK, you identified DC4 (there may be other control characters as well).
In that case, you can extend the command:
```
awk '{gsub(/^[[:cntrl:][:space:]]+|[[:cntrl:][:space:]]+$/,""); print;}' in.txt > out.txt
```
See the awk man page:

[:alnum:] Alphanumeric characters.
[:alpha:] Alphabetic characters.
[:blank:] Space or tab characters.
[:cntrl:] Control characters.
[:digit:] Numeric characters.
[:graph:] Characters that are both printable and visible. (A space is printable, but not visible, while an a is both.)
[:lower:] Lower-case alphabetic characters.
[:print:] Printable characters (characters that are not control characters.)
[:punct:] Punctuation characters (characters that are not letters, digits, control characters, or space characters).
[:space:] Space characters (such as space, tab, and formfeed, to name a few).
[:upper:] Upper-case alphabetic characters.
[:xdigit:] Characters that are hexadecimal digits.
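As a quick check of the `[:cntrl:]` and `[:space:]` classes listed above, here is a minimal sketch that wraps the extended command in a function and feeds it a line padded with DC4 characters. The helper name `trim_ctrl_space` and the sample input are assumptions for illustration, not part of the original question.

```shell
# trim_ctrl_space: hypothetical helper name. Strips leading and trailing
# whitespace and control characters from every line read on stdin.
trim_ctrl_space() {
  awk '{ gsub(/^[[:cntrl:][:space:]]+|[[:cntrl:][:space:]]+$/, ""); print }'
}

# Sample line: DC4 (octal 024), spaces and tabs around the text.
printf '\024 \tTEXT\t \024\n' | trim_ctrl_space | od -An -c
```

Only the bytes of `TEXT` and the final newline should remain in the dump.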

Leading/Trailing 0x20 removal

The command works for me; I tested it like this:
```
$ echo -e "\x20 \tTEXT\x20 \t" | hexdump -C
00000000  20 20 09 54 45 58 54 20  20 09 0a                 |  .TEXT  ..|
0000000b
$ echo -e "\x20 \tTEXT\x20 \t" | awk '{gsub(/^[[:cntrl:][:space:]]+|[[:cntrl:][:space:]]+$/,""); print;}' | hexdump -C
00000000  54 45 58 54 0a                                    |TEXT.|
00000005
```
However, a 0x20 in the middle of the text is not removed.
But that is not what you asked, is it?
别那么骄傲

2021-02-07 14:00
Perl can also be used:
```
perl -lpe 's/^\s*(.*\S)\s*$/$1/' in.txt > out.txt
```
s/foo/bar/ substitute using regular expressions
^ beginning of string
\s* zero or more whitespace characters
(.*\S) any characters ending with a non-whitespace character; captured into $1
\s* zero or more whitespace characters
$ end of string
天涯浪人

2021-02-07 14:09

Your files probably have Windows line endings. That means each line ends with \r\n, so matching a run of tabs and spaces at the end of the line won't work: the last character awk sees on the line is the \r, and the tabs and spaces in front of it are no longer at the end. Try running the file through tr -d "\r" before sending it to awk.
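A small sketch of that pipeline, assuming a made-up sample line with a CRLF ending. The names `crlf_line` and `trim_crlf` are hypothetical helpers, not anything from the question.

```shell
# Hypothetical sample: leading and trailing blanks plus a Windows (CRLF) ending.
crlf_line() { printf ' \tTEXT \t\r\n'; }

# Delete carriage returns first, then trim blanks as in the question.
trim_crlf() {
  tr -d '\r' | awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); print }'
}

crlf_line | trim_crlf | od -An -c
```

Without the `tr` step, the trailing space, tab and `\r` all stay in the output, which matches the symptom described in this answer.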

渐次进展

2021-02-07 14:10
This command works for me:
```
$ awk '{$1=$1}1' file.txt
```
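Note that this idiom does more than trim. Assigning to a field makes awk rebuild the record from its fields joined by the output field separator (a single space by default), so runs of blanks inside the line are collapsed too. A short comparison on a made-up input line:

```shell
# Field reassignment: trims the ends AND squeezes interior blanks.
printf '  a \t b  \n' | awk '{$1=$1}1'

# Anchored gsub: trims the ends only, interior blanks are left as they were.
printf '  a \t b  \n' | awk '{gsub(/^[ \t]+|[ \t]+$/,""); print}'
```

The first command prints `a b`; the second keeps the space, tab and space between `a` and `b`.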
