How to strip all whitespace from string

前端未结

How do I strip all the spaces in a python string? For example, I want a string like strip my spaces to be turned into stripmyspaces, but I cannot s

11 answers

佛祖请我去吃肉

2020-11-28 18:50

As mentioned by Roger Pate, the following code worked for me:

>>> s = " \t foo \n bar "
>>> "".join(s.split())
'foobar'

I am using Jupyter Notebook to run the following code:

# new_list comes from an earlier notebook cell, e.g. new_list[i] == ' Plain   Utthapam  '
ProductList = []
i = 0
while i < len(new_list):
    # new_list[i].strip() alone would give 'Plain   Utthapam' (inner spaces kept)
    temp = "".join(new_list[i].split())  # o/p: 'PlainUtthapam'
    temp = temp.upper()                  # o/p: 'PLAINUTTHAPAM'
    ProductList.append(temp)
    i = i + 2                            # step of 2: only every second element is processed
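
The same idea can be wrapped in a small helper. This is only a minimal sketch: the name `remove_whitespace` and the sample list are made up for illustration and are not part of the original notebook.

```python
def remove_whitespace(text):
    """Return text with all whitespace removed (hypothetical helper)."""
    # str.split() with no arguments splits on any run of whitespace and
    # drops leading/trailing whitespace, so joining the pieces with an
    # empty string removes all of it.
    return "".join(text.split())


# Sample data invented for this example.
new_list = [" Plain   Utthapam  ", "\tMasala Dosa\n"]
product_list = [remove_whitespace(item).upper() for item in new_list]
print(product_list)
```

Unlike the loop above, the comprehension visits every element rather than every second one.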

隐瞒了意图╮

2020-11-28 18:57

The standard techniques to filter a list apply, although they are not as efficient as the split/join or translate methods.

We need a set of whitespace characters:

>>> import string
>>> ws = set(string.whitespace)

The filter builtin:

>>> "".join(filter(lambda c: c not in ws, "strip my spaces"))
'stripmyspaces'

A list comprehension (yes, use the brackets: see benchmark below):

>>> import string
>>> "".join([c for c in "strip my spaces" if c not in ws])
'stripmyspaces'

A fold (with an empty-string initializer, so that a leading whitespace character is dropped too):

>>> import functools
>>> functools.reduce(lambda acc, c: acc if c in ws else acc + c, "strip my spaces", "")
'stripmyspaces'

Benchmark:

>>> from timeit import timeit
>>> timeit('"".join("strip my spaces".split())')
0.17734256500003198
>>> timeit('"strip my spaces".translate(ws_dict)', 'import string; ws_dict = {ord(ws):None for ws in string.whitespace}')
0.457635745999994
>>> timeit('re.sub(r"\s+", "", "strip my spaces")', 'import re')
1.017787621000025

>>> SETUP = 'import string, operator, functools, itertools; ws = set(string.whitespace)'
>>> timeit('"".join([c for c in "strip my spaces" if c not in ws])', SETUP)
0.6484303600000203
>>> timeit('"".join(c for c in "strip my spaces" if c not in ws)', SETUP)
0.950212219999969
>>> timeit('"".join(filter(lambda c: c not in ws, "strip my spaces"))', SETUP)
1.3164566040000523
>>> timeit('"".join(functools.reduce(lambda acc, c: acc if c in ws else acc+c, "strip my spaces"))', SETUP)
1.6947649049999995
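
The benchmark uses `str.translate` without showing it on its own. A minimal sketch of that approach follows; the names `WS_TABLE` and `strip_ws_translate` are illustrative, not from the answer.

```python
import string

# Mapping a code point to None tells str.translate to delete that character.
WS_TABLE = {ord(ch): None for ch in string.whitespace}


def strip_ws_translate(text):
    """Delete ASCII whitespace via str.translate (illustrative helper)."""
    return text.translate(WS_TABLE)


print(strip_ws_translate("strip my spaces"))
```

Note that `string.whitespace` contains only the six ASCII whitespace characters, so a non-breaking space (U+00A0), for instance, is left untouched by this table.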

陌清茗

2020-11-28 19:01
For Python 3:
```
>>> import re
>>> re.sub(r'\s+', '', 'strip my \n\t\r ASCII and \u00A0 \u2003 Unicode spaces')
'stripmyASCIIandUnicodespaces'
>>> # Or, depending on the situation:
>>> re.sub(r'(\s|\u180B|\u200B|\u200C|\u200D|\u2060|\uFEFF)+', '', \
... '\uFEFF\t\t\t strip all \u000A kinds of \u200B whitespace \n')
'stripallkindsofwhitespace'
```
...handles any whitespace characters that you're not thinking of - and believe us, there are plenty.

\s on its own always covers the ASCII whitespace:
- (regular) space
- tab
- new line (\n)
- carriage return (\r)
- form feed
- vertical tab
Additionally:
- for Python 2 with re.UNICODE enabled,
- for Python 3 without any extra actions,
...\s also covers the Unicode whitespace characters, for example:
- non-breaking space,
- em space,
- ideographic space,
...etc. See the full list here, under "Unicode characters with White_Space property".

However, \s DOES NOT cover characters that are not classified as whitespace but are de facto whitespace, such as, among others:
- zero-width joiner,
- Mongolian vowel separator,
- zero-width non-breaking space (a.k.a. byte order mark),
...etc. See the full list here, under "Related Unicode characters without White_Space property".

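These characters are therefore listed explicitly in the second regex: \u180B|\u200B|\u200C|\u200D|\u2060|\uFEFF. Note that the Mongolian vowel separator itself is U+180E (U+180B is the Mongolian free variation selector one), so add \u180E to the alternation if you need to strip it.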
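
As a rough sketch, the second regex can be precompiled into a reusable helper. The names `INVISIBLE` and `strip_invisible` are hypothetical, and the set of extra code points is simply the one from the regex above; extend it as your data requires.

```python
import re

# For str patterns in Python 3, \s already matches Unicode whitespace.
# The extra code points are the ones listed in the second regex above.
INVISIBLE = re.compile(r"[\s\u180B\u200B\u200C\u200D\u2060\uFEFF]+")


def strip_invisible(text):
    """Remove whitespace plus the listed zero-width characters (illustrative)."""
    return INVISIBLE.sub("", text)


sample = "\uFEFF\t\t\t strip all \u000A kinds of \u200B whitespace \n"
print(strip_invisible(sample))
```

A character class is used here instead of the alternation; for single characters the two forms match the same input.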

Sources:
- https://docs.python.org/2/library/re.html
- https://docs.python.org/3/library/re.html
- https://en.wikipedia.org/wiki/Unicode_character_property

青春惊慌失措

2020-11-28 19:03

Try a regex with re.sub. You can search for all whitespace and replace with an empty string.

\s in your pattern will match any whitespace character, not just a space but also tabs, newlines, etc. You can read more about it in the manual.
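
A minimal sketch of this suggestion, assuming the pattern is reused and therefore compiled once (the names `WHITESPACE`, `text` and `result` are only illustrative):

```python
import re

# One or more whitespace characters: spaces, tabs, newlines, etc.
WHITESPACE = re.compile(r"\s+")

text = "strip my\tspaces\n"
result = WHITESPACE.sub("", text)  # replace every match with an empty string
print(result)
```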

误落风尘

2020-11-28 19:04
The simplest is to use replace:
```
"foo bar\t".replace(" ", "").replace("\t", "")
```
Alternatively, use a regular expression:
```
import re
re.sub(r"\s", "", "foo bar\t")
```

-上瘾入骨i

2020-11-28 19:04

Remove the leading spaces in Python

string1 = "    This is Test String to strip leading space"
print(string1)
print(string1.lstrip())

Remove the trailing spaces in Python

string2 = "This is Test String to strip trailing space     "
print(string2)
print(string2.rstrip())

Remove the whitespace from the beginning and end of the string in Python

string3 = "    This is Test String to strip leading and trailing space      "
print(string3)
print(string3.strip())

Remove all the spaces in Python (only the space character; tabs and newlines are left in place)

string4 = "   This is Test String to test all the spaces        "
print(string4)
print(string4.replace(" ", ""))
