Stripping everything but alphanumeric chars from a string in Python


不思量自难忘°

What is the best way to strip all non-alphanumeric characters from a string using Python?

The solutions presented in the PHP variant of this question will probably


11 answers

迷失自我

2020-11-22 11:23
```
sent = "".join(e for e in sent if e.isalnum())  # isalnum keeps digits too; isalpha would drop them
```
耶瑟儿～

2020-11-22 11:25
You could try:
```
print(''.join(ch for ch in some_string if ch.isalnum()))
```
温柔的废话

2020-11-22 11:26
How about:
```
def ExtractAlphanumeric(InputString):
    from string import ascii_letters, digits
    return "".join([ch for ch in InputString if ch in (ascii_letters + digits)])
```
This works by using a list comprehension to build a list of the characters in InputString that appear in the combined ascii_letters and digits strings, then joining that list into a single string.
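One thing worth noting about this approach: `ascii_letters + digits` covers ASCII only, whereas `str.isalnum()` also accepts non-ASCII letters and digits. A minimal sketch of the difference, where `extract_ascii_alnum` and `extract_unicode_alnum` are hypothetical names chosen for this illustration:

```python
from string import ascii_letters, digits

# Build the lookup set once; membership tests on a set are O(1).
_ASCII_ALNUM = frozenset(ascii_letters + digits)


def extract_ascii_alnum(text):
    """Keep only ASCII letters and digits (hypothetical helper name)."""
    return "".join(ch for ch in text if ch in _ASCII_ALNUM)


def extract_unicode_alnum(text):
    """Keep every character that str.isalnum() accepts, including non-ASCII letters."""
    return "".join(ch for ch in text if ch.isalnum())


sample = "Caf\u00e9 #42!"  # "Café #42!" with a precomposed e-acute
print(extract_ascii_alnum(sample))    # Caf42
print(extract_unicode_alnum(sample))  # Café42
```

Which of the two is appropriate depends on whether accented or non-Latin letters should survive the filtering.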

深忆病人

2020-11-22 11:26

If I understood correctly, the easiest way is to use a regular expression, since it gives you a lot of flexibility. Another simple method is a for loop. The code below shows an example; I also counted the occurrences of each word and stored them in a dictionary.

```
s = """An... essay is, generally, a piece of writing that gives the author's own 
argument — but the definition is vague, 
overlapping with those of a paper, an article, a pamphlet, and a short story. Essays 
have traditionally been 
sub-classified as formal and informal. Formal essays are characterized by "serious 
purpose, dignity, logical 
organization, length," whereas the informal essay is characterized by "the personal 
element (self-revelation, 
individual tastes and experiences, confidential manner), humor, graceful style, 
rambling structure, unconventionality 
or novelty of theme," etc.[1]"""

d = {}  # maps each cleaned word to its number of occurrences
words = s.split()  # split the text on whitespace and store the pieces in a list
for word in words:
    new_word = ''
    for c in word:
        if c.isalnum():  # keep the character only if it is alphanumeric
            new_word = new_word + c
    if new_word:  # skip tokens that were pure punctuation
        print(new_word, end=' ')
        d[new_word] = d.get(new_word, 0) + 1
print()
print(d)
```
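
The regular-expression route mentioned in this answer can be sketched roughly as follows. This is only an illustration: the sample sentence and the name `count_clean_words` are made up here, and `collections.Counter` is swapped in for the hand-written dictionary counting.

```python
import re
from collections import Counter

# [\W_]+ matches runs of characters that are neither letters nor digits.
# \W does not match the underscore, so it is listed explicitly.
NON_ALNUM = re.compile(r"[\W_]+")


def count_clean_words(text):
    """Strip non-alphanumerics from each whitespace-separated word and count the results."""
    cleaned = (NON_ALNUM.sub("", word) for word in text.split())
    return Counter(word for word in cleaned if word)  # drop tokens that became empty


counts = count_clean_words("An... essay is, generally, an essay - is it?")
print(counts["essay"])  # 2
print(counts["is"])     # 2
```

Changing the character class is the flexible part: for example, a pattern that keeps hyphens would only need the hyphen excluded from the class.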

Please rate this answer if you find it useful!


囚心锁ツ

2020-11-22 11:27

I just timed some functions out of curiosity. In these tests I'm removing non-alphanumeric characters from string.printable (part of the built-in string module). Compiling '[\W_]+' once and calling pattern.sub('', str) turned out to be the fastest.

```
$ python -m timeit -s \
    "import string" \
    "''.join(ch for ch in string.printable if ch.isalnum())"
10000 loops, best of 3: 57.6 usec per loop

$ python -m timeit -s \
    "import string" \
    "filter(str.isalnum, string.printable)"
10000 loops, best of 3: 37.9 usec per loop

$ python -m timeit -s \
    "import re, string" \
    "re.sub('[\W_]', '', string.printable)"
10000 loops, best of 3: 27.5 usec per loop

$ python -m timeit -s \
    "import re, string" \
    "re.sub('[\W_]+', '', string.printable)"
100000 loops, best of 3: 15 usec per loop

$ python -m timeit -s \
    "import re, string; pattern = re.compile('[\W_]+')" \
    "pattern.sub('', string.printable)"
100000 loops, best of 3: 11.2 usec per loop
```
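
These timings appear to come from Python 2 (where `filter` on a string returns a string), so the numbers should not be expected to carry over. Below is a rough sketch for repeating the comparison on Python 3 with the `timeit` module. The `candidates` dictionary and the loop count are choices made for this illustration, and results will vary by machine and interpreter version.

```python
import re
import string
import timeit

pattern = re.compile(r"[\W_]+")  # raw string avoids the invalid-escape warning for \W

# Each candidate strips non-alphanumeric characters from string.printable.
candidates = {
    "join + isalnum": lambda: "".join(ch for ch in string.printable if ch.isalnum()),
    "filter + join": lambda: "".join(filter(str.isalnum, string.printable)),
    "re.sub": lambda: re.sub(r"[\W_]+", "", string.printable),
    "compiled pattern": lambda: pattern.sub("", string.printable),
}

# Sanity check: all approaches must agree before their timings are comparable.
results = {name: func() for name, func in candidates.items()}
assert len(set(results.values())) == 1

for name, func in candidates.items():
    seconds = timeit.timeit(func, number=10_000)
    print(f"{name:18s} {seconds / 10_000 * 1e6:6.2f} usec per loop")
```

On Python 3, `filter` returns a lazy iterator, so the sketch wraps it in `"".join(...)` to make it do the same work as the other candidates.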


庸人自扰

2020-11-22 11:27

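```
# The loop iterates over the original string object; rebinding my_string
# inside the body does not affect the iteration.
for char in my_string:
    if not char.isalnum():
        my_string = my_string.replace(char, "")
```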
