I would like to remove any single letter from a string in Python.
For example:
input: 'z 23rwqw a 34qf34 h 343fsdfd'
output: '23rwqw 34qf34 343fsdfd'
Try this one:
(?<![\w])(?:[a-zA-Z0-9](?: |$))
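As a rough sketch of how that pattern could be applied with re.sub: the function name remove_single_chars and the trailing strip() are assumptions added here, not part of the answer. Note that the character class also matches single digits, so a lone "7" is removed as well.

```python
import re

# Pattern from the answer above: one alphanumeric character that is not
# preceded by a word character and is followed by a space or the end of the string.
SINGLE_CHAR = re.compile(r'(?<![\w])(?:[a-zA-Z0-9](?: |$))')

def remove_single_chars(text):
    # strip() is an addition: a match at the very end of the string
    # leaves the space in front of it behind.
    return SINGLE_CHAR.sub('', text).strip()

print(remove_single_chars('z 23rwqw a 34qf34 h 343fsdfd'))  # 23rwqw 34qf34 343fsdfd
```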
I hope there's a neater regex way than this, but:
>>> import re
>>> text = 'z 23rwqw a 34qf34 h 343fsdfd'
>>> re.sub('(\\b[A-Za-z] \\b|\\b [A-Za-z]\\b)', '', text)
'23rwqw 34qf34 343fsdfd'
It's a word boundary, a single letter, a space, and a word boundary.
It's doubled up so it can match a single character at the start or end of the string (z_ and _z), leaving no space, and a character in the middle (_z_), leaving one space.
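The three cases described above (start, end, middle) can be checked one at a time. This is only an illustrative sketch; the helper name strip_single_letters is made up here, and the pattern is the same one written as a raw string.

```python
import re

# Same pattern as in the answer, written as a raw string.
PATTERN = re.compile(r'(\b[A-Za-z] \b|\b [A-Za-z]\b)')

def strip_single_letters(text):
    return PATTERN.sub('', text)

print(repr(strip_single_letters('z abc')))      # start of string: 'abc'
print(repr(strip_single_letters('abc z')))      # end of string:   'abc'
print(repr(strip_single_letters('abc z def')))  # middle:          'abc def'
```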
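import re
text = "z 23rwqw a 34qf34 h 343fsdfd"
# Replace a single word character bounded by spaces or the string ends, then trim.
print(re.sub(r'(?:^| )\w(?:$| )', ' ', text).strip())
or
# Blank out every standalone word character, then collapse the leftover spaces.
tmp = re.sub(r'\b\w\b', ' ', text)
print(re.sub(r'\s{2,}', ' ', tmp).strip())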
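The two variants agree on the example string but can differ when several single characters sit next to each other, because the first pattern consumes the space that the next match would need. A small sketch to compare them (the function names variant_one and variant_two are hypothetical):

```python
import re

def variant_one(text):
    # Single word character with a space or string edge on both sides.
    return re.sub(r'(?:^| )\w(?:$| )', ' ', text).strip()

def variant_two(text):
    # Blank out standalone word characters, then collapse runs of whitespace.
    tmp = re.sub(r'\b\w\b', ' ', text)
    return re.sub(r'\s{2,}', ' ', tmp).strip()

sample = 'z 23rwqw a 34qf34 h 343fsdfd'
print(variant_one(sample))       # 23rwqw 34qf34 343fsdfd
print(variant_two(sample))       # 23rwqw 34qf34 343fsdfd

# Adjacent single letters: the first variant skips 'b'.
print(variant_one('a b c def'))  # b def
print(variant_two('a b c def'))  # def
```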
>>> ' '.join(w for w in text.split() if len(w) > 1)
'23rwqw 34qf34 343fsdfd'
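The split/join approach can be wrapped in a small function. This is a sketch, and the name drop_short_tokens and the min_len parameter are assumptions for illustration. It drops every one-character token (digits and punctuation too, not only letters) and also collapses any run of whitespace into a single space.

```python
def drop_short_tokens(text, min_len=2):
    # split() with no arguments splits on any whitespace run,
    # so extra spaces disappear when the tokens are joined again.
    return ' '.join(w for w in text.split() if len(w) >= min_len)

print(drop_short_tokens('z 23rwqw a 34qf34 h 343fsdfd'))  # 23rwqw 34qf34 343fsdfd
```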
I had a similar issue and came up with the following regex solution:
import re
pattern = r"((?<=^)|(?<= )).((?=$)|(?= ))"
text = "z 23rwqw a 34qf34 h 343fsdfd"
print(re.sub(r"\s+", " ", re.sub(pattern, '', text).strip()))
#23rwqw 34qf34 343fsdfd
Explanation
- (?<=^) and (?<= ) are look-behinds for the start of the string and a space, respectively. Match either of these conditions using | (or).
- . matches any single character.
- ((?=$)|(?= )) is similar to the first bullet point, except it's a look-ahead for either the end of the string or a space.
Finally, call re.sub("\s+", " ", my_string) to condense multiple spaces into a single space.
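To see why the final whitespace pass is needed, the steps can be run separately. This is just a sketch of the same pipeline with the intermediate values printed; the variable names removed, trimmed and condensed are made up here.

```python
import re

PATTERN = r"((?<=^)|(?<= )).((?=$)|(?= ))"
text = "z 23rwqw a 34qf34 h 343fsdfd"

# Step 1: delete each isolated character; the spaces around it stay behind.
removed = re.sub(PATTERN, "", text)
print(repr(removed))    # ' 23rwqw  34qf34  343fsdfd'

# Step 2: trim the ends, then condense runs of whitespace to one space.
trimmed = removed.strip()
condensed = re.sub(r"\s+", " ", trimmed)
print(repr(condensed))  # '23rwqw 34qf34 343fsdfd'
```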