I know how to do this if I iterate through all of the characters in the string but I am looking for a more elegant method.
There are a variety of ways to achieve this; some are clearer than others. For each of my examples, 'True' means that the string passed is valid, and 'False' means it contains invalid characters.
First of all, there's the naive approach:
import string
# string.letters exists only in Python 2; string.ascii_letters works in both 2 and 3
allowed = string.ascii_letters + string.digits + '_' + '-'
def check_naive(mystring):
    return all(c in allowed for c in mystring)
Then there's the regular expression approach, using re.match(). Note that '-' has to be at the end of the [] (or at the start, or escaped); otherwise it is treated as a range delimiter. Also note the $, which means 'end of string' (in Python it also matches just before a trailing newline). Other answers to this question use the special character class '\w'. I always prefer an explicit character class range using [] because it is easier to understand without having to look up a quick reference guide, and easier to special-case.
import re
CHECK_RE = re.compile(r'[a-zA-Z0-9_-]+$')
def check_re(mystring):
    # bool() turns the match object (or None) into True/False
    return bool(CHECK_RE.match(mystring))
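One subtlety of the `$` anchor is worth a short sketch: in Python, `$` also matches just before a newline at the very end of the string, so a value such as `"abc\n"` passes the check. The names `check_dollar` and `check_strict` below are made up for illustration; `\Z` is the standard `re` anchor that matches only at the true end of the string.

```python
import re

# Same character class, two different end anchors.
DOLLAR_RE = re.compile(r'[a-zA-Z0-9_-]+$')   # '$' tolerates one trailing newline
STRICT_RE = re.compile(r'[a-zA-Z0-9_-]+\Z')  # '\Z' matches only at the very end

def check_dollar(mystring):
    """Hypothetical helper: validate using the '$' anchor."""
    return bool(DOLLAR_RE.match(mystring))

def check_strict(mystring):
    """Hypothetical helper: validate using the '\\Z' anchor."""
    return bool(STRICT_RE.match(mystring))
```

If trailing newlines should be rejected (for example when validating user names read from a file), the `\Z` variant is probably the safer choice.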
Another solution noted that you can do an inverse match with regular expressions; I've included that here now. Note that the leading ^ inside [^...] inverts the character class:
CHECK_INV_RE = re.compile(r'[^a-zA-Z0-9_-]')
def check_inv_re(mystring):
    return not CHECK_INV_RE.search(mystring)
You can also do something tricky with the 'set' object. Have a look at this example, which removes from the original string all the characters that are allowed, leaving us with a set containing either a) nothing, or b) the offending characters from the string:
def check_set(mystring):
    return not set(mystring) - set(allowed)
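The set difference can also be used to report which characters are invalid, not just whether any exist. This is a minimal sketch; `ALLOWED` and `invalid_chars` are names chosen here for illustration, and the allowed set assumes ASCII letters, digits, underscore, and hyphen.

```python
import string

# Build the allowed set once so each call only pays for the difference.
ALLOWED = set(string.ascii_letters + string.digits + '_-')

def invalid_chars(mystring):
    """Return the sorted list of characters in mystring that are not allowed."""
    return sorted(set(mystring) - ALLOWED)
```

An empty list means the string is valid, so `not invalid_chars(s)` behaves like the set-based check while the list itself can go into an error message.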
A regular expression will do the trick with very little code:
import re
...
if re.match("^[A-Za-z0-9_-]*$", my_little_string):
    # do something here
    pass
Use a regex and see if it matches!
^[a-zA-Z0-9_-]*$
If it were not for the dashes and underscores, the easiest solution would be
my_little_string.isalnum()
(Section 3.6.1 of the Python Library Reference)
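A few caveats about `isalnum()` are easy to miss, so here is a small sketch. It assumes Python 3.7 or later (for `str.isascii()`), and `is_ascii_alnum` is a hypothetical helper name, not something from the library.

```python
def is_ascii_alnum(s):
    """Return True only for non-empty strings of ASCII letters and digits.

    str.isalnum() alone also accepts non-ASCII letters and digits,
    and it returns False for the empty string.
    """
    return s.isascii() and s.isalnum()
```

Note that this still rejects underscores and hyphens, which is why the regular expression or set approaches fit the original question better.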
import re
pat = re.compile(r'[^\w-]')
def onlyallowed(s):
    return not pat.search(s)
Regular expressions can be very flexible.
import re
re.fullmatch(r"^[\w-]+$", target_string)  # re.fullmatch requires Python 3.4 or later
\w: Matches [a-zA-Z0-9_] (in Python 3 it also matches non-ASCII word characters unless the re.ASCII flag is passed), so you need to add - to allow the hyphen character.
+: Matches one or more repetitions of the preceding character class. I assume you don't accept blank input; if you do, change it to *.
^: Matches the start of the string.
$: Matches the end of the string.
You need these two anchors (re.fullmatch already implies them) to rule out the following case, where unwanted characters such as & appear around or between the matched parts:
&&&PATTERN&&PATTERN
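To make the `\w` behaviour concrete, here is a small sketch comparing the default (Unicode) matching in Python 3 with the `re.ASCII` flag. The function names are illustrative only; `re.fullmatch` and `re.ASCII` are standard `re` features in Python 3.4 and later.

```python
import re

# Without flags, \w on a str pattern matches Unicode word characters.
UNICODE_RE = re.compile(r'[\w-]+')
# With re.ASCII, \w is restricted to [a-zA-Z0-9_].
ASCII_RE = re.compile(r'[\w-]+', re.ASCII)

def is_valid_unicode(target_string):
    """Hypothetical helper: whole string must be word characters or hyphens."""
    return UNICODE_RE.fullmatch(target_string) is not None

def is_valid_ascii(target_string):
    """Hypothetical helper: same check, limited to ASCII word characters."""
    return ASCII_RE.fullmatch(target_string) is not None
```

Because `fullmatch` must consume the entire string, no explicit `^` or `$` is needed, and input such as `&&&PATTERN&&PATTERN` is rejected by both helpers.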