What is the type of a compiled regular expression in Python?
In particular, I want to evaluate
isinstance(re.compile(''), ???)
Python 3.5 introduced the typing module. Included therein is typing.Pattern, a _TypeAlias.
Starting with Python 3.6, you can simply do:
import re
from typing import Pattern
my_re = re.compile('foo')
assert isinstance(my_re, Pattern)
In 3.5, there used to be a bug requiring you to do this:
assert issubclass(type(my_re), Pattern)
That workaround isn't guaranteed to work according to the documentation and test suite.
In 3.7 and later you can use re.Pattern:
import re
rr = re.compile("pattern")
isinstance(rr, re.Pattern)  # True
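For code that also has to run on interpreters older than 3.7, one version-independent sketch is to ask the interpreter for the type of a compiled pattern rather than importing it by name. The names PatternType and is_compiled_regex are illustrative only; they are not part of the re module.

```python
import re

# Take the type from an actual compiled pattern, so no version-specific
# name (typing.Pattern, re.Pattern) is needed.
PatternType = type(re.compile(""))

def is_compiled_regex(obj):
    """Return True if obj is a compiled regular expression object."""
    return isinstance(obj, PatternType)
```

With this, is_compiled_regex(re.compile("foo")) is true, while a plain string such as "foo" is rejected.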
Prevention is better than cure. Don't create such a heterogeneous list in the first place. Have a set of allowed strings and a list of compiled regex objects. This should make your checking code look better and run faster:
if input in allowed_strings:
    ignored = False
else:
    for allowed in allowed_regexed_objects:
        if allowed.match(input):
            ignored = False
            break
If you can't avoid the creation of such a list, see if you have the opportunity to examine it once and build the two replacement objects.
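A minimal sketch of that one-time split, assuming the mixed list holds only plain strings and compiled patterns. The helper names split_allowed and is_allowed are hypothetical, chosen for this example.

```python
import re

def split_allowed(mixed):
    """Split a mixed list into a set of strings and a list of compiled patterns."""
    pattern_type = type(re.compile(""))
    allowed_strings = set()
    allowed_patterns = []
    for item in mixed:
        if isinstance(item, pattern_type):
            allowed_patterns.append(item)
        else:
            allowed_strings.add(item)
    return allowed_strings, allowed_patterns

def is_allowed(text, allowed_strings, allowed_patterns):
    """Check the cheap set lookup first, then fall back to the patterns."""
    if text in allowed_strings:
        return True
    return any(p.match(text) for p in allowed_patterns)

# Example: examine the heterogeneous list once, then reuse the two objects.
strings, patterns = split_allowed(["foo", re.compile(r"ba[rz]")])
```

The set lookup handles exact strings in constant time, so the loop over patterns only runs when no literal string matched.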
Disclaimer: This isn't intended as a direct answer for your specific needs, but rather something that may be useful as an alternative approach
You can keep with the ideals of duck typing, and use hasattr to determine if the object has certain properties that you want to utilize. For example, you could do something like:
if hasattr(possibly_a_re_object, "match"):  # Treat it like it's an re object
    possibly_a_re_object.match(thing_to_match_against)
else:
    # alternative handler
    pass
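As a runnable sketch of the same idea, the check can be folded into one helper. The function name matches and the equality fallback for objects without a match method are assumptions made for this example, not something the question specifies.

```python
import re

def matches(candidate, text):
    """Duck-typed check: use .match if the object has one, else compare."""
    matcher = getattr(candidate, "match", None)
    if callable(matcher):
        # Behaves like a compiled regex (or anything else exposing .match).
        return matcher(text) is not None
    # Assumed fallback: treat the candidate as a plain value.
    return candidate == text
```

Python strings have no match attribute, so a str always takes the equality branch.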
As an illustration of polymorphism, an alternate solution is to create wrapper classes which implement a common method.
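import re

class Stringish(str):
    def matches(self, input):
        return self == input

class Regexish:
    # re is a module and cannot be subclassed, so wrap a compiled pattern instead
    def __init__(self, pattern):
        self.regex = re.compile(pattern)
    def matches(self, input):
        return self.regex.match(input)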
Now your code can iterate over a list of alloweds containing objects instantiating either of these two classes completely transparently:
for allowed in alloweds:
    if allowed.matches(input):
        ignored = False
        break
Notice also how some code duplication goes away (though your original code could have been refactored to fix that separately).
This is another "not the answer to the question, but it solves the problem" response. Unless your_string contains regular expression special characters,
if re.fullmatch(your_string, target_string):
has the same effect as
if your_string == target_string:
(Note that re.match only anchors at the start of target_string, so it would also accept longer targets that merely begin with your_string; re.fullmatch requires Python 3.4 or later.) So drop back one step and use uncompiled regular expression patterns in your list of allowed items. This is slower than using precompiled regular expressions, but it will work with only the occasional unexpected outcome, and that only if you allow users to supply the allowed items.
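A short sketch of where a plain string used as a pattern and an equality test can differ. The variable names are made up for illustration; re.match, re.fullmatch and re.escape are standard re functions.

```python
import re

plain = "foo"
# re.match anchors only at the start, so a longer target still matches.
prefix_hit = re.match(plain, "foobar") is not None
# re.fullmatch must consume the whole target, like == for plain text.
full_hit = re.fullmatch(plain, "foobar") is not None

# Special characters change the meaning unless they are escaped.
dotted = "a.c"
loose = re.fullmatch(dotted, "abc") is not None              # "." matches any character
strict = re.fullmatch(re.escape(dotted), "abc") is not None  # "." is now literal
```

Here prefix_hit and loose are true while full_hit and strict are false, which is the kind of unexpected outcome to watch for when users supply the allowed items.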