How do I check if a value matches a type in Python?

半阙折子戏 2020-12-15 09:28

Let's say I have a Python function whose single argument is a non-trivial type:

from typing import List, Dict
ArgType = List[Dict[str, int]]  # this could b         


        
5 Answers
  • 2020-12-15 09:51

    If all you want to do is parse JSON, you should just use pydantic.

    However, I ran into the same problem of wanting to check the type of Python objects, so I wrote a solution that is simpler than those in the other answers and handles at least complex types with nested lists and dictionaries.

    I created a gist with this method at https://gist.github.com/ramraj07/f537bf9f80b4133c65dd76c958d4c461

    Some example uses of this method include:

    from typing import List, Dict, Union, Type, Optional
    
    check_type('a', str)
    check_type({'a': 1}, Dict[str, int])
    check_type([{'a': [1.0]}, 'ten'], List[Union[Dict[str, List[float]], str]])
    check_type(None, Optional[str])
    check_type('abc', Optional[str])
    

    Here's the code below for reference:

    import typing
    
    def check_type(obj: typing.Any, type_to_check: typing.Any, _external=True) -> None:
    
        try:
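            if type_to_check is typing.Any:
                # Any matches everything; on newer Python versions it is a plain
                # class without "_name", so handle it before the base case
                return
            if not hasattr(type_to_check, "_name"):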
                # base-case
                if not isinstance(obj, type_to_check):
                    raise TypeError
                return
            # type_to_check is from typing library
            type_name = type_to_check._name
    
            if type_to_check is typing.Any:
                pass
            elif type_name in ("List", "Tuple"):
                if (type_name == "List" and not isinstance(obj, list)) or (
                    type_name == "Tuple" and not isinstance(obj, tuple)
                ):
                    raise TypeError
    
                element_type = type_to_check.__args__[0]
                for element in obj:
                    check_type(element, element_type, _external=False)
            elif type_name == "Dict":
                if not isinstance(obj, dict):
                    raise TypeError
                if len(type_to_check.__args__) != 2:
                    raise NotImplementedError(
                        "check_type can only accept Dict typing with separate annotations for key and values"
                    )
                key_type, value_type = type_to_check.__args__
                for key, value in obj.items():
                    check_type(key, key_type, _external=False)
                    check_type(value, value_type, _external=False)
            # check __origin__ rather than _name: the _name of a Union differs between Python versions
            elif getattr(type_to_check, "__origin__", None) is typing.Union:
                type_options = type_to_check.__args__
                no_option_matched = True
                for type_option in type_options:
                    try:
                        check_type(obj, type_option, _external=False)
                        no_option_matched = False
                        break
                    except TypeError:
                        pass
                if no_option_matched:
                    raise TypeError
            else:
                raise NotImplementedError(
                    f"check_type method currently does not support checking typing of form '{type_name}'"
                )
    
        except TypeError:
            if _external:
                raise TypeError(
                    f"Object {repr(obj)} is of type {_construct_type_description(obj)} "
                    f"when {type_to_check} was expected"
                )
            raise TypeError()
    
    
    def _construct_type_description(obj) -> str:
        def get_types_in_iterable(iterable) -> str:
            types = {_construct_type_description(element) for element in iterable}
            return types.pop() if len(types) == 1 else f"Union[{','.join(types)}]"
    
        if isinstance(obj, list):
            return f"List[{get_types_in_iterable(obj)}]"
        elif isinstance(obj, dict):
            key_types = get_types_in_iterable(obj.keys())
            val_types = get_types_in_iterable(obj.values())
            return f"Dict[{key_types}, {val_types}]"
        else:
            return type(obj).__name__
    
  • 2020-12-15 09:53

    You would have to check your nested type structure manually; type hints are not enforced.

    Checking like this is best done using ABCs (abstract base classes), so users can provide their own derived classes that support the same access patterns as the default dict/list:

    import collections.abc 
    
    def isCorrectType(data):
        if isinstance(data, collections.abc.Collection): 
            for d in data:
                if isinstance(d,collections.abc.MutableMapping): 
                    for key in d:
                        if isinstance(key,str) and isinstance(d[key],int):
                            pass
                        else:
                            return False
                else: 
                    return False
        else:
            return False
        return True
    

    Output:

    print ( isCorrectType( [ {"a":2} ] ))       # True
    print ( isCorrectType( [ {2:2} ] ))         # False   
    print ( isCorrectType( [ {"a":"a"} ] ))     # False   
    print ( isCorrectType( [ {"a":2},1 ] ))     # False   
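
    To illustrate why the ABC check is more permissive than checking for dict directly, here is a minimal sketch. The class name Scores is hypothetical and only used for illustration; collections.UserDict is a standard-library mapping that does not inherit from dict.

```python
from collections import UserDict
from collections.abc import MutableMapping


class Scores(UserDict):
    """Hypothetical user-defined mapping that does not subclass dict."""


scores = Scores({"a": 2})

# A concrete check against dict rejects the custom mapping ...
print(isinstance(scores, dict))            # False
# ... while the ABC check accepts anything that behaves like a mutable mapping.
print(isinstance(scores, MutableMapping))  # True
```

    A validator written against collections.abc therefore keeps working when callers pass their own mapping classes.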
    

    Documentation:

    • ABC - abstract base classes

    Related:

    • What is duck typing?

    The other way round would be to follow the "Ask forgiveness not permission" - explain paradigm: simply use your data in the form you want and wrap the code in try:/except: in case it does not conform to what you wanted. This fits better with What is duck typing? and allows (similar to ABC checking) the consumer to provide you with classes derived from list/dict while it will still work.

  • 2020-12-15 10:00

    First of all, although I think you are aware of this, for the sake of completeness: the typing library contains types for type hints. These type hints are used by IDEs to check whether your code is somewhat sane, and they also serve as documentation of the types a developer expects.

    To check whether a value is an instance of some type, we have to use the isinstance function. Perhaps surprisingly, this also works with the unparameterized generic types from the typing library, e.g.

    from typing import List
    
    value = []
    isinstance(value, List)
    

    However, for nested structures such as List[Dict[str, int]] we cannot use this directly, because isinstance raises a TypeError for parameterized generics. Instead, you have to:

    1. Check if the initial value is a list
    2. Check if each item of the list is of type dict
    3. Check if each key of each dict is in fact a string and if each value is in fact an int
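
    The three steps above can be sketched as follows. The function name is_list_of_str_int_dicts is made up for this illustration, and the check is hard-coded for List[Dict[str, int]] only.

```python
from typing import Any


def is_list_of_str_int_dicts(value: Any) -> bool:
    """Hypothetical manual check for the shape List[Dict[str, int]]."""
    # Step 1: the outer value must be a list.
    if not isinstance(value, list):
        return False
    for item in value:
        # Step 2: every item of the list must be a dict.
        if not isinstance(item, dict):
            return False
        # Step 3: every key must be a str and every value an int.
        for key, val in item.items():
            if not isinstance(key, str) or not isinstance(val, int):
                return False
    return True


print(is_list_of_str_int_dicts([{"a": 1}]))    # True
print(is_list_of_str_int_dicts([{"a": "1"}]))  # False
print(is_list_of_str_int_dicts({"a": 1}))      # False
```

    Note that bool is a subclass of int in Python, so a value such as True passes the int check in this sketch.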

    Unfortunately, strict checking is a bit cumbersome in Python. However, be aware that Python makes use of duck typing: if it looks like a duck and behaves like a duck, then it is treated as a duck.

  • 2020-12-15 10:03

    Validating a type annotation is a non-trivial task. Python does not do it automatically, and writing your own validator is difficult because the typing module doesn't offer much of a useful interface. (In fact, the internals of the typing module have changed so much since its introduction in Python 3.5 that it's honestly a nightmare to work with.)

    Here's a type validator function taken from one of my personal projects (wall of code warning):

    import inspect
    import typing
    
    __all__ = ['is_instance', 'is_subtype', 'python_type', 'is_generic', 'is_base_generic', 'is_qualified_generic']
    
    
    if hasattr(typing, '_GenericAlias'):
        # python 3.7
        def _is_generic(cls):
            if isinstance(cls, typing._GenericAlias):
                return True
    
            if isinstance(cls, typing._SpecialForm):
                return cls not in {typing.Any}
    
            return False
    
    
        def _is_base_generic(cls):
            if isinstance(cls, typing._GenericAlias):
                if cls.__origin__ in {typing.Generic, typing._Protocol}:
                    return False
    
                if isinstance(cls, typing._VariadicGenericAlias):
                    return True
    
                return len(cls.__parameters__) > 0
    
            if isinstance(cls, typing._SpecialForm):
                return cls._name in {'ClassVar', 'Union', 'Optional'}
    
            return False
    
    
        def _get_base_generic(cls):
            # subclasses of Generic will have their _name set to None, but
            # their __origin__ will point to the base generic
            if cls._name is None:
                return cls.__origin__
            else:
                return getattr(typing, cls._name)
    
    
        def _get_python_type(cls):
            """
            Like `python_type`, but only works with `typing` classes.
            """
            return cls.__origin__
    
    
        def _get_name(cls):
            return cls._name
    else:
        # python <3.7
        if hasattr(typing, '_Union'):
            # python 3.6
            def _is_generic(cls):
                if isinstance(cls, (typing.GenericMeta, typing._Union, typing._Optional, typing._ClassVar)):
                    return True
    
                return False
    
    
            def _is_base_generic(cls):
                if isinstance(cls, (typing.GenericMeta, typing._Union)):
                    return cls.__args__ in {None, ()}
    
                if isinstance(cls, typing._Optional):
                    return True
    
                return False
        else:
            # python 3.5
            def _is_generic(cls):
                if isinstance(cls, (typing.GenericMeta, typing.UnionMeta, typing.OptionalMeta, typing.CallableMeta, typing.TupleMeta)):
                    return True
    
                return False
    
    
            def _is_base_generic(cls):
                if isinstance(cls, typing.GenericMeta):
                    return all(isinstance(arg, typing.TypeVar) for arg in cls.__parameters__)
    
                if isinstance(cls, typing.UnionMeta):
                    return cls.__union_params__ is None
    
                if isinstance(cls, typing.TupleMeta):
                    return cls.__tuple_params__ is None
    
                if isinstance(cls, typing.CallableMeta):
                    return cls.__args__ is None
    
                if isinstance(cls, typing.OptionalMeta):
                    return True
    
                return False
    
    
        def _get_base_generic(cls):
            try:
                return cls.__origin__
            except AttributeError:
                pass
    
            name = type(cls).__name__
            if not name.endswith('Meta'):
                raise NotImplementedError("Cannot determine base of {}".format(cls))
    
            name = name[:-4]
            return getattr(typing, name)
    
    
        def _get_python_type(cls):
            """
            Like `python_type`, but only works with `typing` classes.
            """
            # Many classes actually reference their corresponding abstract base class from the abc module
            # instead of their builtin variant (i.e. typing.List references MutableSequence instead of list).
            # We're interested in the builtin class (if any), so we'll traverse the MRO and look for it there.
            for typ in cls.mro():
                if typ.__module__ == 'builtins' and typ is not object:
                    return typ
    
            try:
                return cls.__extra__
            except AttributeError:
                pass
    
            if is_qualified_generic(cls):
                cls = get_base_generic(cls)
    
            if cls is typing.Tuple:
                return tuple
    
            raise NotImplementedError("Cannot determine python type of {}".format(cls))
    
    
        def _get_name(cls):
            try:
                return cls.__name__
            except AttributeError:
                return type(cls).__name__[1:]
    
    
    if hasattr(typing.List, '__args__'):
        # python 3.6+
        def _get_subtypes(cls):
            subtypes = cls.__args__
    
            if get_base_generic(cls) is typing.Callable:
                if len(subtypes) != 2 or subtypes[0] is not ...:
                    subtypes = (subtypes[:-1], subtypes[-1])
    
            return subtypes
    else:
        # python 3.5
        def _get_subtypes(cls):
            if isinstance(cls, typing.CallableMeta):
                if cls.__args__ is None:
                    return ()
    
                return cls.__args__, cls.__result__
    
            for name in ['__parameters__', '__union_params__', '__tuple_params__']:
                try:
                    subtypes = getattr(cls, name)
                    break
                except AttributeError:
                    pass
            else:
                raise NotImplementedError("Cannot extract subtypes from {}".format(cls))
    
            subtypes = [typ for typ in subtypes if not isinstance(typ, typing.TypeVar)]
            return subtypes
    
    
    def is_generic(cls):
        """
        Detects any kind of generic, for example `List` or `List[int]`. This includes "special" types like
        Union and Tuple - anything that's subscriptable, basically.
        """
        return _is_generic(cls)
    
    
    def is_base_generic(cls):
        """
        Detects generic base classes, for example `List` (but not `List[int]`)
        """
        return _is_base_generic(cls)
    
    
    def is_qualified_generic(cls):
        """
        Detects generics with arguments, for example `List[int]` (but not `List`)
        """
        return is_generic(cls) and not is_base_generic(cls)
    
    
    def get_base_generic(cls):
        if not is_qualified_generic(cls):
            raise TypeError('{} is not a qualified Generic and thus has no base'.format(cls))
    
        return _get_base_generic(cls)
    
    
    def get_subtypes(cls):
        return _get_subtypes(cls)
    
    
    def _instancecheck_iterable(iterable, type_args):
        if len(type_args) != 1:
            raise TypeError("Generic iterables must have exactly 1 type argument; found {}".format(type_args))
    
        type_ = type_args[0]
        return all(is_instance(val, type_) for val in iterable)
    
    
    def _instancecheck_mapping(mapping, type_args):
        return _instancecheck_itemsview(mapping.items(), type_args)
    
    
    def _instancecheck_itemsview(itemsview, type_args):
        if len(type_args) != 2:
            raise TypeError("Generic mappings must have exactly 2 type arguments; found {}".format(type_args))
    
        key_type, value_type = type_args
        return all(is_instance(key, key_type) and is_instance(val, value_type) for key, val in itemsview)
    
    
    def _instancecheck_tuple(tup, type_args):
        if len(tup) != len(type_args):
            return False
    
        return all(is_instance(val, type_) for val, type_ in zip(tup, type_args))
    
    
    _ORIGIN_TYPE_CHECKERS = {}
    for class_path, check_func in {
                            # iterables
                            'typing.Container': _instancecheck_iterable,
                            'typing.Collection': _instancecheck_iterable,
                            'typing.AbstractSet': _instancecheck_iterable,
                            'typing.MutableSet': _instancecheck_iterable,
                            'typing.Sequence': _instancecheck_iterable,
                            'typing.MutableSequence': _instancecheck_iterable,
                            'typing.ByteString': _instancecheck_iterable,
                            'typing.Deque': _instancecheck_iterable,
                            'typing.List': _instancecheck_iterable,
                            'typing.Set': _instancecheck_iterable,
                            'typing.FrozenSet': _instancecheck_iterable,
                            'typing.KeysView': _instancecheck_iterable,
                            'typing.ValuesView': _instancecheck_iterable,
                            'typing.AsyncIterable': _instancecheck_iterable,
    
                            # mappings
                            'typing.Mapping': _instancecheck_mapping,
                            'typing.MutableMapping': _instancecheck_mapping,
                            'typing.MappingView': _instancecheck_mapping,
                            'typing.ItemsView': _instancecheck_itemsview,
                            'typing.Dict': _instancecheck_mapping,
                            'typing.DefaultDict': _instancecheck_mapping,
                            'typing.Counter': _instancecheck_mapping,
                            'typing.ChainMap': _instancecheck_mapping,
    
                            # other
                            'typing.Tuple': _instancecheck_tuple,
                        }.items():
        try:
            cls = eval(class_path)
        except AttributeError:
            continue
    
        _ORIGIN_TYPE_CHECKERS[cls] = check_func
    
    
    def _instancecheck_callable(value, type_):
        if not callable(value):
            return False
    
        if is_base_generic(type_):
            return True
    
        param_types, ret_type = get_subtypes(type_)
        sig = inspect.signature(value)
    
        missing_annotations = []
    
        if param_types is not ...:
            if len(param_types) != len(sig.parameters):
                return False
    
            # FIXME: add support for TypeVars
    
            # if any of the existing annotations don't match the type, we'll return False.
            # Then, if any annotations are missing, we'll throw an exception.
            for param, expected_type in zip(sig.parameters.values(), param_types):
                param_type = param.annotation
                if param_type is inspect.Parameter.empty:
                    missing_annotations.append(param)
                    continue
    
                if not is_subtype(param_type, expected_type):
                    return False
    
        if sig.return_annotation is inspect.Signature.empty:
            missing_annotations.append('return')
        else:
            if not is_subtype(sig.return_annotation, ret_type):
                return False
    
        if missing_annotations:
            raise ValueError("Missing annotations: {}".format(missing_annotations))
    
        return True
    
    
    def _instancecheck_union(value, type_):
        types = get_subtypes(type_)
        return any(is_instance(value, typ) for typ in types)
    
    
    def _instancecheck_type(value, type_):
        # if it's not a class, return False
        if not isinstance(value, type):
            return False
    
        if is_base_generic(type_):
            return True
    
        type_args = get_subtypes(type_)
        if len(type_args) != 1:
            raise TypeError("Type must have exactly 1 type argument; found {}".format(type_args))
    
        return is_subtype(value, type_args[0])
    
    
    _SPECIAL_INSTANCE_CHECKERS = {
        'Union': _instancecheck_union,
        'Callable': _instancecheck_callable,
        'Type': _instancecheck_type,
        'Any': lambda v, t: True,
    }
    
    
    def is_instance(obj, type_):
        if type_.__module__ == 'typing':
            if is_qualified_generic(type_):
                base_generic = get_base_generic(type_)
            else:
                base_generic = type_
            name = _get_name(base_generic)
    
            try:
                validator = _SPECIAL_INSTANCE_CHECKERS[name]
            except KeyError:
                pass
            else:
                return validator(obj, type_)
    
        if is_base_generic(type_):
            python_type = _get_python_type(type_)
            return isinstance(obj, python_type)
    
        if is_qualified_generic(type_):
            python_type = _get_python_type(type_)
            if not isinstance(obj, python_type):
                return False
    
            base = get_base_generic(type_)
            try:
                validator = _ORIGIN_TYPE_CHECKERS[base]
            except KeyError:
                raise NotImplementedError("Cannot perform isinstance check for type {}".format(type_))
    
            type_args = get_subtypes(type_)
            return validator(obj, type_args)
    
        return isinstance(obj, type_)
    
    
    def is_subtype(sub_type, super_type):
        if not is_generic(sub_type):
            python_super = python_type(super_type)
            return issubclass(sub_type, python_super)
    
        # at this point we know `sub_type` is a generic
        python_sub = python_type(sub_type)
        python_super = python_type(super_type)
        if not issubclass(python_sub, python_super):
            return False
    
        # at this point we know that `sub_type`'s base type is a subtype of `super_type`'s base type.
        # If `super_type` isn't qualified, then there's nothing more to do.
        if not is_generic(super_type) or is_base_generic(super_type):
            return True
    
        # at this point we know that `super_type` is a qualified generic... so if `sub_type` isn't
        # qualified, it can't be a subtype.
        if is_base_generic(sub_type):
            return False
    
        # at this point we know that both types are qualified generics, so we just have to
        # compare their sub-types.
        sub_args = get_subtypes(sub_type)
        super_args = get_subtypes(super_type)
        return all(is_subtype(sub_arg, super_arg) for sub_arg, super_arg in zip(sub_args, super_args))
    
    
    def python_type(annotation):
        """
        Given a type annotation or a class as input, returns the corresponding python class.
    
        Examples:
    
        ::
            >>> python_type(typing.Dict)
            <class 'dict'>
            >>> python_type(typing.List[int])
            <class 'list'>
            >>> python_type(int)
            <class 'int'>
        """
        try:
            mro = annotation.mro()
        except AttributeError:
            # if it doesn't have an mro method, it must be a weird typing object
            return _get_python_type(annotation)
    
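        # NOTE: `Type` is not defined in this snippet; presumably it is a class
        # from the author's project, so this branch raises a NameError as written.
        if Type in mro:
            return annotation.python_type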
        elif annotation.__module__ == 'typing':
            return _get_python_type(annotation)
        else:
            return annotation
    

    Demonstration:

    >>> is_instance([{'x': 3}], List[Dict[str, int]])
    True
    >>> is_instance([{'x': 3}, {'y': 7.5}], List[Dict[str, int]])
    False
    

    (As far as I'm aware, this supports all python versions, even the ones <3.5 using the typing module backport.)
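
    For comparison, on Python 3.8 and later the typing module exposes the public helpers typing.get_origin and typing.get_args, which avoid relying on private internals. The following is only a minimal sketch, not the validator above: the function name matches is hypothetical, and it covers just Any, Union/Optional, list, dict and tuple.

```python
from typing import Any, Dict, List, Optional, Tuple, Union, get_args, get_origin


def matches(value: Any, annotation: Any) -> bool:
    """Hypothetical helper: True if `value` conforms to `annotation`."""
    if annotation is Any:
        return True
    origin = get_origin(annotation)
    if origin is None:
        # Plain class such as int, str or type(None).
        return isinstance(value, annotation)
    args = get_args(annotation)
    if origin is Union:
        # Optional[X] is Union[X, None], so it is covered here as well.
        return any(matches(value, arg) for arg in args)
    if not isinstance(value, origin):
        return False
    if not args:
        # Unparameterized generic such as typing.List: nothing more to check.
        return True
    if origin is list:
        return all(matches(item, args[0]) for item in value)
    if origin is dict:
        key_type, value_type = args
        return all(
            matches(k, key_type) and matches(v, value_type)
            for k, v in value.items()
        )
    if origin is tuple:
        if len(args) == 2 and args[1] is Ellipsis:
            # Tuple[X, ...]: homogeneous tuple of any length.
            return all(matches(item, args[0]) for item in value)
        return len(value) == len(args) and all(
            matches(item, arg) for item, arg in zip(value, args)
        )
    raise NotImplementedError(f"Unsupported annotation: {annotation!r}")


print(matches([{"x": 3}], List[Dict[str, int]]))              # True
print(matches([{"x": 3}, {"y": 7.5}], List[Dict[str, int]]))  # False
print(matches(None, Optional[str]))                           # True
print(matches((1, "a"), Tuple[int, str]))                     # True
```

    Anything outside the handled origins raises NotImplementedError rather than silently passing, which keeps the limits of the sketch visible.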

  • 2020-12-15 10:08

    The common way to handle this is by making use of the fact that if whatever object you pass to myfun doesn't have the required functionality a corresponding exception will be raised (usually TypeError or AttributeError). So you would do the following:

    try:
        myfun(data)
    except (TypeError, AttributeError) as err:
        # Fallback for invalid types here.
        pass  # placeholder so the block is valid Python
    

    You indicate in your question that you would raise a TypeError if the passed object does not have the appropriate structure but Python does this already for you. The critical question is how you would handle this case. You could also move the try / except block into myfun, if appropriate. When it comes to typing in Python you usually rely on duck typing: if the object has the required functionality then you don't care much about what type it is, as long as it serves the purpose.

    Consider the following example. We just pass the data into the function and then get the AttributeError for free (which we can then catch); no need for manual type checking:

    >>> import json
    >>> def myfun(data):
    ...     for x in data:
    ...             print(x.items())
    ... 
    >>> data = json.loads('[[["a", 1], ["b", 2]], [["c", 3], ["d", 4]]]')
    >>> myfun(data)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 3, in myfun
    AttributeError: 'list' object has no attribute 'items'
    

    In case you are concerned about the usefulness of the resulting error, you could still catch it and then re-raise a custom exception (or even change the exception's message):

    try:
        myfun(data)
    except (TypeError, AttributeError) as err:
        raise TypeError('Data has incorrect structure') from err
    
    try:
        myfun(data)
    except (TypeError, AttributeError) as err:
        err.args = ('Data has incorrect structure',)
        raise
    

    When using third-party code one should always check the documentation for exceptions that will be raised. For example numpy.inner reports that it will raise a ValueError under certain circumstances. When using that function we don't need to perform any checks ourselves but can rely on it raising the error if needed. When using third-party code for which it is not clear how it will behave in some corner cases, in my opinion it is easier and clearer to just hardcode a corresponding type checker (see below) instead of using a generic solution that works for any type. These cases should be rare anyway, and leaving a corresponding comment makes your fellow developers aware of the situation.

    The typing library is for type-hinting and as such it won't be checking the types at runtime. Sure you could do this manually but it is rather cumbersome:

    def type_checker(data):
        return (
            isinstance(data, list)
            and all(isinstance(x, dict) for x in data)
            and all(isinstance(k, str) and isinstance(v, int) for x in data for k, v in x.items())
        )
    

    This together with an appropriate comment is still an acceptable solution and it is reusable where a similar data structure is expected. The intent is clear and the code is easily verifiable.
