Question
Python 3.7 is around the corner, and I wanted to test some of the fancy new dataclass + typing features. Getting hints to work right is easy enough, with both native types and those from the typing module:
>>> import dataclasses
>>> import typing as ty
>>>
>>> @dataclasses.dataclass
... class Structure:
...     a_str: str
...     a_str_list: ty.List[str]
...
>>> my_struct = Structure(a_str='test', a_str_list=['t', 'e', 's', 't'])
>>> my_struct.a_str_list[0].  # IDE suggests all the string methods :)
But one other thing that I wanted to try was enforcing the type hints as conditions at runtime, i.e. it should not be possible for a dataclass with incorrect types to exist. It can be implemented nicely with __post_init__:
>>> @dataclasses.dataclass
... class Structure:
...     a_str: str
...     a_str_list: ty.List[str]
...
...     def validate(self):
...         ret = True
...         for field_name, field_def in self.__dataclass_fields__.items():
...             actual_type = type(getattr(self, field_name))
...             if actual_type != field_def.type:
...                 print(f"\t{field_name}: '{actual_type}' instead of '{field_def.type}'")
...                 ret = False
...         return ret
...
...     def __post_init__(self):
...         if not self.validate():
...             raise ValueError('Wrong types')
This kind of validate function works for native types and custom classes, but not those specified by the typing module:
>>> my_struct = Structure(a_str='test', a_str_list=['t', 'e', 's', 't'])
Traceback (most recent call last):
  a_str_list: '<class 'list'>' instead of 'typing.List[str]'
ValueError: Wrong types
Is there a better approach to validate an untyped list with a typing-typed one? Preferably one that doesn't include checking the types of all elements in any list, dict, tuple, or set that is a dataclass' attribute.
Answer 1:
Instead of checking for type equality, you should use isinstance. But you cannot use a parametrized generic type (typing.List[int]) to do so; you must use the "generic" version (typing.List). So you will be able to check for the container type but not the contained types. Parametrized generic types define an __origin__ attribute that you can use for that.
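As a quick illustration of the point above, here is a minimal sketch, assuming Python 3.7 or later; the variable names (value, rejected, bare_ok, origin_ok) are only for illustration:

```python
import typing

value = ['t', 'e', 's', 't']

# A parametrized generic cannot be used with isinstance(): it raises TypeError.
try:
    isinstance(value, typing.List[str])
    rejected = False
except TypeError:
    rejected = True

# The bare generic is accepted and checks the container type only.
bare_ok = isinstance(value, typing.List)

# On Python 3.7+, __origin__ of the parametrized hint is the plain list class.
origin_ok = isinstance(value, typing.List[str].__origin__)

print(rejected, bare_ok, origin_ok)
```

Only the container is checked here: a list of integers would pass the last two checks just as well.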
Contrary to Python 3.6, in Python 3.7 most type hints have a useful __origin__ attribute. Compare:
# Python 3.6
>>> import typing
>>> typing.List.__origin__
>>> typing.List[int].__origin__
typing.List
and
# Python 3.7
>>> import typing
>>> typing.List.__origin__
<class 'list'>
>>> typing.List[int].__origin__
<class 'list'>
Python 3.8 introduces even better support with the typing.get_origin() introspection function:
# Python 3.8
>>> import typing
>>> typing.get_origin(typing.List)
<class 'list'>
>>> typing.get_origin(typing.List[int])
<class 'list'>
Notable exceptions are typing.Any, typing.Union and typing.ClassVar… Well, anything that is a typing._SpecialForm does not define __origin__. Fortunately:
>>> isinstance(typing.Union, typing._SpecialForm)
True
>>> isinstance(typing.Union[int, str], typing._SpecialForm)
False
>>> typing.get_origin(typing.Union[int, str])
typing.Union
But parametrized types define an __args__ attribute that stores their parameters as a tuple; Python 3.8 introduces the typing.get_args() function to retrieve them:
# Python 3.7
>>> typing.Union[int, str].__args__
(<class 'int'>, <class 'str'>)
# Python 3.8
>>> typing.get_args(typing.Union[int, str])
(<class 'int'>, <class 'str'>)
So we can improve type checking a bit:
for field_name, field_def in self.__dataclass_fields__.items():
    if isinstance(field_def.type, typing._SpecialForm):
        # No check for typing.Any, typing.Union, typing.ClassVar (without parameters)
        continue
    try:
        actual_type = field_def.type.__origin__
    except AttributeError:
        # In case of non-typing types (such as <class 'int'>, for instance)
        actual_type = field_def.type
    # In Python 3.8 one would replace the try/except with
    # actual_type = typing.get_origin(field_def.type) or field_def.type
    if isinstance(actual_type, typing._SpecialForm):
        # case of typing.Union[…] or typing.ClassVar[…]
        actual_type = field_def.type.__args__

    actual_value = getattr(self, field_name)
    if not isinstance(actual_value, actual_type):
        print(f"\t{field_name}: '{type(actual_value)}' instead of '{field_def.type}'")
        ret = False
This is not perfect as it won't account for typing.ClassVar[typing.Union[int, str]] or typing.Optional[typing.List[int]] for instance, but it should get things started.
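The two nested cases just mentioned could be handled by unwrapping the hint recursively. The following is only a sketch, assuming Python 3.8+ for typing.get_origin and typing.get_args; runtime_types is a hypothetical helper name, not part of the typing module. Hints that cannot be mapped to a class fall back to object, which means they are effectively not checked.

```python
import typing


def runtime_types(hint):
    """Flatten a type hint into a tuple of classes usable with isinstance().

    Hypothetical helper: only container types are kept, never element types.
    """
    if hint is typing.Any:
        return (object,)
    origin = typing.get_origin(hint)
    if origin is None:
        # Plain class (int, str, a custom class) or an unparametrized special form
        return (hint,) if isinstance(hint, type) else (object,)
    if origin is typing.Union:
        # Union[...] and Optional[...]: collect the classes of every alternative
        flattened = []
        for arg in typing.get_args(hint):
            flattened.extend(runtime_types(arg))
        return tuple(flattened)
    if origin is typing.ClassVar:
        # ClassVar[X]: check against X
        (inner,) = typing.get_args(hint)
        return runtime_types(inner)
    # List[int] -> list, Dict[str, int] -> dict, and so on
    return (origin,) if isinstance(origin, type) else (object,)


print(runtime_types(typing.Optional[typing.List[int]]))
print(runtime_types(typing.ClassVar[typing.Union[int, str]]))
```

The returned tuple can be passed to isinstance in place of actual_type in the loop above.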
Next is the way to apply this check.
Instead of using __post_init__, I would go the decorator route: this could be used on anything with type hints, not only dataclasses:
import inspect
import typing
from contextlib import suppress
from functools import wraps


def enforce_types(callable):
    spec = inspect.getfullargspec(callable)

    def check_types(*args, **kwargs):
        parameters = dict(zip(spec.args, args))
        parameters.update(kwargs)
        for name, value in parameters.items():
            with suppress(KeyError):  # Assume un-annotated parameters can be any type
                type_hint = spec.annotations[name]
                if isinstance(type_hint, typing._SpecialForm):
                    # No check for typing.Any, typing.Union, typing.ClassVar (without parameters)
                    continue
                try:
                    actual_type = type_hint.__origin__
                except AttributeError:
                    # In case of non-typing types (such as <class 'int'>, for instance)
                    actual_type = type_hint
                # In Python 3.8 one would replace the try/except with
                # actual_type = typing.get_origin(type_hint) or type_hint
                if isinstance(actual_type, typing._SpecialForm):
                    # case of typing.Union[…] or typing.ClassVar[…]
                    actual_type = type_hint.__args__

                if not isinstance(value, actual_type):
                    raise TypeError('Unexpected type for \'{}\' (expected {} but found {})'.format(name, type_hint, type(value)))

    def decorate(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            check_types(*args, **kwargs)
            return func(*args, **kwargs)
        return wrapper

    if inspect.isclass(callable):
        callable.__init__ = decorate(callable.__init__)
        return callable

    return decorate(callable)
Usage being:
@enforce_types
@dataclasses.dataclass
class Point:
    x: float
    y: float

@enforce_types
def foo(bar: typing.Union[int, str]):
    pass
Apart from validating only some type hints, as noted in the previous section, this approach still has some drawbacks:
- type hints using strings (class Foo: def __init__(self: 'Foo'): pass) are not taken into account by inspect.getfullargspec: you may want to use typing.get_type_hints and inspect.signature instead;
- a default value which is not the appropriate type is not validated:
    @enforce_types
    def foo(bar: int = None):
        pass
    foo()
  does not raise any TypeError. You may want to use inspect.Signature.bind in conjunction with inspect.BoundArguments.apply_defaults if you want to account for that (thus forcing you to define def foo(bar: typing.Optional[int] = None));
- a variable number of arguments can't be validated, as you would have to define something like def foo(*args: typing.Sequence, **kwargs: typing.Mapping) and, as said at the beginning, we can only validate containers and not contained objects.
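The first two drawbacks could be addressed along the lines suggested above, with typing.get_type_hints, inspect.signature, Signature.bind and BoundArguments.apply_defaults. This is a rough sketch for plain functions only, assuming Python 3.8+; enforce_types_strict and _expected_classes are hypothetical names, and hints that cannot be reduced to classes are simply skipped.

```python
import inspect
import typing
from functools import wraps


def _expected_classes(hint):
    """Return a tuple of classes for isinstance(), or None if not checkable."""
    origin = typing.get_origin(hint)
    if origin is typing.Union:
        candidates = [typing.get_origin(arg) or arg for arg in typing.get_args(hint)]
    else:
        candidates = [origin or hint]
    if typing.Any in candidates or not all(isinstance(c, type) for c in candidates):
        return None
    return tuple(candidates)


def enforce_types_strict(func):
    """Hypothetical variant: resolves string hints and also checks defaults."""
    signature = inspect.signature(func)
    hints = typing.get_type_hints(func)  # resolves hints written as strings
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()  # default values are validated too
        for name, value in bound.arguments.items():
            if name not in hints or signature.parameters[name].kind in variadic:
                continue  # un-annotated, or *args / **kwargs: not checked
            expected = _expected_classes(hints[name])
            if expected is not None and not isinstance(value, expected):
                raise TypeError(
                    f"Unexpected type for {name!r} "
                    f"(expected {hints[name]} but found {type(value)})"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_types_strict
def foo(bar: int = "oops"):  # the default has the wrong type on purpose
    return bar


@enforce_types_strict
def greet(name: 'str'):  # hint given as a string
    return name
```

With this sketch, calling foo() without arguments raises TypeError because the default itself is checked, and greet(1) raises TypeError because the string hint is resolved to str before the check.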
Thanks to @Aran-Fey, who helped me improve this answer.
Answer 2:
Just found this question.
pydantic can do full type validation for dataclasses out of the box. (admission: I built pydantic)
Just use pydantic's version of the decorator, the resulting dataclass is completely vanilla.
from datetime import datetime
from pydantic.dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str = 'John Doe'
    signup_ts: datetime = None

print(User(id=42, signup_ts='2032-06-21T12:00'))
"""
User(id=42, name='John Doe', signup_ts=datetime.datetime(2032, 6, 21, 12, 0))
"""
User(id='not int', signup_ts='2032-06-21T12:00')
The last line will give:
...
pydantic.error_wrappers.ValidationError: 1 validation error
id
value is not a valid integer (type=type_error.integer)
Source: https://stackoverflow.com/questions/50563546/validating-detailed-types-in-python-dataclasses