Type hints: Is it a bad practice to alias primitive data types?

ぐ巨炮叔叔 submitted on 2020-05-12 19:51:54

Question


The Python documentation for typing and type hints gives the following example:

from typing import List

Vector = List[float]

def scale(scalar: float, vector: Vector) -> Vector:
    return [scalar * num for num in vector]

The Vector type alias clearly shows that type aliases are useful for simplifying complex type signatures.

However, what about aliasing primitive data types?

Let's contrast two basic examples of function signatures:

URL = str

def process_url(url: URL) -> URL:
    pass

vs.

def process_url(url: str) -> str:
    pass

The version with the type alias URL for the primitive type str is:

  • self-documenting (among other things, I can now skip documenting the return value, as it should clearly be a URL),
  • resistant to changes in the type's implementation (I can switch URL to a Dict or namedtuple later on without changing function signatures).

The problem is that I cannot find anyone else following such a practice. I am simply afraid that I am unintentionally abusing type hints to implement my own ideas instead of following their intended purpose.


Answer 1:


Using an alias to mark the meaning of a value can be misleading and dangerous. A NewType should be used instead.

Recall that the use of a type alias declares two types to be equivalent to one another. Doing Alias = Original will make the static type checker treat Alias as being exactly equivalent to Original in all cases. This is useful when you want to simplify complex type signatures.

Simple aliasing works both ways: any List[float] is a Vector, and any str is a URL, which is usually not correct. A URL is a special kind of str, and not just any str can take its place. The alias URL = str is too strong a statement of equality, as it cannot express this distinction. In fact, any inspection that does not look at the source code cannot see the distinction:

In [1]: URL = str
In [2]: def foo(bar: URL):
   ...:     pass
   ...:
In [3]: foo?
Signature: foo(bar: str)

Consider that you alias Celsius = float in one module, and Fahrenheit = float in another. This signals that it is valid to use Celsius as Fahrenheit, which is wrong.
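A minimal sketch of that mix-up, assuming a hypothetical to_fahrenheit helper that is not part of the original answer. Both aliases are defined in one file here for brevity; the point is only that nothing distinguishes them:

```python
# Both aliases are just other names for float
# (in practice they might live in separate modules).
Celsius = float
Fahrenheit = float


def to_fahrenheit(temp: Celsius) -> Fahrenheit:
    """Convert a Celsius reading to Fahrenheit (hypothetical helper)."""
    return temp * 9 / 5 + 32


boiling: Fahrenheit = 212.0

# Passing a Fahrenheit value where Celsius is expected is accepted by a
# static type checker, because both names mean exactly float.
wrong = to_fahrenheit(boiling)

# At runtime the two aliases are the very same object.
print(Celsius is Fahrenheit)  # True
```

The call that produces `wrong` is a logic error, yet neither the type checker nor the interpreter has any way to notice it.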

Unless your types carry a distinct meaning, you should just take a url: str. The name signifies the meaning; the type signifies the valid values. That means your type should be suitable for separating valid from invalid values!

Use aliases to shorten your hints, but use NewType to refine them.

from typing import List, NewType

Vector = List[float]        # alias shortens
URL = NewType("URL", str)   # new type separates
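As a rough illustration of how such a NewType might be used, here is a sketch in which the body of process_url (stripping whitespace) and the example.com address are assumptions made up for the example:

```python
from typing import NewType

# Distinct type for the static checker, plain str at runtime.
URL = NewType("URL", str)


def process_url(url: URL) -> URL:
    """Hypothetical processing step: strip surrounding whitespace."""
    return URL(url.strip())


# The explicit conversion marks this value as a URL.
homepage = URL(" https://example.com ")
cleaned = process_url(homepage)

# A type checker such as mypy would reject the call below, because a plain
# str is not a URL. At runtime it would still run, since NewType adds no
# checks of its own.
# process_url("not marked as a URL")

print(cleaned)
print(type(cleaned) is str)
```

Because URL(...) simply returns its argument at runtime, the separation exists only for the static type checker and costs nothing when the program runs.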



Answer 2:


I am not sure whether this question is opinion-based, but I have a feeling the general agreement would be that this is a good idea. You state the benefits yourself, not to mention the ability to generalize code.

I would venture that this is not common practice in Python because the language itself is not very restrictive. In addition, the variable is already called url, which is pretty self-explanatory. You could argue that you might have something called json_response or the like that you expect to be a URL, and your method would certainly make that clear. But since Python encourages duck typing, the way the code is used often gives this hint anyway, and type aliasing would just be extra safety for an inconsiderate user. It really comes down to common practice, with no good "do that!" explanation.

A final point: type aliasing is, in a sense, the most primitive version of object-oriented programming. You are making clear what properties you expect of this object; in this case, the string should be a valid URL.




Answer 3:


I guess the question to ask oneself is "what is the purpose?".

I strongly believe that in Python readability is all that matters. With this in mind, type hinting, even for primitives, is quite OK. It is even better if the type is masked by a virtual "enum"-like type that does some self-documenting.

That being said, personally I'd go with the first:

URL = str

def process_url(url: URL) -> URL:
    pass




Answer 4:


I don't know what the general perception is, but I consider it good practice for things that repeat often, as it gives you a single place to define what is meant.

Regarding repetition, suppose you have a lot of functions like

def foo(url: str):
    """
    :param url: explaining url
    """

You'd end up explaining url at each of these functions, so instead you can do

def foo(x: Url):
    pass

The trouble with a type alias is that you can't document it, so I've arrived at the following:

import typing

class _Url(str):
    """
    Here you can document the type
    """

Url = typing.Union[_Url, str]

This gets you

  1. the behavior of type alias from the call site point of view (no need to cast it)

  2. while allowing you to express the value meaning in type and

  3. being able to document the type itself

The only downside is that it's not immediately obvious what the union means, but it's technically correct, and I think it's the best that can be done at the moment.



Source: https://stackoverflow.com/questions/52504347/type-hints-is-it-a-bad-practice-to-alias-primitive-data-types
