Why does splatting create a tuple on the rhs but a list on the lhs?

Posted by 流过昼夜 on 2019-12-03 10:22:27

Question


Consider, for example,

squares = *map((2).__rpow__, range(5)),
squares
# (0, 1, 4, 9, 16)

*squares, = map((2).__rpow__, range(5))
squares
# [0, 1, 4, 9, 16]

So, all else being equal we get a list when splatting on the lhs and a tuple when splatting on the rhs.

Why?

Is this by design, and if yes, what's the rationale? Or, if not, are there any technical reasons? Or is this just how it is, no particular reason?


Answer 1:


The fact that you get a tuple on the RHS has nothing to do with the splat. The splat just unpacks your map iterator. What you unpack it into is decided by the fact that you've used tuple syntax:

*whatever,

instead of list syntax:

[*whatever]

or set syntax:

{*whatever}

You could have gotten a list or a set. You just told Python to make a tuple.
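As a quick illustration of the point above, the same starred expression can be placed in each kind of display, and the enclosing syntax alone picks the container. This is only a sketch; the variable names are made up for the example.

```python
# The same starred iterable inside three different displays.
source = [3, 1, 3]

as_tuple = *source,    # bare expression list with a comma -> tuple
as_list = [*source]    # list display -> list
as_set = {*source}     # set display -> set (duplicates collapse)

for value in (as_tuple, as_list, as_set):
    print(type(value).__name__, value)
```

The splat behaves identically in all three lines; only the surrounding syntax differs.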


On the LHS, a splatted assignment target always produces a list. It doesn't matter whether you use "tuple-style"

*target, = whatever

or "list-style"

[*target] = whatever

syntax for the target list. The syntax looks a lot like the syntax for creating a list or tuple, but target list syntax is an entirely different thing.

The syntax you're using on the left was introduced in PEP 3132, to support use cases like

first, *rest = iterable

In an unpacking assignment, elements of an iterable are assigned to unstarred targets by position, and if there's a starred target, any extras are stuffed into a list and assigned to that target. A list was chosen instead of a tuple to make further processing easier. Since you have only a starred target in your example, all items go in the "extras" list assigned to that target.
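A small sketch of that rule, assuming a throwaway generator named numbers (not part of the original answer): whatever the target list looks like, the starred target ends up bound to a list, even an empty one.

```python
def numbers():
    # Hypothetical helper: an iterator with no length known up front.
    yield from range(5)

first, *rest = numbers()          # one positional target, extras go to rest
*everything, = numbers()          # "tuple-style" target list, only a star
[*also_everything] = numbers()    # "list-style" target list, same result
only, *nothing_left = [42]        # no extras: the starred target is []

print(first, rest)
print(everything, also_everything)
print(only, nothing_left)
```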




Answer 2:


This is noted in the Disadvantages section of PEP 448:

Whilst *elements, = iterable causes elements to be a list, elements = *iterable, causes elements to be a tuple. The reason for this may confuse people unfamiliar with the construct.

Also, per the PEP 3132 specification:

This PEP proposes a change to iterable unpacking syntax, allowing to specify a "catch-all" name which will be assigned a list of all items not assigned to a "regular" name.

It is also mentioned in the Python 3 reference section on expression lists:

Except when part of a list or set display, an expression list containing at least one comma yields a tuple.
The trailing comma is required only to create a single tuple (a.k.a. a singleton); it is optional in all other cases. A single expression without a trailing comma doesn’t create a tuple, but rather yields the value of that expression. (To create an empty tuple, use an empty pair of parentheses: ().)
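The quoted rule can be checked directly. The following is a minimal sketch with illustrative names: the comma, not the parentheses, is what makes the tuple.

```python
single = 1,          # trailing comma -> one-element tuple
not_a_tuple = (1)    # parentheses alone just group; this is the int 1
pair = 1, 2          # a comma between expressions -> tuple, no parentheses needed
empty = ()           # the one case where parentheses alone make a tuple

print(type(single).__name__, type(not_a_tuple).__name__)
print(type(pair).__name__, type(empty).__name__)
```

This is why elements = *iterable, yields a tuple: it is an expression list that contains a comma.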

This might also be seen in a simpler example here, where elements is a list

In [27]: *elements, = range(6)

In [28]: elements
Out[28]: [0, 1, 2, 3, 4, 5]

and here, where elements is a tuple

In [13]: elements = *range(6),

In [14]: elements
Out[14]: (0, 1, 2, 3, 4, 5)

From what I could understand from the comments and the other answers:

  • The first behaviour (a tuple on the RHS) keeps in line with the existing arbitrary argument lists used in functions, i.e. *args.

  • The second behaviour (a list on the LHS) lets you keep working with the variable further down in the code, so making it a list, a mutable value, makes more sense than a tuple.




Answer 3:


There is an indication of the reason why at the end of PEP 3132 -- Extended Iterable Unpacking:

Acceptance

After a short discussion on the python-3000 list [1], the PEP was accepted by Guido in its current form. Possible changes discussed were:

[...]

Make the starred target a tuple instead of a list. This would be consistent with a function's *args, but make further processing of the result harder.

[1] https://mail.python.org/pipermail/python-3000/2007-May/007198.html

So, the advantage of having a mutable list instead of an immutable tuple seems to be the reason.
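To make "further processing" concrete, here is a hedged sketch with an invented record layout (the field names and values are assumptions for illustration only). Because the starred target is a list, it can be modified in place right after unpacking.

```python
# Hypothetical variable-length record: name, age, then any number of roles.
record = ("alice", 30, "staff", "admin", "beta")

name, age, *roles = record

# In-place edits work because roles is a list; a tuple would need rebuilding.
roles.append("audit")
roles.sort()

print(name, age, roles)
```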




Answer 4:


Not a complete answer, but disassembling gives some clues:

from dis import dis

def a():
    squares = (*map((2).__rpow__, range(5)),)
    # print(squares)

dis(a)  # dis() prints the listing itself and returns None

disassembles as

  5           0 LOAD_GLOBAL              0 (map)
              2 LOAD_CONST               1 (2)
              4 LOAD_ATTR                1 (__rpow__)
              6 LOAD_GLOBAL              2 (range)
              8 LOAD_CONST               2 (5)
             10 CALL_FUNCTION            1
             12 CALL_FUNCTION            2
             14 BUILD_TUPLE_UNPACK       1
             16 STORE_FAST               0 (squares)
             18 LOAD_CONST               0 (None)
             20 RETURN_VALUE

while

def b():
    *squares, = map((2).__rpow__, range(5))
dis(b)  # dis() prints the listing itself and returns None

results in

 11           0 LOAD_GLOBAL              0 (map)
              2 LOAD_CONST               1 (2)
              4 LOAD_ATTR                1 (__rpow__)
              6 LOAD_GLOBAL              2 (range)
              8 LOAD_CONST               2 (5)
             10 CALL_FUNCTION            1
             12 CALL_FUNCTION            2
             14 UNPACK_EX                0
             16 STORE_FAST               0 (squares)
             18 LOAD_CONST               0 (None)
             20 RETURN_VALUE

The documentation on UNPACK_EX states:

UNPACK_EX(counts)

Implements assignment with a starred target: Unpacks an iterable in TOS into individual values, where the total number of values can be smaller than the number of items in the iterable: one of the new values will be a list of all leftover items.

The low byte of counts is the number of values before the list value, the high byte of counts the number of values after it. The resulting values are put onto the stack right-to-left.

(Emphasis mine, on "a list of all leftover items".) BUILD_TUPLE_UNPACK, on the other hand, returns a tuple:

BUILD_TUPLE_UNPACK(count)

Pops count iterables from the stack, joins them in a single tuple, and pushes the result. Implements iterable unpacking in tuple displays (*x, *y, *z).
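The listings above come from one particular CPython version, and opcode names and offsets differ between releases (BUILD_TUPLE_UNPACK, for instance, was removed in CPython 3.9). A more portable way to check, sketched here with an illustrative function name, is to collect the opcode names with dis.get_instructions and look for UNPACK_EX.

```python
import dis

def unpack_on_lhs():
    # Same starred assignment as in b() above, but returning the result.
    *squares, = map((2).__rpow__, range(5))
    return squares

# Opcode names used by the function, in order of appearance.
opnames = [ins.opname for ins in dis.get_instructions(unpack_on_lhs)]

print("UNPACK_EX" in opnames)            # the starred target compiles to UNPACK_EX
print("BUILD_TUPLE_UNPACK" in opnames)   # not used for the LHS form
print(unpack_on_lhs())
```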




Answer 5:


For the RHS, there is not much of an issue. The answer here states it well:

We have it working as it usually does in function calls. It expands the contents of the iterable it is attached to. So, the statement:

elements = *iterable,

can be viewed as:

elements = 1, 2, 3, 4,

which is another way for a tuple to be initialized.

Now, for the LHS: yes, there are technical reasons for using a list, as indicated in the discussion around the initial PEP 3132 for extended unpacking.

The reasons can be gleaned from the conversation on the PEP (added at the end).

Essentially it boils down to a couple key factors:

  • The LHS needed to support a "starred expression" that was not necessarily restricted to the end only.
  • The RHS needed to allow various sequence types to be accepted, including iterators.
  • The combination of the two points above required manipulation/mutation of the contents after accepting them into the starred expression.
  • An alternative approach to handling, one to mimic the iterator fed on the RHS, even leaving implementation difficulties aside, was shot down by Guido for its inconsistent behaviour.
  • Given all the factors above, a tuple on LHS would have to be a list first, and then converted. This approach would then just add overhead, and did not invite any further discussion.

Summary: A combination of various factors led to the decision to allow a list on the LHS, and the reasons fed off of each other.
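The interplay of the first three factors can be seen with a starred target in the middle and an iterator on the right. This is only a sketch; stream is a made-up generator. Nothing is known about the length until the iterator is exhausted, so the items have to be collected somewhere mutable before the trailing target can be split off.

```python
def stream():
    # Hypothetical iterator: has no len() and cannot be sliced.
    yield from "abcdef"

head, *middle, tail = stream()

print(head, middle, tail)
print(type(middle).__name__)
```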


Relevant extract for disallowing inconsistent types:

The important use case in Python for the proposed semantics is when you have a variable-length record, the first few items of which are interesting, and the rest of which is less so, but not unimportant. (If you wanted to throw the rest away, you'd just write a, b, c = x[:3] instead of a, b, c, *d = x.) It is much more convenient for this use case if the type of d is fixed by the operation, so you can count on its behavior.

There's a bug in the design of filter() in Python 2 (which will be fixed in 3.0 by turning it into an iterator BTW): if the input is a tuple, the output is a tuple too, but if the input is a list or anything else, the output is a list. That's a totally insane signature, since it means that you can't count on the result being a list, nor on it being a tuple -- if you need it to be one or the other, you have to convert it to one, which is a waste of time and space. Please let's not repeat this design bug. -Guido
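The "type fixed by the operation" point from the quoted message can be checked with a few different inputs. This is an illustrative sketch, not code from the thread.

```python
# Different kinds of iterables on the right-hand side.
inputs = ["hello", (1, 2, 3), [1, 2, 3], range(3), iter([1, 2, 3])]

rest_types = []
for value in inputs:
    first, *rest = value
    rest_types.append(type(rest).__name__)
    print(type(value).__name__, "->", type(rest).__name__, rest)
```

Every row reports a list for the starred target, whatever type went in.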


I have also tried to recreate a partially quoted conversation that pertains to the summary above (source). Emphasis mine.

1.

In argument lists, *args exhausts iterators, converting them to tuples. I think it would be confusing if *args in tuple unpacking didn't do the same thing.

This brings up the question of why the patch produces lists, not tuples. What's the reasoning behind that?

STeVe

2.

IMO, it's likely that you would like to further process the resulting sequence, including modifying it.

Georg

3.

Well if that's what you're aiming at, then I'd expect it to be more useful to have the unpacking generate not lists, but the same type you started with, e.g. if I started with a string, I probably want to continue using strings:: --additional text snipped off

4.

When dealing with an iterator, you don't know the length in advance, so the only way to get a tuple would be to produce a list first and then create a tuple from it. Greg

5.

Yep. That was one of the reasons it was suggested that the *args should only appear at the end of the tuple unpacking.

STeVe

(a couple of messages skipped)

6.

I don't think that returning the type given is a goal that should be attempted, because it can only ever work for a fixed set of known types. Given an arbitrary sequence type, there is no way of knowing how to create a new instance of it with specified contents.

-- Greg

(messages skipped)

7.

I'm suggesting, that:

  • lists return lists
  • tuples return tuples
  • XYZ containers return XYZ containers
  • non-container iterables return iterators.

How do you propose to distinguish between the last two cases? Attempting to slice it and catching an exception is not acceptable, IMO, as it can too easily mask bugs.

-- Greg

8.

But I expect less useful. It won't support "a, *b, c = " either. From an implementation POV, if you have an unknown object on the RHS, you have to try slicing it before you try iterating over it; this may cause problems e.g. if the object happens to be a defaultdict -- since x[3:] is implemented as x[slice(None, 3, None)], the defaultdict will give you its default value. I'd much rather define this in terms of iterating over the object until it is exhausted, which can be optimized for certain known types like lists and tuples.

-- Guido van Rossum




Answer 6:


TLDR: You get a tuple on the RHS because you asked for one. You get a list on the LHS because it is easier.


It is important to keep in mind that the RHS is evaluated before the LHS - this is why a, b = b, a works. The difference then becomes apparent when splitting the assignment and using additional capabilities for the LHS and RHS:

# RHS: Expression List
a = head, *tail
# LHS: Target List
*leading, last = a

In short, while the two look similar, they are entirely different things. The RHS is an expression to create one tuple from all names - the LHS is a binding to multiple names from one tuple. Even if you see the LHS as a tuple of names, that does not restrict the type of each name.
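A runnable version of the snippet above, with head and tail given concrete values (these values are assumptions added for illustration), shows both halves and the evaluation order.

```python
head, tail = 0, [1, 2, 3]

# RHS: an expression list builds one tuple from several values.
a = head, *tail

# LHS: a target list binds several names from one iterable.
*leading, last = a

# The RHS is fully evaluated before any name on the LHS is rebound,
# which is why a swap needs no temporary variable.
x, y = 1, 2
x, y = y, x

print(a, leading, last, x, y)
```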


The RHS is an expression list - a tuple literal without the optional () parentheses. This is the same as how 1, 2 creates a tuple even without parentheses, and how enclosing [] or {} create a list or set. The *tail just means unpacking into this tuple.

New in version 3.5: Iterable unpacking in expression lists, originally proposed by PEP 448.

The LHS does not create one value, it binds values to multiple names. With a catch-all name such as *leading, the binding is not known up-front in all cases. Instead, the catch-all contains whatever remains.

Using a list to store values makes this simple - the values for trailing names can be efficiently removed from the end. The remaining list then contains exactly the values for the catch-all name. In fact, this is exactly what CPython does:

  • collect all items for mandatory targets before the starred one
  • collect all remaining items from the iterable in a list
  • pop items for mandatory targets after the starred one from the list
  • push the single items and the resized list on the stack
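The steps above can be sketched in pure Python. This is an approximation for illustration, not CPython's actual C implementation; the function name unpack_ex, its parameters, and the error messages are assumptions made for this sketch.

```python
from itertools import islice

def unpack_ex(iterable, before, after):
    """Split iterable into (values before the star, starred list, values after)."""
    it = iter(iterable)

    # 1. Collect the items for the mandatory targets before the starred one.
    head = list(islice(it, before))
    if len(head) < before:
        raise ValueError("not enough values to unpack")

    # 2. Collect all remaining items from the iterable in a list.
    rest = list(it)
    if len(rest) < after:
        raise ValueError("not enough values to unpack")

    # 3. Take the items for the mandatory targets after the starred one
    #    off the end of that list, shrinking it in place.
    cut = len(rest) - after
    tail = rest[cut:]
    del rest[cut:]

    # 4. Hand back the single items and the resized list.
    return head, rest, tail

print(unpack_ex(range(6), 1, 2))
print(unpack_ex("ab", 1, 1))
```

The first call mirrors a, *b, c, d = range(6) and the second mirrors a, *b, c = "ab", where the starred target ends up as an empty list.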

Even when the LHS has a catch-all name without trailing names, it is a list for consistency.




Answer 7:


Using a = *b,:

If you do:

a = *[1, 2, 3],

It would give:

(1, 2, 3)

Because:

  1. Unpacking in a bare expression list gives a tuple by default, but if you write, e.g.,

    [*[1, 2, 3]]

    Output:

    [1, 2, 3] as a list, because I wrote a list display; likewise {*[1, 2, 3]} gives a set.

  2. Unpacking gives three elements, and for [1, 2, 3] it really just does

    1, 2, 3

    Which outputs:

    (1, 2, 3)

    That's what unpacking does.

The main part:

Unpacking simply executes:

1, 2, 3

For:

[1, 2, 3]

Which is a tuple:

(1, 2, 3)

Actually this creates a list, and changes it into a tuple.

Using *a, = b:

Well, this really amounts to:

a = [1, 2, 3]

Since it isn't:

*a, b = [1, 2, 3]

or something similar, there is not much more to it.

  1. It is roughly equivalent to the same assignment without the * and the comma, except that it always gives a list.

  2. This is really almost only used with multiple variables, e.g.:

    *a, b = [1, 2, 3]

One thing to note is that it stores a list no matter what type is on the right:

>>> *a, = {1,2,3}
>>> a
[1, 2, 3]
>>> *a, = (1,2,3)
>>> a
[1, 2, 3]

It would also be strange to have:

a, *b = 'hello'

And:

print(b)

To be:

'ello'

Then it doesn't seem like splatting.

Lists also have more methods than the other types, so they are easier to work with.

There is probably no deeper reason for this; it is simply a design decision in Python.

For the a = *b, case there is a reason, given in "The main part:" above.

Summary:

Also as @Devesh mentioned here in PEP 0448 disadvantages:

Whilst *elements, = iterable causes elements to be a list, elements = *iterable, causes elements to be a tuple. The reason for this may confuse people unfamiliar with the construct.

(emphasis mine)

In practice this rarely matters. If you want a list, just use:

print([*a])

Or a tuple:

print((*a,))

And a set:

print({*a})

And so on...



Source: https://stackoverflow.com/questions/56237733/why-does-splatting-create-a-tuple-on-the-rhs-but-a-list-on-the-lhs
