We are aware that the standard method of setting a single cell is using at or iat. However, I noticed some interesting behaviour I was wondering if
This occurs because loc does a lot of checking to cover the many use cases it supports. (Note: historically, loc and iloc were created to remove the ambiguity of ix, way back in 2013 with v0.11, but even today there's still a lot of ambiguity in loc.)
In this case df.loc[1, 'B'] can either return:
Aside: iloc suffers from the same issue in this case, even though it's always going to be the first case. That may be because loc and iloc share this assignment code.
So pandas needs to support all of those cases for assignment!
An early part of the assignment logic converts the list (of lists) into a numpy array:
In [11]: np.array(['m', 'n', 'o', 'p']).shape
Out[11]: (4,)
In [12]: np.array([['m', 'n', 'o', 'p']]).shape
Out[12]: (1, 4)
So you can't just pass the list of lists and expect to get the right array. Instead, you can explicitly set the list into an object array:
In [13]: a = np.empty(1, dtype=object)
In [14]: a[0] = ['m', 'n', 'o', 'p']
In [15]: a
Out[15]: array([list(['m', 'n', 'o', 'p'])], dtype=object)
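The wrapping step above can be folded into a small helper. This is only a minimal sketch: the name wrap_as_object_cell is hypothetical (it is not a numpy or pandas function), and the contrast case at the end just shows why passing dtype=object to np.array is not enough on its own.

```python
import numpy as np


def wrap_as_object_cell(value):
    """Return a 1-element object array whose only element is `value` itself.

    Hypothetical helper for illustration; not part of numpy or pandas.
    """
    cell = np.empty(1, dtype=object)  # one slot that can hold any Python object
    cell[0] = value                   # element assignment stores the list as-is
    return cell


letters = ['m', 'n', 'o', 'p']
wrapped = wrap_as_object_cell(letters)

# For contrast: building the array directly from a nested list unpacks the
# inner list into a second dimension, even with dtype=object.
unpacked = np.array([letters], dtype=object)

print(wrapped.shape)   # (1,)
print(unpacked.shape)  # (1, 4)
```

The helper keeps the list as one opaque element, which is what a single DataFrame cell needs.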
Now you can use this in the assignment:
In [16]: df.loc[0, 'B'] = a
In [17]: df
Out[17]:
    A             B
0  12  [m, n, o, p]
1  23        [c, d]
It's still not ideal, but to reiterate, there are sooo many edge cases in loc and iloc that the solution is to be as explicit as possible to avoid them (use at here). And more generally, as you know, avoid using lists inside a DataFrame!
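For comparison, here is a minimal sketch of the at route recommended above. The starting frame is an assumption: only row 1 of B is visible in the output shown earlier, so the initial list in row 0 is a made-up placeholder. It also assumes B is an object-dtype column, which is the case when it already holds lists.

```python
import pandas as pd

# Assumed starting frame; the ['a', 'b'] in row 0 is a placeholder value.
df = pd.DataFrame({'A': [12, 23], 'B': [['a', 'b'], ['c', 'd']]})

# `at` addresses exactly one cell, so the list is stored as a single object
# instead of being interpreted as several values to spread over a selection.
df.at[0, 'B'] = ['m', 'n', 'o', 'p']

print(df.at[0, 'B'])
print(df.at[1, 'B'])
```

Because at only ever targets a scalar position, there is no need to wrap the list in an object array first.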