I have a pandas dataframe and a list as follows
mylist = ['nnn', 'mmm', 'yyy']
mydata =
xxx yyy zzz nnn ffffd mmm
0 0 10 5 5 5 5
1 1 9
You can just put mylist inside [] and pandas will select those columns for you.
mydata_new = mydata[mylist]
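As a minimal sketch of that selection, assuming a small made-up frame that only borrows the column names from the question (the values are illustrative, not your actual data):

```python
import pandas as pd

# Hypothetical data; only the column names follow the question.
mydata = pd.DataFrame({
    "xxx": [0, 1],
    "yyy": [10, 9],
    "zzz": [5, 4],
    "nnn": [5, 3],
    "ffffd": [5, 2],
    "mmm": [5, 1],
})
mylist = ["nnn", "mmm", "yyy"]

# Indexing with a list of labels returns a DataFrame whose
# columns follow the order of the list.
mydata_new = mydata[mylist]
print(type(mydata_new).__name__)
print(list(mydata_new.columns))
```

Note that the result keeps the order of mylist, not the column order of the original frame.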
Not sure whether your yyy is a typo.
Your loop goes wrong because it reassigns mydata_new to a new Series on every iteration.
for item in mylist:
    mydata_new = mydata[item]  # <- overwritten on every pass
Thus, you end up with a single Series (the last column in the list) rather than the whole DataFrame you want.
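To see this for yourself, here is a small sketch, again assuming made-up values under the question's column names. It runs the same loop and then inspects what is left in mydata_new:

```python
import pandas as pd

# Hypothetical data; only the column names follow the question.
mydata = pd.DataFrame({
    "yyy": [10, 9],
    "nnn": [5, 3],
    "mmm": [5, 1],
})
mylist = ["nnn", "mmm", "yyy"]

for item in mylist:
    # A single label returns a Series, and each pass replaces the last one.
    mydata_new = mydata[item]

# Only the column selected in the final iteration survives.
print(type(mydata_new).__name__)
print(mydata_new.name)
```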
If some names in the list are not in your data frame, you can check for that with
len(set(mylist) - set(mydata.columns)) > 0
and print it out
print(set(mylist) - set(mydata.columns))
Then see if there are typos or other unintended behaviors.
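Putting the check together, one possible sketch is below. The frame and the misspelled name "qqq" are assumptions made up for illustration. A list comprehension is used here instead of the set difference only so that the missing names come out in the order of mylist.

```python
import pandas as pd

# Hypothetical data; only the column names follow the question.
mydata = pd.DataFrame({
    "yyy": [10, 9],
    "nnn": [5, 3],
    "mmm": [5, 1],
})
# "qqq" is a deliberately wrong name standing in for a typo.
mylist = ["nnn", "qqq", "yyy"]

# Names requested but absent from the frame, in list order.
missing = [name for name in mylist if name not in mydata.columns]
if missing:
    print("Not found in mydata:", missing)

# Select only the names that do exist, so the lookup cannot raise KeyError.
present = [name for name in mylist if name in mydata.columns]
mydata_new = mydata[present]
print(list(mydata_new.columns))
```

Whether to silently skip the missing names or stop and fix the list depends on your use case; skipping is just one option.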