Is there a way to make matplotlib scatter plot marker or color according to a discrete variable in a different column?

Posted by 自古美人都是妖i on 2020-01-22 15:04:33

Question


I'm making scatterplots out of a pandas DataFrame using matplotlib. To get a different color for each data set, I'm making two separate calls to plt.scatter:

plt.scatter(zzz['HFmV'], zzz['LFmV'], label = dut_groups[0], color = 'r' )
plt.scatter(qqq['HFmV'], qqq['LFmV'], label = dut_groups[1], color = 'b' )
plt.legend()
plt.show()

This gives me the desired color dependence but really what would be ideal is if I could just get pandas to give me the scatterplot with several datasets on the same plot by something like

df.plot(kind='scatter', x=x, y=y, color=df.Group, marker=df.Head)

Apparently there is no such animal (at least none that I could find). So the next best thing, in my mind, would be to put the plt.scatter calls into a loop where I could make the color or marker vary according to one of the columns (not x or y, but some other column). If the column I want to use were a continuous variable, it looks like I could use a colormap, but in my case the column I need to use for this is a string (a categorical variable, not a number).

Any help much appreciated.


Answer 1:


What you're doing will almost work, but you have to pass color a vector of colors, not just a vector of variables. So you could do:

color = df.Group.map({dut_groups[0]: "r", dut_groups[1]: "b"})
plt.scatter(x, y, color=color)

The marker style is different: plt.scatter accepts only one marker per call, so varying the marker takes one call per group.
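
A minimal sketch of a per-group loop, in case it helps. The column names HFmV, LFmV, Group and Head are borrowed from the question, but the data values and the two lookup dictionaries (colors, markers) are made up for illustration. Each scatter call draws one (Group, Head) combination with its own color and marker:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data; the column names mirror the question, the values are illustrative.
df = pd.DataFrame({
    "HFmV": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "LFmV": [2.1, 1.9, 3.2, 3.8, 5.1, 4.7],
    "Group": ["a", "a", "b", "b", "a", "b"],
    "Head": ["h1", "h2", "h1", "h2", "h1", "h2"],
})

# Hypothetical lookup tables: one color per Group value, one marker per Head value.
colors = {"a": "r", "b": "b"}
markers = {"h1": "o", "h2": "^"}

fig, ax = plt.subplots()
# scatter() takes a single marker per call, so draw each (Group, Head) subset separately.
for (group, head), subset in df.groupby(["Group", "Head"]):
    ax.scatter(subset["HFmV"], subset["LFmV"],
               color=colors[group], marker=markers[head],
               label=f"{group} / {head}")
ax.set_xlabel("HFmV")
ax.set_ylabel("LFmV")
ax.legend()
```

Call plt.show() afterwards to display the figure. The dictionaries keep the category-to-style mapping explicit, so adding a new Group or Head value only means adding one entry.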

You could also use seaborn to do the color mapping the way you expect, although lmplot doesn't map marker style to a separate column:

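import seaborn as sns
import pandas as pd
from numpy.random import randn

# Toy data: 40 random points split evenly between two groups, "a" and "b"
data = pd.DataFrame(dict(x=randn(40), y=randn(40), g=["a", "b"] * 20))
# Color the points by the categorical column "g"; fit_reg=False hides the regression line
sns.lmplot(x="x", y="y", hue="g", data=data, fit_reg=False)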
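
If a newer seaborn is available (this sketch assumes version 0.9 or later), sns.scatterplot can map one categorical column to color through hue and another to marker through style. The second column "h" below is a hypothetical stand-in for something like the question's Head column, and the data is random:

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Random toy data with two categorical columns; "h" is an illustrative second category.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "x": rng.normal(size=40),
    "y": rng.normal(size=40),
    "g": ["a", "b"] * 20,
    "h": ["h1", "h1", "h2", "h2"] * 10,
})

# hue picks the color from column "g"; style picks the marker from column "h".
ax = sns.scatterplot(data=data, x="x", y="y", hue="g", style="h")
```

scatterplot returns a matplotlib Axes, so the usual matplotlib calls (titles, limits, plt.show()) still apply.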



Source: https://stackoverflow.com/questions/24297097/is-there-a-way-to-make-matplotlib-scatter-plot-marker-or-color-according-to-a-di
