Color by Column Values in Matplotlib

情话喂你 2020-11-28 07:50

One of my favorite aspects of using the ggplot2 library in R is the ability to easily specify aesthetics. I can quickly make a scatterplot and apply color assoc

4 Answers
  • 2020-11-28 08:29

    You can pass the color parameter to the plot method to set the color you want for each column. For example:

    from pandas import DataFrame
    data = DataFrame({'a':range(5),'b':range(1,6),'c':range(2,7)})
    colors = ['yellowgreen','cyan','magenta']
    data.plot(color=colors)
    

    [Figure: three lines with custom colors]

    You can use color names or hex color codes such as '#000000' for black. The defined color names can be found in matplotlib's colors.py file, linked below from matplotlib's GitHub repo.

    https://github.com/matplotlib/matplotlib/blob/master/lib/matplotlib/colors.py
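To check a color name without reading the source file, a short sketch like the one below may help. It assumes a matplotlib version that exposes `matplotlib.colors.CSS4_COLORS` and `matplotlib.colors.to_hex`. The `Agg` backend line is only there so the sketch runs without a display. The variable names `named` and `hex_colors` are illustrative and not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
from matplotlib import colors as mcolors
import pandas as pd

# CSS4_COLORS is a dict mapping color names to hex strings.
named = mcolors.CSS4_COLORS
assert "yellowgreen" in named

# to_hex() converts any color spec matplotlib accepts into a hex string.
hex_colors = [mcolors.to_hex(name) for name in ["yellowgreen", "cyan", "magenta"]]

# Same DataFrame as in the example above: one line per column,
# colored in column order by the list passed to `color`.
data = pd.DataFrame({"a": range(5), "b": range(1, 6), "c": range(2, 7)})
ax = data.plot(color=hex_colors)
```

Passing hex strings instead of names produces the same plot. The conversion is shown only to illustrate that both forms are interchangeable.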

  • 2020-11-28 08:31

    Actually, you could use ggplot for Python:

    from ggplot import *
    import numpy as np
    import pandas as pd
    
    df = pd.DataFrame({'Height':np.random.randn(10),
                       'Weight':np.random.randn(10),
                       'Gender': ["Male","Male","Male","Male","Male",
                                  "Female","Female","Female","Female","Female"]})
    
    
    ggplot(aes(x='Height', y='Weight', color='Gender'), data=df)  + geom_point()
    

  • 2020-11-28 08:36

    Update October 2015

    Seaborn handles this use-case splendidly:

    import numpy
    import pandas
    from matplotlib import pyplot
    import seaborn
    seaborn.set(style='ticks')
    
    numpy.random.seed(0)
    N = 37
    _genders= ['Female', 'Male', 'Non-binary', 'No Response']
    df = pandas.DataFrame({
        'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
        'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
        'Gender': numpy.random.choice(_genders, size=N)
    })
    
    fg = seaborn.FacetGrid(data=df, hue='Gender', hue_order=_genders, aspect=1.61)
    fg.map(pyplot.scatter, 'Weight (kg)', 'Height (cm)').add_legend()
    

    Which immediately outputs:

    Old Answer

    In this case, I would use matplotlib directly.

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd
    
    def dfScatter(df, xcol='Height', ycol='Weight', catcol='Gender'):
        fig, ax = plt.subplots()
        # Assign each category an evenly spaced number in [0, 1];
        # scatter() maps these numbers through the default colormap.
        categories = np.unique(df[catcol])
        colors = np.linspace(0, 1, len(categories))
        colordict = dict(zip(categories, colors))

        # Note: this adds a "Color" column to the DataFrame that was passed in.
        df["Color"] = df[catcol].map(colordict)
        ax.scatter(df[xcol], df[ycol], c=df["Color"])
        return fig

    if __name__ == '__main__':
        df = pd.DataFrame({'Height': np.random.normal(size=10),
                           'Weight': np.random.normal(size=10),
                           'Gender': ["Male", "Male", "Unknown", "Male", "Male",
                                      "Female", "Did not respond", "Unknown", "Female", "Female"]})
        fig = dfScatter(df)
        fig.savefig('fig1.png')
    

    And that gives me:

    [Figure: scatter plot with categorized colors]

    As far as I know, that Color column can hold any matplotlib-compatible color (RGBA tuples, HTML names, hex values, etc.).

    I'm having trouble getting anything but numerical values to work with the colormaps.
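One possible way around the colormap limitation is to draw one scatter call per category and pick each color by index from a qualitative colormap. This also gives a legend for free. This is a sketch under assumptions, not the answer author's method: the helper name `scatter_by_category` and the choice of the `tab10` colormap are hypothetical, and the `Agg` backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

def scatter_by_category(df, xcol, ycol, catcol, cmap_name="tab10"):
    """Draw one scatter series per category value (hypothetical helper)."""
    fig, ax = plt.subplots()
    cmap = plt.get_cmap(cmap_name)
    # groupby() yields the categories in sorted order by default.
    for i, (label, group) in enumerate(df.groupby(catcol)):
        # Index into the colormap's color list; wrap around if there are
        # more categories than colors.
        ax.scatter(group[xcol], group[ycol], color=cmap(i % cmap.N), label=label)
    ax.set_xlabel(xcol)
    ax.set_ylabel(ycol)
    ax.legend(title=catcol)
    return fig, ax

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Height": rng.normal(size=10),
    "Weight": rng.normal(size=10),
    "Gender": ["Male", "Male", "Unknown", "Male", "Male",
               "Female", "Did not respond", "Unknown", "Female", "Female"],
})
fig, ax = scatter_by_category(df, "Height", "Weight", "Gender")
fig.savefig("fig_categories.png")
```

Because the categories never pass through a numeric colormap lookup, string labels work directly and the input DataFrame is left unchanged.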

  • 2020-11-28 08:41

    https://seaborn.pydata.org/generated/seaborn.scatterplot.html

    import numpy
    import pandas
    import seaborn as sns
    
    numpy.random.seed(0)
    N = 37
    _genders= ['Female', 'Male', 'Non-binary', 'No Response']
    df = pandas.DataFrame({
        'Height (cm)': numpy.random.uniform(low=130, high=200, size=N),
        'Weight (kg)': numpy.random.uniform(low=30, high=100, size=N),
        'Gender': numpy.random.choice(_genders, size=N)
    })
    
    sns.scatterplot(data=df, x='Height (cm)', y='Weight (kg)', hue='Gender')
    
