Plotting time-series data with seaborn

前端 未结 1 1983
庸人自扰
庸人自扰 2020-11-29 05:30

Say I create a fully random Dataframe using the following:

from pandas.util import testing
from random import randrange

def random_date(start,          


        
相关标签:
1条回答
  • 2020-11-29 06:08

    I don't think tsplot is going to work with the data you have. The assumptions it makes about the input data are that you've sampled the same units at each timepoint (although you can have missing timepoints for some units).

    For example, say you measured blood pressure from the same people every day for a month, and then you wanted to plot the average blood pressure by condition (where maybe the "condition" variable is the diet they are on). tsplot could do this, with a call that would look something like sns.tsplot(df, time="day", unit="person", condition="diet", value="blood_pressure")

    That scenario is different from having large groups of people on different diets and each day randomly sampling some from each group and measuring their blood pressure. From the example you gave, it seems like your data are structured like the this.

    However, it's not that hard to come up with a mix of matplotlib and pandas that will do what I think you want:

    # Read in the data from the stackoverflow question
    df = pd.read_clipboard().iloc[1:]
    
    # Convert it to "long-form" or "tidy" representation
    df = pd.melt(df, id_vars=["date"], var_name="condition")
    
    # Plot the average value by condition and date
    ax = df.groupby(["condition", "date"]).mean().unstack("condition").plot()
    
    # Get a reference to the x-points corresponding to the dates and the the colors
    x = np.arange(len(df.date.unique()))
    palette = sns.color_palette()
    
    # Calculate the 25th and 75th percentiles of the data
    # and plot a translucent band between them
    for cond, cond_df in df.groupby("condition"):
        low = cond_df.groupby("date").value.apply(np.percentile, 25)
        high = cond_df.groupby("date").value.apply(np.percentile, 75)
        ax.fill_between(x, low, high, alpha=.2, color=palette.pop(0))
    

    This code produces:

    enter image description here

    0 讨论(0)
提交回复
热议问题