Plotting binned correlation of two variables using common axis

Submitted by Deadly on 2021-01-28 11:40:21

Question


I have three lists that I have loaded into a pandas dataframe.

import pandas as pd
# location, variable1 and variable2 are the three lists mentioned above
df = pd.DataFrame({'x': location})
df = df.assign(y1 = variable1)
df = df.assign(y2 = variable2)

I would like to plot the correlation of y1 with y2, with x as the common x-axis. More precisely, I would like to bin the y1 and y2 values according to their x location, compute the correlation of y1 with y2 within each bin, and then plot those correlations as a line across the whole x domain. The final plot will have correlation on the y-axis and location on the x-axis.

I have previously done something similar with the scipy binned_statistic function to plot conditional means, but I don't think I can easily extend that to correlations. I would also like to get better at using pandas, so I'm trying to avoid that route if at all possible.

I'm sure this has been asked before, but everything I have come across seems to deal with multiple distribution plots.


Answer 1:


I've more or less arrived at a solution. Implementing something similar to what was used here, I have:

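nbins = 20
# quantile-based bins: each bin holds roughly the same number of observations
df['bins'] = pd.qcut(df['x'], q=nbins)
# corr() returns a 2x2 matrix per bin; keep the y1 row (every 2nd row) and the y2 column (last)
plotdatadf = df.groupby('bins')[['y1', 'y2']].corr().iloc[0::2, -1]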

This gives me a pandas Series (indexed by bin) holding the correlation coefficient of y1 and y2 for each bin. Because pd.qcut cuts at quantiles of x, the bins contain roughly equal numbers of observations rather than spanning equal widths of x.
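To make the step above easier to try out, here is a minimal self-contained sketch. The synthetic x, y1 and y2 arrays are assumptions made purely for illustration (the original lists are not shown), and the name corr_by_bin is mine. It selects the y1/y2 entry of each per-bin correlation matrix by label with .xs() instead of by position, which should give the same values as the .iloc[0::2, -1] slice.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the three lists (assumption, for illustration only)
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0.0, 10.0, size=n)
y1 = rng.normal(size=n)
# y2 follows y1 more closely as x grows, so the correlation should rise with x
y2 = (x / 10.0) * y1 + rng.normal(scale=0.5, size=n)

df = pd.DataFrame({'x': x, 'y1': y1, 'y2': y2})

nbins = 20
# Quantile-based bins along x
df['bins'] = pd.qcut(df['x'], q=nbins)

# Per-bin 2x2 correlation matrices, then pick the (y1, y2) entry by label
corr_by_bin = (
    df.groupby('bins', observed=True)[['y1', 'y2']]
      .corr()
      .xs('y1', level=1)['y2']
)

print(corr_by_bin)
```

The result is a Series indexed by the x intervals, one correlation value per bin.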

I can now go back to my original dataframe and add another column, of the original length, holding these correlation values: each row gets the correlation of the bin it falls into. This column can then be plotted as y against my existing x as a line plot.
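The copy-back step described above could be sketched as follows. This is only one possible way to do it, not necessarily what the answer's author ended up with. The synthetic data, the bin_id and corr column names, and the helper plot_corr_along_x are all hypothetical. Integer bin labels (labels=False) are used here, rather than the interval labels above, so that the per-row lookup is a plain integer map.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the original lists (assumption, for illustration only)
rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({'x': rng.uniform(0.0, 10.0, size=n)})
df['y1'] = rng.normal(size=n)
df['y2'] = 0.1 * df['x'] * df['y1'] + rng.normal(scale=0.5, size=n)

nbins = 10
# labels=False returns the integer bin number (0 .. nbins-1) for each row
df['bin_id'] = pd.qcut(df['x'], q=nbins, labels=False)

# One correlation value per bin, indexed by the integer bin number
corr_by_bin = (
    df.groupby('bin_id')[['y1', 'y2']]
      .corr()
      .xs('y1', level=1)['y2']
)

# Copy each bin's correlation back onto every row that falls in that bin
df['corr'] = df['bin_id'].map(corr_by_bin)


def plot_corr_along_x(frame):
    """Hypothetical helper: line plot of the per-row correlation against x."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    ordered = frame.sort_values('x')  # sort so the line runs left to right
    fig, ax = plt.subplots()
    ax.plot(ordered['x'], ordered['corr'])
    ax.set_xlabel('x (location)')
    ax.set_ylabel('corr(y1, y2) within bin')
    return ax
```

Calling plot_corr_along_x(df) followed by matplotlib's plt.show() would draw a stepped line, since every row in a bin carries the same correlation value.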



Source: https://stackoverflow.com/questions/64019645/plotting-binned-correlation-of-two-variables-using-common-axis
