Question
I'm creating a line-based time series graph using the plotly library for Python. I'd like to connect it to a time series database, but for now I've been testing with CSV data.
Is it possible to have an x and y axis (time vs. value), and load multiple lines from another CSV column (host) onto the same x and y graph?
import pandas as pd
import plotly.express as px
df = pd.read_csv('stats.csv')
fig = px.line(df, x = 'time', y = 'connections', title='connections')
fig.show()
I'd like to define more than one line on the same graph based on the CSV's host column, so that each line is defined by a distinct value in the host column and uses the time vs. connections axes. Can the px.line method work for that use case, or should I be looking at another method?
Answer 1:
With plotly, it shouldn't matter whether your sources are database connections or CSV files. You'll most likely handle that part through pandas DataFrames either way. But since you're talking about databases, I'll show how you can easily build a plotly chart on a dataset with a typical database structure, where you often have to rely on grouping and subsetting the data in order to show changes over time for different subcategories. Plotly Express ships with a few interesting datasets (try dir(px.data)), like the gapminder dataset:
   country      continent  year  lifeExp  pop       gdpPercap   iso_alpha  iso_num
0  Afghanistan  Asia       1952  28.801   8425333   779.445314  AFG        4
1  Afghanistan  Asia       1957  30.332   9240934   820.853030  AFG        4
2  Afghanistan  Asia       1962  31.997   10267083  853.100710  AFG        4
3  Afghanistan  Asia       1967  34.020   11537966  836.197138  AFG        4
4  Afghanistan  Asia       1972  36.088   13079460  739.981106  AFG        4
If you use the correct approach, you can easily use px.line() to build a figure on such a dataset and let the figure function take care of the grouping for you. You can even use the same function to add data to that figure later. The figures below are built using a combination of px.line(), go.Figure() and add_traces().
Plot 1: A figure using px.line()
This plot shows the five countries with the highest gross domestic product per capita on the European continent. The data is grouped using arguments like color='country'.
Plot 2: Added data to the same figure
This plot adds the five countries with the highest gross domestic product per capita on the American continent to the first plot. This creates the need to distinguish the data in one more way, so that it is possible to see whether a line is European or American. This is handled using the argument line_dash='country', so that the new lines get dash styles that set them apart from the lines in the original plot.
This is only one way to do it. If the end result is what you're looking for, we can discuss other approaches as well.
Complete code:
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
# Data
gapminder = px.data.gapminder()
# Most productive european countries (as of 2007)
df_eur = gapminder[gapminder['continent']=='Europe']
df_eur_2007 = df_eur[df_eur['year']==2007]
eur_gdp_top5=df_eur_2007.nlargest(5, 'gdpPercap')['country'].tolist()
df_eur_gdp_top5 = df_eur[df_eur['country'].isin(eur_gdp_top5)]
# Most productive countries on the american continent (as of 2007)
df_ame = gapminder[gapminder['continent']=='Americas']
df_ame_2007 = df_ame[df_ame['year']==2007]
ame_gdp_top5 = df_ame_2007.nlargest(5, 'gdpPercap')['country'].tolist()
df_ame_gdp_top5 = df_ame[df_ame['country'].isin(ame_gdp_top5)]
# Plotly figure 1
fig = px.line(df_eur_gdp_top5, x='year', y='gdpPercap',
              color="country",
              line_group="country", hover_name="country")
fig.update_layout(title='Productivity, Europe', showlegend=False)
# Plotly figure 2: copy figure 1 (so fig itself stays unchanged), then add the American traces
fig2 = go.Figure(fig)
fig2.add_traces(
    px.line(df_ame_gdp_top5, x='year', y='gdpPercap',
            color="country",
            line_group="country", line_dash='country', hover_name="country").data)
fig2.update_layout(title='Productivity, Europe and America', showlegend=False)
#fig.show()
fig2.show()
Source: https://stackoverflow.com/questions/59762321/how-do-i-add-and-define-multiple-lines-in-a-plotly-time-series-chart