Plotly (Python) Subplots: Padding Facets and Sharing Legends

问题

I'm trying to create 2 plots with a shared x-axis and I'm having 2 problems with this:

As soon as I customise the layout with yaxis and yaxis2 titles and/or tickmarks, y-axes begin to overlap
I would like the legends to be shared between the 2 plots, but instead they are duplicated

Here is the code to reproduce the problem I'm experiencing:

from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True) # using jupyter
import plotly.graph_objs as go
from plotly import tools
import numpy as np

 N = 100
epoch_range = [i for i in range(N)]
model_perf = {}
for m in ['acc','loss']:
    for sub in ['train','validation']:
        if sub == 'train':
            history_target = m
        else:
            history_target = 'val_{}'.format(m)
        model_perf[history_target] = np.random.random(N)

line_type = {
    'train': dict(
        color='grey',
        width=1,
        dash='dash'
    ),
    'validation': dict(
        color='blue',
        width=4
    )
}

fig = tools.make_subplots(rows=2, cols=1, shared_xaxes=True, shared_yaxes=False, specs = [[{'b':10000}], [{'b':10000}]])
i = 0
for m in ['acc','loss']:

    i += 1

    for sub in ['train','validation']:

        if sub == 'train':
            history_target = m
        else:
            history_target = 'val_{}'.format(m)

        fig.append_trace({
            'x': epoch_range,
            'y': model_perf[history_target],
            #'type': 'scatter',
            'name': sub,
            'legendgroup': m,
            'yaxis': dict(title=m),
            'line': line_type[sub],
            'showlegend': True
        }, i, 1)

fig['layout'].update(
    height=600, 
    width=800, 
    xaxis = dict(title = 'Epoch'),
    yaxis = dict(title='Accuracy', tickformat=".0%"),
    yaxis2 = dict(title='Loss', tickformat=".0%"),
    title='Performance'
)
iplot(fig)

And here is the image that I get:

If you have any suggestions on how to solve these 2 problems, I'd love to hear from you.

Manny thanks in advance!

EDIT:

Following Farbice's advice, I looked into the create_facet_grid function from plotly.figure_factory (which by the way requires plotly 2.0.12+), I did manage to reproduce the same image with fewer lines but it gave me less flexibility -- for example I don't think you can plot lines using this function and it also has the legend duplication issue, but if you are looking for an ad hoc viz, this might be quite effective. It requires data in a long format, see the below example:

# converting into the long format
import pandas as pd
perf_df = (
    pd.DataFrame({
        'accuracy_train': model_perf['acc'],
        'accuracy_validation': model_perf['val_acc'],
        'loss_train': model_perf['loss'],
        'loss_validation': model_perf['val_loss']
    })
    .stack()
    .reset_index()
    .rename(columns={
        'level_0': 'epoch',
        'level_1': 'variable',
        0: 'value'
    })
)

perf_df = pd.concat(
    [
        perf_df,
        perf_df['variable']
        .str
        .extractall(r'(?P<metric>^.*)_(?P<set>.*$)')
        .reset_index()[['metric','set']]   
    ], axis=1
).drop(['variable'], axis=1)

perf_df.head() # result

epoch  value     metric     set
0      0.434349  accuracy   train
0      0.374607  accuracy   validation
0      0.864698  loss       train
0      0.007445  loss       validation
1      0.553727  accuracy   train

# plot it
fig = ff.create_facet_grid(
    perf_df,
    x='epoch',
    y='value',
    facet_row='metric',
    color_name='set',
    scales='free_y',
    ggplot2=True
)

fig['layout'].update(
    height=800, 
    width=1000, 
    yaxis1 = dict(tickformat=".0%"),
    yaxis2 = dict(tickformat=".0%"),
    title='Performance'
)

iplot(fig)

And here is the result:

回答1:

After doing a little more digging I've found the solution to both my problems.

First, the overlapping y-axis problem was caused by yaxis argument in the layout update, it had to be changed to yaxis1.

The second problem with duplications in the legend was a little trickier, but this post helped me work it out. The idea is that each trace can have a legend associated with it, so if you are plotting multiple traces, you may only want to use the legend from one of them (using the showlegend argument), but to make sure that one legend controls the toggle of multiple subplots, you can use the legendgroup parameter.

Here is the full code with the solution:

from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True) # using jupyter
import plotly.graph_objs as go
from plotly import tools
import numpy as np

N = 100
epoch_range = [i for i in range(N)]
model_perf = {}
for m in ['acc','loss']:
    for sub in ['train','validation']:
        if sub == 'train':
            history_target = m
        else:
            history_target = 'val_{}'.format(m)

        model_perf[history_target] = np.random.random(N)

line_type = {
    'train': dict(
        color='grey',
        width=1,
        dash='dash'
    ),
    'validation': dict(
        color='blue',
        width=4
    )
}

fig = tools.make_subplots(
    rows=2, 
    cols=1, 
    shared_xaxes=True, 
    shared_yaxes=False
)

i = 0
for m in ['acc','loss']:

    i += 1

    if m == 'acc':
        legend_display = True
    else:
        legend_display = False

    for sub in ['train','validation']:

        if sub == 'train':
            history_target = m
        else:
            history_target = 'val_{}'.format(m)

        fig.append_trace({
            'x': epoch_range,
            'y': model_perf[history_target],
            'name': sub,
            'legendgroup': sub, # toggle train / test group on all subplots
            'yaxis': dict(title=m),
            'line': line_type[sub],
            'showlegend': legend_display # this is now dependent on the trace
        }, i, 1)

fig['layout'].update(
    height=600, 
    width=800, 
    xaxis = dict(title = 'Epoch'),
    yaxis1 = dict(title='Accuracy', tickformat=".0%"),
    yaxis2 = dict(title='Loss', tickformat=".0%"),
    title='Performance'
)
iplot(fig)

And here is the result:

回答2:

In my experience, visualization-tools prefer a long-format of data. You might want to adapt your data to a table with columns like:

epoch
variable : 'acc' or 'loss'
set: 'validation' or 'train'
value : the value for the given epoch/variable/set

By doing this you might find it easier to create the graph you desire by using facetting on 'variable' with the 'set'-traces having x=epoch,y=value

If you'd a coded solution, please provide some data.

Hope this was helpful.

来源：https://stackoverflow.com/questions/47932781/plotly-python-subplots-padding-facets-and-sharing-legends

标签

python

plotly