Grouped line plots in Plotly R: how to control line color?

后端 未结 1 465
难免孤独
难免孤独 2020-12-22 02:40

I have a bunch of \'paired\' observations from a study for the same subject, and I am trying to build a spaghetti plot to visualize these observations as follows:

         


        
1条回答
  •  生来不讨喜
    2020-12-22 03:02

    You can set the lines to the same colour like this

    plot_ly(df, x = type, y = values, group = id, type = 'scatter', mode = 'lines+markers', 
            line=list(color='#000000'), showlegend = FALSE)
    

    For the 'bonus' two-for-the-price-of-one question 'how to color by a different variable to the one used for grouping':

    If you were only plotting markers, and no lines, this would be simple, as you can simply provide a vector of colours to marker.color. Unfortunately, however, line.color only takes a single value, not a vector, so we need to work around this limitation.

    Provided the data are not too numerous (in which case this method becomes slow, and a faster method is given below), you can set colours of each line individually by adding them as separate traces one by one in a loop (looping over id)

    p <- plot_ly()
    for (id in df$id) {
      col <- c('#AA0000','#0000AA')[df[which(df$id==id),3][1]+1] # calculate color for this line based on the 3rd column of df (df$state).
      p <- add_trace(data=df[which(df$id==id),], x=type, y=values, type='scatter', mode='markers+lines',
                     marker=list(color=col),
                     line=list(color=col), 
                     showlegend = FALSE,
                     evaluate=T)
      }
    p
    

    Although this one-trace-per-line approach is probably the simplest way conceptually, it does become very (impractically) slow if applied to hundreds or thousands of line segments. In this case there is a faster method, which is to plot only one line per colour, but to split this line up into multiple segments by inserting NA's between the separate segments and using the connectgaps=FALSE option to break the line into segments where there are missing data.

    Begin by using dplyr to insert missing values between line segements (i.e. for each unique id we add a row containing NA in the columns that provide x and y coordinates).

    library(dplyr)
    df %<>% distinct(id) %>%
      `[<-`(,c(2,4),NA) %>%
      rbind(df) %>%
      arrange (id)
    

    and plot, using connectgaps=FALSE:

    plot_ly(df, x = type, y = values, group = state, type = 'scatter', mode = 'lines+markers', 
            showlegend = FALSE,
            connectgaps=FALSE)
    

    0 讨论(0)
提交回复
热议问题