Sorting based on the alt.Color field in Altair

Submitted by 做~自己de王妃 on 2021-01-27 19:50:44

Question


I am trying to sort a horizontal bar chart by the group each bar belongs to. I have included the dataframe, the code that I thought would give me group-wise sorting, and an image of the result. The chart is currently sorted alphabetically by the Species column, but I would like it sorted by group, so that all the "bad" bars are together and all the "good" bars are together. Ideally, I would then go one step further and sort the bars within each group by 'LDA Score', but that would be the next step.

Dataframe:
Unnamed: 0,Species,Unknown,group,LDA Score,p value
11,a,3.474929757,bad,3.07502591,5.67e-05
16,b,3.109308852,bad,2.739744898,0.000651725
31,c,3.16979865,bad,2.697247855,0.03310557
38,d,0.06730106400000001,bad,2.347746497,0.013009626000000002
56,e,2.788383183,good,2.223874347,0.0027407140000000004
65,f,2.644346144,bad,2.311106698,0.00541244
67,g,3.626001112,good,2.980960068,0.038597163
74,h,3.132399759,good,2.849798377,0.007021518000000001
117,i,3.192113412,good,2.861299028,8.19e-06
124,j,0.6140430960000001,bad,2.221483531,0.0022149739999999998
147,k,2.873671544,bad,2.390164757,0.002270102
184,l,3.003479213,bad,2.667274876,0.008129727
188,m,2.46344998,good,2.182085465,0.001657861
256,n,0.048663767,bad,2.952260299,0.013009626000000002
285,o,2.783848855,good,2.387345098,0.00092491
286,p,3.636219,good,3.094047,0.001584756

The code:

bars = alt.Chart(df).mark_bar().encode(
    alt.X('LDA Score:Q'),
    alt.Y("Species:N"),
    alt.Color('group:N', sort=alt.EncodingSortField(field="Clinical group", op='distinct', order='ascending'))
)

bars

The resulting figure:


Answer 1:


Two things:

  • To sort the y-axis, put the sort expression in the y encoding. In the code above, the sort is attached to the color encoding, so it only orders the color labels in the legend.
  • Sorting by field in Vega-Lite only works for numeric data (Edit: this is incorrect; see below), so you can use a calculate transform to map the groups to numbers and sort by those.

The result might look something like this:

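import altair as alt

# df is the dataframe from the question
alt.Chart(df).transform_calculate(
    # numeric sort key: "bad" rows get 0, all other rows get 1
    order='datum.group == "bad" ? 0 : 1'
).mark_bar().encode(
    alt.X('LDA Score:Q'),
    alt.Y("Species:N", sort=alt.SortField('order')),
    alt.Color('group:N')
)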


Edit: it turns out that sorting by group fails because the default operation for sort fields is sum, which only works well on quantitative data. If you choose a different operation, you can sort on nominal data directly. For example, this produces the correct output:

alt.Chart(df).mark_bar().encode(
    alt.X('LDA Score:Q'),
    alt.Y("Species:N", sort=alt.EncodingSortField('group', op='min')),
    alt.Color('group:N')
)

See vega/vega-lite#6064 for more information.
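
For the follow-up goal in the question (sorting by group first, then by 'LDA Score' within each group), one possible approach is to compute the order in Python and pass it to the y encoding as an explicit list. This is only a sketch, not part of the answer above: the helper names species_order and grouped_bar_chart are made up for illustration, the rows are a subset of the question's data, and it assumes you want "bad" before "good" with the largest scores first in each group.

```python
# A few rows from the question's dataframe (only the columns needed here).
rows = [
    {"Species": "a", "group": "bad", "LDA Score": 3.07502591},
    {"Species": "e", "group": "good", "LDA Score": 2.223874347},
    {"Species": "f", "group": "bad", "LDA Score": 2.311106698},
    {"Species": "g", "group": "good", "LDA Score": 2.980960068},
    {"Species": "n", "group": "bad", "LDA Score": 2.952260299},
    {"Species": "p", "group": "good", "LDA Score": 3.094047},
]


def species_order(rows):
    """Hypothetical helper: species sorted by group (alphabetically),
    then by descending LDA Score within each group."""
    ranked = sorted(rows, key=lambda r: (r["group"], -r["LDA Score"]))
    return [r["Species"] for r in ranked]


def grouped_bar_chart(rows):
    """Hypothetical helper: bar chart whose y-axis follows species_order."""
    import altair as alt  # imported here so species_order works without Altair

    return (
        alt.Chart(alt.Data(values=rows))
        .mark_bar()
        .encode(
            alt.X("LDA Score:Q"),
            # an explicit list of values fixes the axis order
            alt.Y("Species:N", sort=species_order(rows)),
            alt.Color("group:N"),
        )
    )


print(species_order(rows))  # ['a', 'n', 'f', 'p', 'g', 'e']
```

With a pandas dataframe, df.to_dict("records") should give rows in the same shape. The trade-off is that the order is fixed in Python rather than computed by Vega-Lite, so it has to be recomputed whenever the data changes.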



Source: https://stackoverflow.com/questions/60626992/sorting-based-on-the-alt-color-field-in-altair
