Question
I am trying to add a slider to my choropleth plot.
The slider is based on the "year" column, which runs from 2006 to 2012.
My data looks like this:
It can be downloaded from here:
sample_data.csv
When I plot the county-level choropleth, it does an inner join on the county FIPS code using transform_lookup.
This is my code:
slider = alt.binding_range(min=2006, max=2012, step=1)
select_year = alt.selection_single(name="year", fields=['year'],
                                   bind=slider, init={'year': 2006})
alt.Chart(us_counties).mark_geoshape(
    stroke='black',
    strokeWidth=0.05
).project(
    type='albersUsa'
).transform_lookup(
    lookup='id',
    from_=alt.LookupData(fdf, 'fips', ['Pill_per_pop', 'year'])
).transform_calculate(
    Pill_per_pop='isValid(datum.Pill_per_pop) ? datum.Pill_per_pop : -1'
).encode(
    color=alt.condition(
        'datum.Pill_per_pop > 0',
        alt.Color('Pill_per_pop:Q', scale=Scale(scheme='blues')),
        alt.value('#dbe9f6')
    )
).add_selection(
    select_year
).properties(
    width=700,
    height=400
).transform_filter(
    select_year
)
This code gives me a choropleth plot with a slider, but the plots are wrong.
I suspect it takes the first occurrence of each FIPS code and does not filter by year.
I think this is because transform_lookup maps the county ids to FIPS codes.
This is the output:
Answer 1:
You are correct that the lookup transform only finds the first matching index: it is a one-sided join, not a multi-join. If you want to join multiple data entries per key, you'll have to use a dataset with one row per key and the entries spread across multiple columns.
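To make the symptom concrete, here is a plain-Python sketch of a one-sided join that keeps a single match per key, as described above. It is only a model of that behaviour, not Altair's or Vega-Lite's implementation; the function name `one_sided_lookup` and all values are made up for illustration.

```python
# Hypothetical long-format rows: one row per (fips, year). Values are made up.
long_rows = [
    {"fips": 1001, "year": 2006, "Pill_per_pop": 10.0},
    {"fips": 1001, "year": 2007, "Pill_per_pop": 12.5},
    {"fips": 1003, "year": 2006, "Pill_per_pop": 7.0},
    {"fips": 1003, "year": 2007, "Pill_per_pop": 8.5},
]


def one_sided_lookup(primary_ids, rows, key):
    """Attach at most one row per id, keeping the first match for each key."""
    index = {}
    for row in rows:
        # setdefault keeps the first row seen; later rows with the same key are ignored.
        index.setdefault(row[key], row)
    return [{"id": i, **index.get(i, {})} for i in primary_ids]


joined = one_sided_lookup([1001, 1003], long_rows, "fips")
years_after_join = sorted({row["year"] for row in joined})
print(years_after_join)  # [2006]
```

Each county ends up with only its 2006 row, so a later filter on any other year has nothing to match.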
For your data, you can produce such a wide-format dataset using the pandas pivot method, and then within Altair you can undo this operation after the lookup by using the fold transform.
For your data it might look something like this:
import altair as alt
import pandas as pd
from vega_datasets import data

us_counties = alt.topo_feature(data.us_10m.url, 'counties')
fdf = pd.read_csv('https://raw.githubusercontent.com/sdasara95/Opioid-Crisis/master/sample_data.csv')

# Pivot to wide format: one row per county, one string-named column per year.
fdf['year'] = fdf['year'].astype(str)
fdf = fdf.pivot(index='fips', columns='year', values='Pill_per_pop').reset_index()
columns = [str(year) for year in range(2006, 2013)]

# selection_single/add_selection are the Altair 4 names;
# Altair 5 deprecates them in favor of selection_point/add_params.
slider = alt.binding_range(min=2006, max=2012, step=1)
select_year = alt.selection_single(name="year", fields=['year'],
                                   bind=slider, init={'year': 2006})

alt.Chart(us_counties).mark_geoshape(
    stroke='black',
    strokeWidth=0.05
).project(
    type='albersUsa'
).transform_lookup(
    lookup='id',
    from_=alt.LookupData(fdf, 'fips', columns)
).transform_fold(
    # Undo the pivot: back to one row per (county, year).
    columns, as_=['year', 'Pill_per_pop']
).transform_calculate(
    # Fold yields the year as a string; convert it so it matches the numeric slider.
    year='parseInt(datum.year)',
    Pill_per_pop='isValid(datum.Pill_per_pop) ? datum.Pill_per_pop : -1'
).encode(
    color=alt.condition(
        'datum.Pill_per_pop > 0',
        alt.Color('Pill_per_pop:Q', scale=alt.Scale(scheme='blues')),
        alt.value('#dbe9f6')
    )
).add_selection(
    select_year
).properties(
    width=700,
    height=400
).transform_filter(
    select_year
)
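If the pivot and fold steps are hard to picture, the round trip can be tried on a tiny frame in pandas alone. This is a sketch with made-up values; only the column names mirror the data above, and `melt` stands in here for what transform_fold does inside the chart.

```python
import pandas as pd

# Hypothetical values; only the column names mirror the question's data.
long_df = pd.DataFrame({
    "fips": [1001, 1001, 1003, 1003],
    "year": ["2006", "2007", "2006", "2007"],
    "Pill_per_pop": [10.0, 12.5, 7.0, 8.5],
})

# Long -> wide: one row per county, one column per year.
wide_df = long_df.pivot(index="fips", columns="year", values="Pill_per_pop").reset_index()
wide_df.columns.name = None  # drop the leftover "year" label on the column index

# Wide -> long again, the pandas analogue of the fold step.
folded = wide_df.melt(id_vars="fips", var_name="year", value_name="Pill_per_pop")

print(wide_df.shape, folded.shape)  # (2, 3) (4, 3)
```

The wide frame has exactly one row per fips code, which is what the lookup needs, and folding restores one row per county and year so the slider filter has something to select.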
Source: https://stackoverflow.com/questions/59224026/how-to-add-a-slider-to-a-choropleth-in-altair