Load large dataset into crossfilter/dc.js

暖寄归人 2021-02-01 10:09

I built a crossfilter with several dimensions and groups to display the data visually using dc.js. The data visualized is bike trip data, and each trip will be loaded in. Right

3 answers
  •  独厮守ぢ
    2021-02-01 10:41

    Consider my class design. It doesn't match yours but it illustrates my points.

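    using System.Collections.Generic;

    // Container that is serialized to JSON and sent to the browser.
    public class MyDataModel
    {
        public List<MyDatum> Data { get; set; }
    }
    
    // One trip record.
    public class MyDatum
    {
        public long StartDate { get; set; }  // Unix timestamp, in seconds
        public long EndDate { get; set; }    // Unix timestamp, in seconds
        public int Duration { get; set; }    // in seconds
        public string Title { get; set; }
    }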
    

    The start and end dates are Unix timestamps and the duration is in seconds.

    Serializes to: "{"Data":
    [{"StartDate":1441256019,"EndDate":1441257181, "Duration":451,"Title":"Rad is a cool word."}, ...]}"

    One serialized row is 92 characters.

    Let's start compressing! Convert the dates and durations to base 60 strings, and store everything as an array of arrays of strings.

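    // Each inner list is one trip: [StartDate, EndDate, Duration, Title],
    // with the numeric fields encoded as base 60 strings.
    public class MyDataModel
    {
        public List<List<string>> Data { get; set; }
    }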
    

    Serializes to: "{"Data":[["1pCSrd","1pCTD1","7V","Rad is a cool word."],...]}"

    One serialized row is now 47 characters. moment.js is a good library for working with dates and times, and its moment-timezone companion includes helpers for unpacking base 60 strings.
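
A hand-rolled sketch of the base 60 packing, in plain JavaScript rather than a library call. The digit alphabet (0-9, A-Z, a-x) is an assumption inferred from the sample values above (`7V` for 451), and `encodeBase60` / `decodeBase60` are hypothetical names. A library helper may order its digits differently, so check before mixing the two.

```javascript
// Assumed digit alphabet, inferred from the sample values: 0-9, A-Z, a-x.
const BASE60_DIGITS =
  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx";

// Encode a non-negative integer as a base 60 string.
function encodeBase60(value) {
  if (!Number.isInteger(value) || value < 0) {
    throw new RangeError("expected a non-negative integer");
  }
  let out = "";
  do {
    out = BASE60_DIGITS[value % 60] + out;
    value = Math.floor(value / 60);
  } while (value > 0);
  return out;
}

// Decode a base 60 string back into an integer.
function decodeBase60(text) {
  let value = 0;
  for (const ch of text) {
    const digit = BASE60_DIGITS.indexOf(ch);
    if (digit < 0) throw new RangeError("invalid base 60 digit: " + ch);
    value = value * 60 + digit;
  }
  return value;
}
```

With this alphabet, `decodeBase60("1pCSrd")` gives 1441256019 and `encodeBase60(451)` gives `"7V"`, matching the serialized row shown earlier.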

    Working with an array of arrays makes your code less readable, so add comments that document which position holds which field.
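
One way to keep the positional rows readable on the client is to name the column positions once and convert each row into an object before it goes into the crossfilter. This is only a sketch: `COL`, `fromBase60`, and `unpackTrip` are hypothetical names, and the digit alphabet is again the one inferred from the sample row.

```javascript
// Column positions in each packed row: [StartDate, EndDate, Duration, Title].
const COL = Object.freeze({ START: 0, END: 1, DURATION: 2, TITLE: 3 });

// Assumed digit alphabet, inferred from the sample values: 0-9, A-Z, a-x.
const DIGITS =
  "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwx";

// Decode a base 60 string into an integer (no validation in this sketch).
function fromBase60(text) {
  return [...text].reduce((n, ch) => n * 60 + DIGITS.indexOf(ch), 0);
}

// Turn one packed row into a record with named fields.
function unpackTrip(row) {
  return {
    // Timestamps are Unix seconds; Date expects milliseconds.
    start: new Date(fromBase60(row[COL.START]) * 1000),
    end: new Date(fromBase60(row[COL.END]) * 1000),
    durationSeconds: fromBase60(row[COL.DURATION]),
    title: row[COL.TITLE],
  };
}
```

The parsed payload can then be converted in one pass with `payload.Data.map(unpackTrip)`, where `payload` stands for the parsed JSON response.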

    Load just the most recent 90 days and zoom to 30 days. When the user drags the brush on the range chart to the left, start fetching more data in 90-day chunks until the user stops dragging. Add each chunk to the existing crossfilter using its add method.
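
The chunked loading could be sketched roughly as follows. The only crossfilter call assumed is `add`; `fetchTrips(from, to)` is a hypothetical function that returns (or resolves to) the trips in a time window, and `createChunkLoader` and `previousChunk` are illustrative names rather than part of any library.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
const CHUNK_DAYS = 90;

// Time window covering the 90 days just before `oldestLoadedMs`.
function previousChunk(oldestLoadedMs) {
  return { from: oldestLoadedMs - CHUNK_DAYS * DAY_MS, to: oldestLoadedMs };
}

// cf: an object with an add(records) method, such as a crossfilter instance.
// fetchTrips: hypothetical loader returning trips (or a promise of trips).
function createChunkLoader(cf, fetchTrips, oldestLoadedMs) {
  let oldest = oldestLoadedMs;
  let pending = null; // at most one request in flight

  return {
    oldestLoaded: () => oldest,
    loadOlder() {
      // Repeated calls while dragging share the request already in flight.
      if (pending) return pending;
      const { from, to } = previousChunk(oldest);
      pending = Promise.resolve(fetchTrips(from, to))
        .then((trips) => {
          cf.add(trips); // append to the existing crossfilter
          oldest = from;
          return trips.length;
        })
        .finally(() => {
          pending = null;
        });
      return pending;
    },
  };
}
```

A caller would invoke `loader.loadOlder()` whenever the brush's left edge approaches `loader.oldestLoaded()`, and redraw the charts (for example with `dc.redrawAll()`) once the promise resolves.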

    As you add more and more data, you will notice that your charts become less and less responsive. That is because you have rendered hundreds or even thousands of elements in your SVG, and the browser is getting crushed. Use d3's quantize scale to group data points into buckets, and reduce the displayed data to 50 buckets.

    Quantizing is worth the effort and is the only way you can create a scalable graph with a continuously growing dataset.
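
To make the bucketing idea concrete, here is a plain JavaScript stand-in for what a quantize scale does: it maps a continuous x value onto one of a fixed number of equal-width buckets and aggregates the points that land there. It does not call d3, and `bucketize` and its field names are hypothetical.

```javascript
// points: array of { x: number, y: number }; bucketCount: e.g. 50.
// Returns one summary row per bucket, so the chart draws at most
// bucketCount elements no matter how many points are loaded.
function bucketize(points, bucketCount) {
  if (points.length === 0) return [];

  let min = Infinity;
  let max = -Infinity;
  for (const p of points) {
    if (p.x < min) min = p.x;
    if (p.x > max) max = p.x;
  }

  // Fall back to width 1 when every x is identical (avoids dividing by zero).
  const width = (max - min) / bucketCount || 1;

  const buckets = Array.from({ length: bucketCount }, (_, i) => ({
    x0: min + i * width, // lower edge of the bucket
    count: 0,
    sum: 0,
  }));

  for (const p of points) {
    // Clamp so the maximum x lands in the last bucket.
    const i = Math.min(bucketCount - 1, Math.floor((p.x - min) / width));
    buckets[i].count += 1;
    buckets[i].sum += p.y;
  }
  return buckets;
}
```

Calling `bucketize(points, 50)` before handing data to a chart keeps the number of rendered elements fixed at 50 while the underlying dataset keeps growing.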

    Your other option is to abandon the range chart and group the data month over month, day over day, and hour over hour, then add a date range picker. Because the data is grouped by month, day, and hour, even if you rode your bike every hour of every day, a year of data would never exceed 8766 rows (365.25 days × 24 hours).
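
A minimal sketch of the hourly grouping, assuming each trip has a `startDate` Unix timestamp and a `duration`, both in seconds. `hourKey` and `groupByHour` are hypothetical names, and the hour boundaries are UTC hours.

```javascript
const SECONDS_PER_HOUR = 3600;

// Upper bound on hourly rows for one average year: 365.25 days * 24 hours.
const HOURS_PER_AVERAGE_YEAR = 365.25 * 24; // 8766

// Round a Unix timestamp (seconds) down to the start of its UTC hour.
function hourKey(unixSeconds) {
  return Math.floor(unixSeconds / SECONDS_PER_HOUR) * SECONDS_PER_HOUR;
}

// Collapse trips into one row per hour: trip count and total duration.
function groupByHour(trips) {
  const rows = new Map();
  for (const t of trips) {
    const key = hourKey(t.startDate);
    const row = rows.get(key) || { hour: key, trips: 0, totalDuration: 0 };
    row.trips += 1;
    row.totalDuration += t.duration;
    rows.set(key, row);
  }
  return [...rows.values()];
}
```

However many trips are recorded, the grouped result has at most one row per hour, which is where the 8766 ceiling for a year comes from.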
