How to handle layers with missing data points in d3.layout.stack()

悲&欢浪女 2021-02-04 00:47

I'm using d3.stack to create a stacked area chart, but I get an error if I don't have an equal number of items in each layer. I'm starting with an array of data like this

3 Answers
  • 2021-02-04 00:54

    This isn't d3-specific but rather a general solution for filling in the gaps in an array of keyed data. I modified your jsfiddle here with the following function:

    function assignDefaultValues( dataset )
    {
        var defaultValue = 0;
        var keys = [ 'Group1' , 'Group2', 'Group3' ];
        // hadData[i] is true once keys[i] has been seen for the current date
        var hadData = [ true, true, true];
        var newData = [];
        // sentinel that is not expected to match any date in the data
        var previousdate = new Date();
        var sortByDate = function(a,b){ return a.date > b.date ? 1 : -1; };
    
        // note: this sorts the caller's array in place
        dataset.sort(sortByDate);
        dataset.forEach(function(row){
            if(row.date.valueOf() !== previousdate.valueOf()){
                // a new date begins: fill in keys missing from the previous date
                for(var i = 0 ; i < keys.length ; ++i){
                    if(hadData[i] === false){
                        newData.push( { key: keys[i], 
                                       value: defaultValue, 
                                       date: previousdate });
                    }
                    hadData[i] = false;
                }
                previousdate = row.date;
            }
            hadData[keys.indexOf(row.key)] = true; 
        });
        // fill in keys missing from the last date
        // (declared with var so that i does not leak as an implicit global)
        for(var i = 0 ; i < keys.length ; ++i){
            if(hadData[i] === false){
                newData.push( { key: keys[i], value: defaultValue, 
                                date: previousdate });
            }
        }
        return dataset.concat(newData).sort(sortByDate);
    }
    

    It walks through the given dataset and, whenever it comes across a new date value, assigns a default value to any keys that have not yet been seen.
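
A variant of the same idea can derive the keys and dates from the data instead of hard-coding them. This is only a minimal sketch, assuming rows shaped like `{ key, value, date }` with `Date` objects in `date`; `fillMissing` is a hypothetical helper name and not part of d3.

```javascript
// Hypothetical helper: returns a new array in which every key has a row for
// every date that occurs anywhere in the dataset. Missing rows get defaultValue.
function fillMissing(dataset, defaultValue) {
  var keys = [];
  var times = [];
  var seen = {}; // "key|timestamp" -> true for rows that already exist

  dataset.forEach(function (row) {
    var t = row.date.valueOf();
    if (keys.indexOf(row.key) === -1) keys.push(row.key);
    if (times.indexOf(t) === -1) times.push(t);
    seen[row.key + "|" + t] = true;
  });

  var filled = dataset.slice(); // copy so the input array is not mutated
  keys.forEach(function (key) {
    times.forEach(function (t) {
      if (!seen[key + "|" + t]) {
        filled.push({ key: key, value: defaultValue, date: new Date(t) });
      }
    });
  });

  // Sort by date, then by key, so each layer sees its points in date order.
  return filled.sort(function (a, b) {
    return (a.date - b.date) || (a.key < b.key ? -1 : a.key > b.key ? 1 : 0);
  });
}

var rows = [
  { key: "Group1", value: 5, date: new Date(2012, 0, 1) },
  { key: "Group2", value: 3, date: new Date(2012, 0, 1) },
  { key: "Group1", value: 7, date: new Date(2012, 0, 2) }
];
var filledRows = fillMissing(rows, 0);
console.log(filledRows.length); // 4: Group2 gains a zero row for the second date
```

Whether zero is the right default depends on the chart; for a stacked area chart it makes the missing layer collapse to nothing at that date.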

  • 2021-02-04 01:03

    Stack really does what it says: it stacks graphs, so you as the user are responsible for providing the data in the correct format. This makes sense if you think about it, because stack is basically data-format agnostic. It provides a great deal of flexibility, with the only restriction being that it must be able to access the same number of points in each layer. How would it determine which points are missing? Given that the first layer has five points and the second layer has ten, is the first layer missing five points? Or are both layers missing points because a third layer contains even more? And if points are missing, which ones? At the beginning, at the end, somewhere in the middle? There is no sensible way for a stack implementation to figure this out (unless it enforced a very rigid data structure).
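
To see why equal layer lengths matter, here is a simplified illustration of what a zero-offset stack does. It is not d3's actual implementation; `stackByIndex` is a hypothetical function written only for this explanation. Each point's baseline `y0` is the sum of the `y` values at the same index in the layers below it, so a layer that has no point at index `j` leaves nothing to read or write there.

```javascript
// Simplified, hypothetical stand-in for a zero-offset stack layout.
// layers: array of arrays of {x, y}. Adds a y0 baseline to each point.
function stackByIndex(layers) {
  if (layers.length === 0) return layers;
  var n = layers[0].length;
  // Points are matched purely by index, so every layer must have n points.
  layers.forEach(function (layer, index) {
    if (layer.length !== n) {
      throw new Error("layer " + index + " has " + layer.length +
                      " points, expected " + n);
    }
  });
  for (var j = 0; j < n; ++j) {
    var baseline = 0;
    for (var i = 0; i < layers.length; ++i) {
      layers[i][j].y0 = baseline;
      baseline += layers[i][j].y;
    }
  }
  return layers;
}

var stacked = stackByIndex([
  [{ x: 0, y: 1 }, { x: 1, y: 2 }],
  [{ x: 0, y: 3 }, { x: 1, y: 4 }]
]);
console.log(stacked[1][1].y0); // 2
```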

    So is there nothing you can do? I think there is. I can't give you a full implementation, but I can give you some pointers in the right direction. We start here:

    var stack = d3.layout.stack()
      .offset("zero")
      .values(function(d) { return d.values; });
    

    Here you just return the values, which in your example will be the result of the nest operator. So at this point you have the ability to "fix" the values.

    The first thing you need to do is determine the maximum number of observations.

    var nested = nest.entries(data);
    var max = nested.reduce(function(prev, cur) {
      return Math.max(prev, cur.values.length);
    }, 0);
    

    Now the tricky part. Once you know the maximum number of elements, you'll need to adjust the function that is passed to values. Here you'll have to make assumptions about the data. From your question I understand that values are missing for some groups. So there are two possibilities: either you assume that the group with the maximum number of elements contains all items in the range, or you assume a certain range and check whether every group contains a value for each "tick" in that range. So if your range is a date range (as in your example) and you expect a measurement for every day (or whatever the interval is), you'll have to walk the items in the group and fill the gaps yourself. I'll try to give an (untested) example for a numerical range:

    // define some calculated values that can be reused in correctedValues
    var range = [0, 1];
    var step = 0.1;
    var epsilon = step / 2; // tolerance for floating-point drift in comparisons
    
    // assumes d.values is sorted by ascending x
    function correctedValues(d) {
      var values = d.values;
      var result = [];
      var expected = range[0];
      for (var i = 0; i < values.length; ++i) {
        var value = values[i];
        // Add placeholder entries for every expected x before this point
        while (value.x - expected > epsilon) {
          result.push({x: expected, y: 0 /* plus any other properties you need */});
          expected += step;
        }
        result.push(value); // Now add the real data point.
        expected = value.x + step; // the next point is expected one step later
      }
    
      // Fill up the end of the array if needed (range[1] itself is excluded)
      while (range[1] - expected > epsilon) {
        result.push({x: expected, y: 0 /* plus any other properties you need */});
        expected += step;
      }
      return result;
    }
    
    // Now use our custom function for the stack
    var stack = d3.layout.stack()
     .offset("zero")
     .values(correctedValues)
    ...
    

    As said, this part is untested and does not directly solve your problem (as I'm using a numerical range), but I think it should give you an idea of how to solve it (and of what the actual source of your problem is).
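
For the first possibility mentioned above (treating the longest group as the complete set of points), a sketch might look like the following. It assumes `nested` has the shape produced by `nest.entries(data)`, that is `[{ key, values: [...] }]`, and that each point looks like `{ date, value }` with a `Date` object in `date`. Those property names and the helper `alignToLongest` are assumptions for illustration, not part of d3.

```javascript
// Hypothetical helper: pads every group so that it has exactly one point per
// date found in the longest group. Dates missing from a group get value 0.
// Expects at least one group.
function alignToLongest(nested) {
  // Pick the group with the most observations as the reference.
  var reference = nested.reduce(function (best, cur) {
    return cur.values.length > best.values.length ? cur : best;
  });
  var times = reference.values
    .map(function (p) { return p.date.valueOf(); })
    .sort(function (a, b) { return a - b; });

  return nested.map(function (group) {
    var byTime = {}; // timestamp -> existing point of this group
    group.values.forEach(function (p) { byTime[p.date.valueOf()] = p; });
    return {
      key: group.key,
      values: times.map(function (t) {
        return byTime[t] || { date: new Date(t), value: 0 };
      })
    };
  });
}

var nestedExample = [
  { key: "Group1", values: [
    { date: new Date(2012, 0, 1), value: 5 },
    { date: new Date(2012, 0, 2), value: 7 }
  ] },
  { key: "Group2", values: [
    { date: new Date(2012, 0, 2), value: 3 }
  ] }
];
var aligned = alignToLongest(nestedExample);
console.log(aligned[1].values.length); // 2
```

Note the limitation of this assumption: a point in a shorter group whose date does not occur in the longest group is dropped, so it only works when the longest group really is complete.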

  • 2021-02-04 01:14

    As others have explained, it would be unreasonable for the stacked chart to guess at the missing values for each data point, because there are so many ways to interpolate the values and there is no obvious choice.

    However, d3.svg.line() seems to offer a reasonable way for you to pick your own method of interpolation and fill in missing values. While it's designed for generating SVG paths, you can probably adapt it for defining lines in general. Interpolation methods are suggested here:

    https://github.com/mbostock/d3/wiki/SVG-Shapes#wiki-line_interpolate

    It's unfortunate that the class, for now, has all these wonderful interpolation methods (that don't appear anywhere else in d3) but is restricted to generating SVG path data instead of arbitrary intermediate values. Perhaps if @mbostock sees this, he will consider generalizing the functionality.

    However, for now you may just want to fork d3 and take the intermediate result of line(data) before it is written to an SVG path string, in the part of the source that does the interpolation, shown below:

      function line(data) {
        var segments = [],
            points = [],
            i = -1,
            n = data.length,
            d,
            fx = d3_functor(x),
            fy = d3_functor(y);
    
        function segment() {
          segments.push("M", interpolate(projection(points), tension));
        }
    
        while (++i < n) {
          if (defined.call(this, d = data[i], i)) {
            points.push([+fx.call(this, d, i), +fy.call(this, d, i)]);
          } else if (points.length) {
            segment();
            points = [];
          }
        }
    
        if (points.length) segment();
    
        return segments.length ? segments.join("") : null;
      }
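
If forking d3 is more than you need, plain linear interpolation between neighbouring known points can be done in a few lines. This sketch is independent of d3 and does not reproduce any of its interpolation modes; `interpolateAt` is a hypothetical helper, and points are assumed to be `{x, y}` objects sorted by ascending `x`.

```javascript
// Hypothetical helper: estimate y at position x by linear interpolation
// between the two known points surrounding x. Outside the known range the
// y of the nearest end point is used.
function interpolateAt(points, x) {
  if (points.length === 0) return 0;
  if (x <= points[0].x) return points[0].y;
  var last = points[points.length - 1];
  if (x >= last.x) return last.y;
  for (var i = 1; i < points.length; ++i) {
    var a = points[i - 1], b = points[i];
    if (x <= b.x) {
      var t = (x - a.x) / (b.x - a.x); // fraction of the way from a to b
      return a.y + t * (b.y - a.y);
    }
  }
  return last.y;
}

var known = [{ x: 0, y: 10 }, { x: 2, y: 20 }, { x: 3, y: 14 }];
console.log(interpolateAt(known, 1)); // 15
```

The interpolated value can then be used in place of a zero default when filling the gaps in a layer before it is passed to the stack layout.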
    