dc.js Using two reducers without a simple dimension and second grouping stage

無奈伤痛 2021-01-15 05:37

Quick question following up my response from this post: dc.js Box plot reducer using two groups Just trying to fully get my head around reducers and how to filter and colle

1 answer
  • 2021-01-15 06:14

    You don't need to do all of that to get a pie chart of mac addresses.

    There are a few misunderstandings in points 1-3, which I guess I'll address first. It looks like you copied and pasted code from the previous question, so I'm not really sure if this helps.

    (1) If you already have a dimension of mac addresses, reducing it like this won't have any further effect. The original idea was to dimension/group by vendor and then reduce counts for each mac address. This reduction groups by mac address and then counts instances of each mac address within each bin, so each value is just an object with one key. It will produce key/value pairs like

    {key: 'MAC-123', value: {'MAC-123': 12}}
    

    (2) This will flatten the object within the values, dropping the keys and producing just an array of counts

    {key: 'MAC-123', value: [12]}
    

    (3) Since the pie chart is expecting simple key/value pairs with the value being a number, it is probably unhappy with getting values like the array [12]. The values are probably coerced to NaN.

    (4) Okay, here's the real question, and it's actually not as easy as your previous question. We got off easy with the box plot because the "dimension" (in crossfilter terms, the keys you filter and group on) existed in your data.

    Let's forget the false lead in points 1-3 above, and start from first principles.

    There is no way to look at an individual row of your data and determine, without looking at anything else, if it belongs to the category "has 1 connection", "has 2 connections", etc. Assuming you want to be able to click on slices in the pie chart and filter all the data, we'll have to find another way to implement that.

    But first let's look at how to produce a pie chart of "number of network connections". That's a little bit easier, but as far as I know, it does require a true "double reduce".

    If we use the default reduction on the mac dimension, we'll get an array of key/value pairs, where the key is a mac address, and the value is the number of connections for that address:

    [
      {
        "key": "1c:b7:2c:48",
        "value": 8
      },
      {
        "key": "1c:b7:be:ef",
        "value": 3
      },
      {
        "key": "6c:17:79:03",
        "value": 2
      },
      ...
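
    For anyone following along without the original data, here is a plain-JavaScript stand-in for what the default (count) reduction produces. It does not use crossfilter itself; `rows`, the `mac` field, and `countByKey` are hypothetical names made up for this sketch, and the sample rows are invented. With crossfilter, the rough equivalent would be calling `.group().all()` on the mac dimension.

```javascript
// Hypothetical sample rows: one row per observed connection.
const rows = [
  {mac: '1c:b7:2c:48'}, {mac: '6c:17:79:03'}, {mac: '1c:b7:2c:48'},
  {mac: '6c:17:79:03'}, {mac: '1c:b7:2c:48'}
];

// Stand-in for a group with the default count reduction:
// returns [{key, value}] in ascending key order, like group.all().
function countByKey(rows, accessor) {
  const counts = new Map();
  for (const row of rows) {
    const k = accessor(row);
    counts.set(k, (counts.get(k) || 0) + 1);
  }
  return Array.from(counts, ([key, value]) => ({key, value}))
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
}

const macCounts = countByKey(rows, d => d.mac);
console.log(macCounts);
// [ { key: '1c:b7:2c:48', value: 3 }, { key: '6c:17:79:03', value: 2 } ]
```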
    

    How do we now produce a key/value array where the key is number of connections, and the value is the array of mac addresses for that number of connections?

    Sounds like a job for the lesser-known Array.prototype.reduce. This function is the likely inspiration for crossfilter's group.reduce(), but it's a bit simpler: it just walks through an array, combining each element with the result accumulated so far. It's great for producing an object from an array:

    var value_keys = macPacketGroup.all().reduce(function(p, kv) {
      if(!p[kv.value])
        p[kv.value] = [];
      p[kv.value].push(kv.key);
      return p;
    }, {});
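
    To make the accumulation visible, here is a small sketch that runs the same reducer on a made-up three-element array and records the accumulator after each element. `sampleAll` is a hypothetical stand-in for `macPacketGroup.all()`, and `steps` exists only for the trace.

```javascript
// Stand-in for macPacketGroup.all(); these values are invented.
const sampleAll = [
  {key: 'aa:aa', value: 2},
  {key: 'bb:bb', value: 1},
  {key: 'cc:cc', value: 2}
];

const steps = [];
const valueKeys = sampleAll.reduce(function (p, kv) {
  if (!p[kv.value])
    p[kv.value] = [];        // first mac seen with this count
  p[kv.value].push(kv.key);  // file the mac under its count
  steps.push(JSON.stringify(p)); // snapshot of the accumulator so far
  return p;
}, {});

steps.forEach((s, i) => console.log('after element ' + i + ': ' + s));
// after element 0: {"2":["aa:aa"]}
// after element 1: {"1":["bb:bb"],"2":["aa:aa"]}
// after element 2: {"1":["bb:bb"],"2":["aa:aa","cc:cc"]}
```

    Note that the counts become property names, so they are stored as strings and enumerate in ascending numeric order.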
    

    Great:

    {
      "1": [
        "b8:1d:ab:d1",
        "dc:d9:16:3a",
        "dc:d9:16:3b"
      ],
      "2": [
        "6c:17:79:03",
        "6c:27:79:04",
        "b8:1d:aa:d1",
        "b8:1d:aa:d2",
        "dc:da:16:3d"
      ],
    

    But we wanted an array of key/value pairs, not an object!

    var key_count_value_macs = Object.keys(value_keys)
        .map(k => ({key: k, value: value_keys[k]}));
    

    Great, that looks just like what a "real group" would produce:

    [
      {
        "key": "1",
        "value": [
          "b8:1d:ab:d1",
          "dc:d9:16:3a",
          "dc:d9:16:3b"
        ]
      },
      {
        "key": "2",
        "value": [
          "6c:17:79:03",
          "6c:27:79:04",
          "b8:1d:aa:d1",
          "b8:1d:aa:d2",
          "dc:da:16:3d"
        ]
      },
      ...
    

    Wrapping all that in a "fake group", which when asked to produce .all(), queries the original group and does the above transformations:

    function value_keys_group(group) {
      return {
        all: function() {
          var value_keys = group.all().reduce(function(p, kv) {
            if(!p[kv.value])
              p[kv.value] = [];
            p[kv.value].push(kv.key);
            return p;
          }, {});
          return Object.keys(value_keys)
            .map(k => ({key: k, value: value_keys[k]}));
        }
      }
    }
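
    A quick way to convince yourself that the fake group stays in sync is to feed it a stub. In this sketch `stubGroup` and `data` are hypothetical; the stub only imitates a crossfilter group whose counts change when some other chart applies a filter. Because the transformation runs inside `.all()`, each call reflects the current counts.

```javascript
// Copy of the fake group so the sketch runs on its own.
function value_keys_group(group) {
  return {
    all: function () {
      var value_keys = group.all().reduce(function (p, kv) {
        if (!p[kv.value])
          p[kv.value] = [];
        p[kv.value].push(kv.key);
        return p;
      }, {});
      return Object.keys(value_keys)
        .map(k => ({key: k, value: value_keys[k]}));
    }
  };
}

// Hypothetical stub: `data` is reassigned below to imitate a filter change.
let data = [
  {key: 'aa:aa', value: 2}, {key: 'bb:bb', value: 1}, {key: 'cc:cc', value: 2}
];
const stubGroup = {all: () => data};

const fake = value_keys_group(stubGroup);
const before = fake.all();
// [{key: '1', value: ['bb:bb']}, {key: '2', value: ['aa:aa', 'cc:cc']}]

data = [
  {key: 'aa:aa', value: 1}, {key: 'bb:bb', value: 1}, {key: 'cc:cc', value: 2}
];
const after = fake.all();
// [{key: '1', value: ['aa:aa', 'bb:bb']}, {key: '2', value: ['cc:cc']}]
console.log(before, after);
```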
    

    Now we can plot the pie chart! The only fancy thing here is that the value accessor should look at the length of the array for each value (instead of assuming the value is just a number):

    packetPie
        // ...
        .group(value_keys_group(macPacketGroup))
        .valueAccessor(kv => kv.value.length);
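
    To see what the chart ends up drawing, this sketch applies the same accessor by hand to the first two entries of the output shown earlier. `fakeGroupOutput` and `sliceSizes` are names made up for illustration; dc.js does the equivalent internally when it reads values through the value accessor.

```javascript
// First two entries of the fake group's output, as listed above.
const fakeGroupOutput = [
  {key: '1', value: ['b8:1d:ab:d1', 'dc:d9:16:3a', 'dc:d9:16:3b']},
  {key: '2', value: ['6c:17:79:03', '6c:27:79:04', 'b8:1d:aa:d1',
                     'b8:1d:aa:d2', 'dc:da:16:3d']}
];

// Same accessor as passed to .valueAccessor(): slice size = number of macs.
const valueAccessor = kv => kv.value.length;

const sliceSizes = fakeGroupOutput.map(kv => ({
  slice: kv.key,            // connection count
  size: valueAccessor(kv)   // how many mac addresses have that count
}));
console.log(sliceSizes);
// [ { slice: '1', size: 3 }, { slice: '2', size: 5 } ]
```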
    

    Demo fiddle.

    However, clicking on slices won't work yet. Part 2 covers how to fix that.

    Part 2: Filtering based on counts

    As I remarked at the start, it's not possible to create a crossfilter dimension which will filter based on the count of connections. This is because crossfilter always needs to look at each row and determine, based only on the information in that row, whether it belongs in a group or filter.

    If you add another chart at this point and try clicking on a slice, everything in the other charts will disappear. This is because the keys are now counts, and counts are invalid mac addresses, so we're telling it to filter to a key which doesn't exist.
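
    Here is a plain-JavaScript sketch of that mismatch. It assumes the chart's default behavior amounts to an exact match of the clicked key against the dimension value; `rows`, `clickedKey`, and `macsWithCount` are hypothetical, and the array `filter` calls only imitate what the dimension filter would do.

```javascript
// Hypothetical rows; the dimension value for each row is its mac address.
const rows = [{mac: 'aa:aa'}, {mac: 'aa:aa'}, {mac: 'bb:bb'}];

// The pie's keys are connection counts, so clicking a slice yields e.g. '2'.
const clickedKey = '2';

// Matching that key against mac addresses selects nothing.
const naiveMatch = rows.filter(d => d.mac === clickedKey);
console.log(naiveMatch.length); // 0

// What we actually mean: rows whose mac is one of those with 2 connections.
const macsWithCount = {'1': ['bb:bb'], '2': ['aa:aa']};
const intended = rows.filter(
  d => macsWithCount[clickedKey].indexOf(d.mac) !== -1);
console.log(intended.length); // 2
```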

    However, we can obviously filter by mac address, and we also know the mac addresses for each count! So this isn't so bad. It just requires a filterHandler.

    Although, hmmm, in producing the fake group, we seem to have forgotten value_keys. It's hidden away inside the function, and then let go.

    It's a little ugly, but we can fix that:

    function value_keys_group(group) {
      var saved_value_keys;
      return {
        all: function() {
          var value_keys = group.all().reduce(function(p, kv) {
            if(!p[kv.value])
              p[kv.value] = [];
            p[kv.value].push(kv.key);
            return p;
          }, {});
          saved_value_keys = value_keys;
          return Object.keys(value_keys)
            .map(k => ({key: k, value: value_keys[k]}));
        },
        value_keys: function() {
          return saved_value_keys;
        }
      }
    }
    

    Now, every time .all() is called (every time the pie chart is drawn), the fake group will stash away the value_keys object. Not a great practice (.value_keys() would return undefined if you called it before .all()), but safe based on the way dc.js works.
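
    The ordering caveat is easy to demonstrate with a stub. `stubGroup` here is hypothetical and stands in for the real crossfilter group.

```javascript
// Copy of the fake group with the stashed value_keys.
function value_keys_group(group) {
  var saved_value_keys;
  return {
    all: function () {
      var value_keys = group.all().reduce(function (p, kv) {
        if (!p[kv.value])
          p[kv.value] = [];
        p[kv.value].push(kv.key);
        return p;
      }, {});
      saved_value_keys = value_keys;
      return Object.keys(value_keys)
        .map(k => ({key: k, value: value_keys[k]}));
    },
    value_keys: function () {
      return saved_value_keys;
    }
  };
}

// Hypothetical stub group with invented counts.
const stubGroup = {
  all: () => [{key: 'aa:aa', value: 2}, {key: 'bb:bb', value: 1}]
};

const fake = value_keys_group(stubGroup);
const early = fake.value_keys(); // nothing stashed yet
fake.all();                      // what a chart redraw would trigger
const late = fake.value_keys();

console.log(early); // undefined
console.log(late);  // { '1': [ 'bb:bb' ], '2': [ 'aa:aa' ] }
```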

    With that out of the way, the filterHandler for the pie chart is relatively simple:

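    packetPie.filterHandler(function(dimension, filters) {
      if(filters.length === 0)
        dimension.filter(null); // no slices selected: clear the filter
      else {
        // look up the mac addresses behind each selected connection count
        var value_keys = packetPie.group().value_keys();
        var all_macs = filters.reduce(
          (p, v) => p.concat(value_keys[v]), []);
        // keep only rows whose mac address belongs to a selected slice
        dimension.filterFunction(k => all_macs.indexOf(k) !== -1);
      }
      return filters;
    });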
    

    The interesting line here is another call to Array.reduce. This function is also useful for producing an array from another array, and here we use it just to concatenate all of the values (mac addresses) from all of the selected slices (connection counts).
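
    Here is that concatenation on its own, with a shortened sample `value_keys` object. The second half is an optional variation of my own and not part of the handler above: it collects the macs into a `Set` for faster lookups and skips any selected count that is no longer present. `macSet` and `keep` are made-up names.

```javascript
// Shortened sample of what value_keys() might return.
const value_keys = {
  '1': ['b8:1d:ab:d1', 'dc:d9:16:3a', 'dc:d9:16:3b'],
  '2': ['6c:17:79:03', '6c:27:79:04']
};

// Keys of the selected slices (connection counts, as strings).
const filters = ['1', '2'];

// Same reduce as in the handler: flatten the selected slices into one list.
const all_macs = filters.reduce((p, v) => p.concat(value_keys[v]), []);
console.log(all_macs.length); // 5

// Optional variation (an assumption, not from the handler above):
// a Set gives constant-time membership tests, and `|| []` ignores
// a selected count that currently has no mac addresses.
const macSet = new Set(filters.flatMap(v => value_keys[v] || []));
const keep = k => macSet.has(k);

console.log(keep('dc:d9:16:3a')); // true
console.log(keep('ff:ff:ff:ff')); // false
```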

    Now we have a working filter. It doesn't make too much sense to combine it with the box plot from the last question, but the new fiddle demonstrates that filtering based on number of connections does work.

    Part 3: what about zeroes?

    As commonly comes up, crossfilter considers a bin with value zero to still exist, so we need to "remove the empty bins". However, in this case, we've added a non-standard method to the first fake group, in order to allow filtering. (We could have just used a global there, but globals are messy.)

    So, we need to "pass through" the value_keys method:

    function remove_empty_bins_pt(source_group) {
        return {
            all:function () {
                return source_group.all().filter(function(d) {
                    return d.key !== '0';
                });
            },
            value_keys: function() {
                return source_group.value_keys();
            }
        };
    }
    packetPie
      .group(remove_empty_bins_pt(value_keys_group(macPacketGroup)))
    

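    Another oddity here is that we are filtering out the key zero, and it's a string: the keys come from Object.keys(), which always returns strings, so the comparison has to be against '0' rather than 0.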
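
    A small sketch of why the comparison has to use a string. The object and variable names are made up; the point is only that property names come back from `Object.keys` as strings, so a strict comparison against the number 0 never matches.

```javascript
const value_keys = {};
value_keys[0] = ['bb:bb']; // the number 0 is used as a property name...
value_keys[2] = ['aa:aa'];

const keys = Object.keys(value_keys); // ...and comes back as the string '0'
console.log(keys); // [ '0', '2' ]

const bins = keys.map(k => ({key: k, value: value_keys[k]}));

// Comparing against the number keeps the zero bin: '0' !== 0 is true.
const comparedToNumber = bins.filter(d => d.key !== 0);
// Comparing against the string drops it.
const comparedToString = bins.filter(d => d.key !== '0');

console.log(comparedToNumber.length); // 2
console.log(comparedToString.length); // 1
```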

    Demo fiddle!

    Alternatively, here's a better solution! Do the bin filtering before passing to value_keys_group, and then we can use the ordinary remove_empty_bins!

    function remove_empty_bins(source_group) {
        return {
            all:function () {
                return source_group.all().filter(function(d) {
                    //return Math.abs(d.value) > 0.00001; // if using floating-point numbers
                    return d.value !== 0; // if integers only
                });
            }
        };
    }
    packetPie
        .group(value_keys_group(remove_empty_bins(macPacketGroup)))
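
    To compare the two orderings side by side, this sketch runs both against a hypothetical `stubGroup` that contains one mac address with a count of zero. The three helper functions are copies of the ones in this answer; `variantA` and `variantB` are made-up names.

```javascript
function value_keys_group(group) {
  var saved_value_keys;
  return {
    all: function () {
      var value_keys = group.all().reduce(function (p, kv) {
        if (!p[kv.value])
          p[kv.value] = [];
        p[kv.value].push(kv.key);
        return p;
      }, {});
      saved_value_keys = value_keys;
      return Object.keys(value_keys)
        .map(k => ({key: k, value: value_keys[k]}));
    },
    value_keys: function () { return saved_value_keys; }
  };
}
function remove_empty_bins(source_group) {
  return {
    all: function () {
      return source_group.all().filter(d => d.value !== 0);
    }
  };
}
function remove_empty_bins_pt(source_group) {
  return {
    all: function () {
      return source_group.all().filter(d => d.key !== '0');
    },
    value_keys: function () { return source_group.value_keys(); }
  };
}

// Hypothetical stub: 'bb:bb' has been filtered down to zero connections.
const stubGroup = {all: () => [
  {key: 'aa:aa', value: 2}, {key: 'bb:bb', value: 0},
  {key: 'cc:cc', value: 1}, {key: 'dd:dd', value: 2}
]};

const variantA = remove_empty_bins_pt(value_keys_group(stubGroup));
const variantB = value_keys_group(remove_empty_bins(stubGroup));
const outA = variantA.all();
const outB = variantB.all();

console.log(JSON.stringify(outA) === JSON.stringify(outB)); // true
console.log(Object.keys(variantA.value_keys())); // [ '0', '1', '2' ]
console.log(Object.keys(variantB.value_keys())); // [ '1', '2' ]
```

    Both produce the same slices. With the second ordering, value_keys_group is the outermost wrapper, so `.value_keys()` is available without a pass-through, and the stashed object never contains the zero bin.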
    

    Yet another demo fiddle!!
