Question
I'm trying to produce a box plot that shows the number of networks each individual device has connected to, grouped by vendor.
Data Format:
{
"SSID": "eduroam",
"identifier": "Client",
"latitude": 52.4505,
"longitude": -1.9361,
"mac": "dc:d9:16:##:##:##",
"packet": "PR-REQ",
"timestamp": "2018-07-10 12:25:26",
"vendor": "Huawei Technologies Co.Ltd"
}
Fiddle with data https://jsfiddle.net/v4a8g2bo/
I have managed to get a sum of the networks a single device has connected to using the following code. The data was filtered beforehand to contain only unique networks per MAC address, so summing a counter field counts the networks.
var mac = ndx.dimension(function (d) { return d["mac"]; });
var SSIDstoSingleMAC = mac.group().reduceSum(function (d) { return +d.counter; });
My problem arises when I try to pass this grouped sum into a further group that outputs an array for use in the box plot chart:
var vendor = ndx.dimension(function (d) { return d["vendor"]; });
//Used to count number of networks per device
var mac = ndx.dimension(function (d) { return d["mac"]; });
var SSIDstoSingleMAC = mac.group().reduceSum(function (d) { return +d.counter; });
//This is where things fall down
var boxplotGroup = SSIDstoSingleMAC.group().reduce(
function (p, v) {
let dv = v.counter;
if (dv != null) p.push(dv);
return p;
},
function (p, v) {
let dv = v.counter;
if (dv != null) p.splice(p.indexOf(dv), 1);
return p;
},
function () {
return [];
}
);
var boxPlot = dc.boxPlot("#boxPlot");
boxPlot
.width(1200)
.height(600)
.dimension(vendor)
.group(boxplotGroup)
.tickFormat(d3.format('.1f'))
.elasticY(true)
.elasticX(true)
;
This is the goal, e.g. Apple [7,5,10,2] = four Apple devices, where device one has connected to 7 networks, etc.
ATTEMPT AT HIDDEN GROUP
Gordon mentioned in the comments that one group can't be built on top of another in crossfilter. I'm now trying to produce a hidden group that accumulates the networks per MAC address, using the following code from the dc.js FAQ. However, I can't get this to mesh with the box plot reducer. Am I going in the right direction here?
https://github.com/dc-js/dc.js/wiki/FAQ#accumulate-values
var allDim = ndx.dimension(function (d) { return d; });
function accumulate_group(source_group) {
return {
all:function () {
var cumulate = 0;
return source_group.all().map(function(d) {
cumulate += d.counter;
return {key:d.mac, value:cumulate};
});
}
};
}
var boxPlotDim = accumulate_group(allDim);
var boxPlotGroup = boxPlotDim.group().reduce(
function(p,v) {
p.push(v.value());
return p;
},
function(p,v) {
p.splice(p.indexOf(v.value()), 1);
return p;
},
function() {
return [];
}
);
var boxPlot = dc.boxPlot("#boxPlot");
boxPlot
.width(1200)
.height(600)
.dimension(vendor)
.group(boxPlotGroup)
.tickFormat(d3.format('.1f'))
.elasticY(true)
.elasticX(true)
;
Thanks Adam
Answer 1:
Ideally we'd really like to use a simple dimension over vendors here, in case we want to filter using a brush on the boxplot.
So then the question becomes: how do we reduce twice, once to get counts per MAC address, and then again to turn those counts into an array?
The first part has a standard answer: just reduce to an object instead of a value:
var vendorMacCountsGroup = vendor.group().reduce(
function(p, v) { // add
p[v.mac] = (p[v.mac] || 0) + v.counter;
return p;
},
function(p, v) { // remove
p[v.mac] -= v.counter;
return p;
},
function() { // init
return {}; // macs;
}
);
I recently described this pattern in this answer, so I won't go into the details here.
Here's the sample output: bins are vendors, and each value is an object mapping mac addresses to counts:
[
{
"key": "Asustek Computer Inc.",
"value": {
"1c:b7:2c:48": 8,
"1c:b7:be:ef": 3
}
},
{
"key": "Huawei Technologies Co.Ltd",
"value": {
"dc:d9:16:3d": 14,
"dc:da:16:3d": 2,
"dc:d9:16:3a": 1,
"dc:d9:16:3b": 1
}
},
...
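To check the reducer logic without loading crossfilter, the three callbacks can be applied by hand. This is only a minimal sketch: the rows and the `reduceRows` helper are hypothetical stand-ins for what `vendor.group().reduce(...)` does internally, and `counter` is assumed to be 1 per unique MAC/SSID pair, as described in the question.

```javascript
// Same add/remove/init callbacks as in the answer above.
function add(p, v) {
  p[v.mac] = (p[v.mac] || 0) + v.counter;
  return p;
}
function remove(p, v) {
  p[v.mac] -= v.counter;
  return p;
}
function init() {
  return {};
}

// Hypothetical rows: one row per unique (mac, SSID) pair, counter = 1.
var rows = [
  { vendor: "Asustek Computer Inc.", mac: "1c:b7:2c:48", counter: 1 },
  { vendor: "Asustek Computer Inc.", mac: "1c:b7:2c:48", counter: 1 },
  { vendor: "Asustek Computer Inc.", mac: "1c:b7:be:ef", counter: 1 },
  { vendor: "Huawei Technologies Co.Ltd", mac: "dc:d9:16:3d", counter: 1 }
];

// Hypothetical helper imitating a group keyed by vendor:
// each row is folded into the bin for its vendor.
function reduceRows(rows) {
  var bins = {};
  rows.forEach(function (row) {
    if (!(row.vendor in bins)) bins[row.vendor] = init();
    add(bins[row.vendor], row);
  });
  return bins;
}

var bins = reduceRows(rows);
console.log(JSON.stringify(bins));

// Simulate a filter removing one row: the count drops, but the key stays.
remove(bins["Asustek Computer Inc."], rows[2]);
console.log(JSON.stringify(bins["Asustek Computer Inc."]));
```

After the simulated removal, the second MAC address is still present as a key with a count of 0 rather than disappearing from the object.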
Next, we really just want the counts and to forget the MAC addresses. JavaScript has a nice built-in function for this, Object.values. We just need to apply that to each of the object-values in our group. We'll also throw out any zeros, because a zero will only occur when a MAC address has been filtered out somewhere else.
function flatten_object_group(group) {
return {
all: function() {
return group.all().map(function(kv) {
return {
key: kv.key,
value: Object.values(kv.value).filter(function(v) { return v>0; })
};
});
}
};
}
var boxPlotGroup = flatten_object_group(vendorMacCountsGroup);
Sample output:
[
{
"key": "Asustek Computer Inc.",
"value": [
8,
3
]
},
{
"key": "Huawei Technologies Co.Ltd",
"value": [
14,
2,
1,
1
]
},
...
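The wrapper only needs an object with an `all()` method, so it can be tried against a hand-built stand-in for the crossfilter group. In this sketch `flatStub` and its data are hypothetical; the zero entry imitates a MAC address whose rows have all been filtered out.

```javascript
// Same fake-group wrapper as in the answer above.
function flatten_object_group(group) {
  return {
    all: function () {
      return group.all().map(function (kv) {
        return {
          key: kv.key,
          value: Object.values(kv.value).filter(function (v) { return v > 0; })
        };
      });
    }
  };
}

// Hypothetical stand-in for vendorMacCountsGroup.
var flatStub = {
  all: function () {
    return [
      { key: "Asustek Computer Inc.",
        value: { "1c:b7:2c:48": 8, "1c:b7:be:ef": 0 } },
      { key: "Huawei Technologies Co.Ltd",
        value: { "dc:d9:16:3d": 14, "dc:da:16:3d": 2 } }
    ];
  }
};

var flattened = flatten_object_group(flatStub).all();
console.log(JSON.stringify(flattened));
```

The filtered-out MAC address contributes nothing to its vendor's array, so it does not drag the box for that vendor toward zero.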
Your sample data only had one MAC address per vendor, so I added some bogus data, and got a decent-looking boxplot:
Fork of your fiddle.
Taking only the top ten by #MACs
As an example of how you might trim the data if there are too many boxes, here's how you would sort by number of MAC addresses, and take only the 10 "most popular" vendors:
function top_ten_by_length(group) {
return {
all: function() {
return group.all().slice().sort(function(a,b) { // copy first: sort() mutates in place
return b.value.length - a.value.length;
}).slice(0, 10);
}
};
}
Compose them like this:
var boxPlotGroup = top_ten_by_length(flatten_object_group(vendorMacCountsGroup));
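With fewer than ten vendors the trimming has no visible effect, so a variant with a configurable limit is easier to try out. In this sketch `top_n_by_length`, its `n` parameter, and `sizeStub` are hypothetical and not part of the code above; the sort runs on a copy so that the source array is left untouched.

```javascript
// Hypothetical generalisation of top_ten_by_length with a limit n.
function top_n_by_length(group, n) {
  return {
    all: function () {
      // slice() copies the array before sorting, since sort() works in place.
      return group.all().slice().sort(function (a, b) {
        return b.value.length - a.value.length;
      }).slice(0, n);
    }
  };
}

// Hypothetical stand-in for the flattened group: key -> array of counts.
var sizeStub = {
  all: function () {
    return [
      { key: "A", value: [1] },
      { key: "B", value: [3, 2, 5] },
      { key: "C", value: [4, 4] }
    ];
  }
};

var top2 = top_n_by_length(sizeStub, 2).all();
var topKeys = top2.map(function (kv) { return kv.key; });
console.log(topKeys);
```

Here the two vendors with the most MAC addresses are kept, ordered by the number of devices rather than by the size of their counts.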
This is off the top of my head and untested so please edit/comment if there is some glitch.
Source: https://stackoverflow.com/questions/51639127/dc-js-box-plot-reducer-using-two-groups