Can someone explain in simple terms how the reduce function, with its arguments reduceAdd, reduceSum, and reduceRemove, works in crossfilter?
Remember that map-reduce reduces a dataset by the keys of a particular dimension. For example, let's use a crossfilter instance with these records:
[
{ name: "Gates", age: 57, worth: 72000000000, gender: "m" },
{ name: "Buffet", age: 59, worth: 58000000000, gender: "m" },
{ name: "Winfrey", age: 83, worth: 2900000000, gender: "f" },
{ name: "Bloomberg", age: 71, worth: 31000000000, gender: "m" },
{ name: "Walton", age: 64, worth: 33000000000, gender: "f" },
]
and dimensions name, age, worth, and gender. We will group the records by the gender dimension and reduce that grouping using the reduce method.
First we define the reduceAdd, reduceRemove, and reduceInitial callback methods.
reduceInitial returns an object with the form of the reduced object and the initial values. It takes no parameters.
function reduceInitial() {
  return {
    worth: 0,
    count: 0
  };
}
reduceAdd defines what happens when a record is being 'filtered into' the reduced object for a particular key. The first parameter is a transient instance of the reduced object. The second parameter is the current record. The method returns the augmented transient reduced object.
function reduceAdd(p, v) {
  p.worth = p.worth + v.worth;
  p.count = p.count + 1;
  return p;
}
reduceRemove does the opposite of reduceAdd (at least in this example). It takes the same parameters as reduceAdd. It is needed because group reductions are updated as records are filtered, and sometimes records need to be removed from a previously computed group reduction.
function reduceRemove(p, v) {
  p.worth = p.worth - v.worth;
  p.count = p.count - 1;
  return p;
}
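To see why both callbacks are needed, here is a minimal plain-JavaScript sketch. It does not use crossfilter, and the Map-based bookkeeping and the snapshot helper are my own illustration, not the library's internals. It builds one reduced object per gender, then simulates an age filter by calling reduceRemove for the records that drop out and reduceAdd when they come back.

```javascript
// Plain-JavaScript illustration only; crossfilter's real internals differ.
const records = [
  { name: "Gates", age: 57, worth: 72000000000, gender: "m" },
  { name: "Buffet", age: 59, worth: 58000000000, gender: "m" },
  { name: "Winfrey", age: 83, worth: 2900000000, gender: "f" },
  { name: "Bloomberg", age: 71, worth: 31000000000, gender: "m" },
  { name: "Walton", age: 64, worth: 33000000000, gender: "f" },
];

function reduceInitial() {
  return { worth: 0, count: 0 };
}
function reduceAdd(p, v) {
  p.worth = p.worth + v.worth;
  p.count = p.count + 1;
  return p;
}
function reduceRemove(p, v) {
  p.worth = p.worth - v.worth;
  p.count = p.count - 1;
  return p;
}

// Hypothetical bookkeeping: one reduced object per gender key.
const groups = new Map();
for (const r of records) {
  const current = groups.has(r.gender) ? groups.get(r.gender) : reduceInitial();
  groups.set(r.gender, reduceAdd(current, r));
}

// Hypothetical helper: copy the current reduced values into a plain object.
function snapshot(map) {
  const out = {};
  for (const [key, value] of map) out[key] = { ...value };
  return out;
}

const unfiltered = snapshot(groups);

// Simulate a filter "age < 65" on another dimension: records that stop
// matching are removed from their group's reduced value.
const dropped = records.filter((r) => r.age >= 65);
for (const r of dropped) {
  groups.set(r.gender, reduceRemove(groups.get(r.gender), r));
}
const filtered = snapshot(groups);

// Clearing the filter adds the same records back.
for (const r of dropped) {
  groups.set(r.gender, reduceAdd(groups.get(r.gender), r));
}
const restored = snapshot(groups);

console.log(unfiltered, filtered, restored);
```

The point of the sketch is that nothing is recomputed from scratch: each filter change only touches the records that entered or left the filter.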
Invoking the reduce method would look like this:
mycf.dimensions.gender.group().reduce(reduceAdd, reduceRemove, reduceInitial)
To take a peek at the reduced values, use the all() method. To see the top n values, use the top(n) method.
mycf.dimensions.gender.group().reduce(reduceAdd, reduceRemove, reduceInitial).all()
The returned array should look like this (groups are returned in ascending key order):
[
{ key: "f", value: { worth: 35900000000, count: 2 } },
{ key: "m", value: { worth: 161000000000, count: 3 } },
]
The goal of reducing a dataset is to derive a new dataset by first grouping records by common keys, then reducing each of those groupings into a single value for each key. In this case we grouped by gender and reduced each grouping by summing the worth, and counting the records, of the records that shared the same key.
The other reduceX methods are convenience methods for the reduce method.
For this example, reduceSum would be the most appropriate replacement.
mycf.dimensions.gender.group().reduceSum(function(d) {
  return d.worth;
});
Invoking all() on the returned grouping should look like:
[
{ key: "f", value: 35900000000 },
{ key: "m", value: 161000000000 },
]
reduceCount will count the records in each group:
mycf.dimensions.gender.group().reduceCount();
Invoking all() on the returned grouping should look like:
[
{ key: "f", value: 2 },
{ key: "m", value: 3 },
]
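As a rough sketch of what "convenience method" means here, reduceSum and reduceCount can be thought of as reduce with prebuilt callbacks. The helpers sumCallbacks and countCallbacks below are hypothetical names made up for illustration and are not part of crossfilter; the sketch runs in plain JavaScript on the two "f" records from the example.

```javascript
// Hypothetical helpers (not crossfilter API): the callback triples that a
// sum reduction and a count reduction would need.
function sumCallbacks(valueOf) {
  return {
    add: (p, v) => p + valueOf(v),
    remove: (p, v) => p - valueOf(v),
    initial: () => 0,
  };
}

function countCallbacks() {
  return {
    add: (p) => p + 1,
    remove: (p) => p - 1,
    initial: () => 0,
  };
}

const women = [
  { name: "Winfrey", worth: 2900000000 },
  { name: "Walton", worth: 33000000000 },
];

const sum = sumCallbacks((d) => d.worth);
const count = countCallbacks();

// Fold the records in one at a time, the way a group would as they pass the filters.
let worthTotal = sum.initial();
let recordTotal = count.initial();
for (const d of women) {
  worthTotal = sum.add(worthTotal, d);
  recordTotal = count.add(recordTotal, d);
}

console.log(worthTotal, recordTotal); // 35900000000 2
```

With a real crossfilter group, a triple like this is what you would pass to reduce(add, remove, initial) if you chose not to use the shorthand.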
Hope this helps :)
Source: https://github.com/square/crossfilter/wiki/API-Reference
http://blog.rusty.io/2012/09/17/crossfilter-tutorial/
var livingThings = crossfilter([
  // Fact data.
  { name: "Rusty", type: "human", legs: 2 },
  { name: "Alex", type: "human", legs: 2 },
  { name: "Lassie", type: "dog", legs: 4 },
  { name: "Spot", type: "dog", legs: 4 },
  { name: "Polly", type: "bird", legs: 2 },
  { name: "Fiona", type: "plant", legs: 0 }
]);
For example, how many living things are in my house?
To do this, we’ll call the groupAll convenience function, which selects all records into a single group, and then the reduceCount function, which creates a count of the records.
// How many living things are in my house?
var n = livingThings.groupAll().reduceCount().value();
console.log("There are " + n + " living things in my house.") // 6
Now let’s get a count of all the legs in my house. Again, we’ll use the groupAll function to get all records in a single group, but then we call the reduceSum function. This is going to sum values together. What values? Well, we want legs, so let’s pass a function that extracts and returns the number of legs from the fact.
// How many total legs are in my house?
var legs = livingThings.groupAll().reduceSum(function(fact) {
  return fact.legs;
}).value();
console.log("There are " + legs + " legs in my house."); // 14
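For comparison, the same two totals can be sketched in plain JavaScript with Array.prototype.reduce. No crossfilter is involved and the variable names are mine. Since groupAll puts every record into a single group, one fold over the whole array gives the same numbers.

```javascript
// Plain-JavaScript comparison; this is not how crossfilter computes the values.
const facts = [
  { name: "Rusty", type: "human", legs: 2 },
  { name: "Alex", type: "human", legs: 2 },
  { name: "Lassie", type: "dog", legs: 4 },
  { name: "Spot", type: "dog", legs: 4 },
  { name: "Polly", type: "bird", legs: 2 },
  { name: "Fiona", type: "plant", legs: 0 },
];

// Counterpart of groupAll().reduceCount().value(): add 1 per record.
const thingCount = facts.reduce((count) => count + 1, 0);

// Counterpart of groupAll().reduceSum(...).value(): add the extracted value per record.
const legCount = facts.reduce((sum, fact) => sum + fact.legs, 0);

console.log("There are " + thingCount + " living things in my house."); // 6
console.log("There are " + legCount + " legs in my house."); // 14
```

The difference is that the plain version recomputes everything on each call, whereas a crossfilter group keeps its value up to date as filters change.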
The reduceCount function creates a count of the records.
The reduceSum function returns the sum of the values extracted from these records.