1) As we know, there's no side effect with map() and reduce(). Nowadays, we also have multi-core CPUs on cell phones. So is it more efficient to use them?
2)On the other h
I just tested this today, using map and reduce over floating-point numbers with the latest Node.js version, and the answer is that map and reduce were two orders of magnitude slower than a regular for loop.
var r = array.map(x => x*x).reduce( (total,num) => total+num,0);
~11,000ms
var r = 0.0;
array.forEach( (x,i,a) => r += x*x )
~300ms
var r = 0.0;
for (var j = 0; j < array.length; j++) {
    var x = array[j];
    r += x*x;
}
~35ms
EDIT: One should note that this difference is much smaller in Firefox, and may be much smaller in future versions of Node / Chrome as well.
As long as the array is small (on the order of tens of elements), there is not much difference in performance. When the array becomes very large, a conventional for loop is the better choice, because it only steps through the elements and reads the value at each index. The other methods do more per element: the callback is also passed the index and the array (in map, reduce and forEach) and the accumulator (in reduce). And because these methods need a callback function, each element costs a function call, which takes up stack memory and further reduces performance.
You can check this with the following script. Just compare the values logged to the console.
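To make the extra callback arguments mentioned above visible, here is a small sketch (the variable names are illustrative and not taken from the answers). It only records what each built-in passes to its callback; it does not measure how much those arguments cost.

```javascript
// Record how many arguments each built-in hands to its callback.
var sample = [10, 20, 30];

var mapArgCounts = [];
sample.map(function () {
    mapArgCounts.push(arguments.length); // value, index, array
    return 0;
});

var reduceArgCounts = [];
sample.reduce(function (acc) {
    reduceArgCounts.push(arguments.length); // accumulator, value, index, array
    return acc;
}, 0);

// A well-known consequence of the extra index argument:
// parseInt receives the index as its radix parameter.
var parsed = ['1', '2', '3'].map(parseInt);

console.log(mapArgCounts);    // [ 3, 3, 3 ]
console.log(reduceArgCounts); // [ 4, 4, 4 ]
console.log(parsed);          // [ 1, NaN, NaN ]
```

A plain for loop passes none of this; it just reads `sample[i]`.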
var scripts=[];
// GLOBAL variables declaration
var a=[];
function preload() {
    for(var i=0;i<100000;i++) a[i]=i;
}
preload();
// TEST function 0
scripts.push(function() {
    var sum=0;
    a.forEach(function(v) {
        sum+=v;
    });
    //console.log(sum);
});
// TEST function 1
scripts.push(function() {
    a.reduce(function(acc,v) {
        return acc+v;
    });
});
// TEST function 2
scripts.push(function() {
    var sum=0;
    for(var i=0;i<a.length;i++) {
        sum+=a[i];
    }
});
// EVALUATION
scripts.forEach(function(f,index) {
    var date=new Date();
    for(var i=0;i<10000;i++) {
        f();
    }
    console.log("call "+index+" "+(new Date()-date));
});
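One caveat with the script above is that the test functions throw their sums away, which in principle gives an optimizing engine room to skip work. A slightly more defensive sketch follows, assuming a hypothetical helper `timeIt` (not part of any library) that warms up first and returns the computed value so it is actually used. The run count of 100 is arbitrary, and timings will vary by engine and version.

```javascript
// Hypothetical helper: time `fn` over `runs` calls after a short warm-up.
// Returning the last result makes it less likely the work is optimized away.
function timeIt(label, fn, runs) {
    var now = (typeof performance !== 'undefined')
        ? function () { return performance.now(); }
        : function () { return Date.now(); };
    var result;
    for (var w = 0; w < 10; w++) result = fn(); // warm-up calls
    var start = now();
    for (var i = 0; i < runs; i++) result = fn();
    var elapsed = now() - start;
    console.log(label + ': ' + elapsed.toFixed(1) + ' ms');
    return { elapsed: elapsed, result: result };
}

var data = [];
for (var k = 0; k < 100000; k++) data[k] = k;

var viaForEach = timeIt('forEach', function () {
    var sum = 0;
    data.forEach(function (v) { sum += v; });
    return sum;
}, 100);

var viaReduce = timeIt('reduce', function () {
    return data.reduce(function (acc, v) { return acc + v; }, 0);
}, 100);

var viaFor = timeIt('for loop', function () {
    var sum = 0;
    for (var i = 0; i < data.length; i++) sum += data[i];
    return sum;
}, 100);
```

All three variants should return the same sum; checking that before comparing timings guards against benchmarking code that does not do the same work.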
It is easily overlooked, but the key to getting the benefits of MapReduce is to:
A) Exploit the optimized shuffle. Often, your map and reduce functions can be implemented in a slow language; as long as the shuffle (the most expensive operation) is well optimized, it will still be fast and scalable.
B) Exploit the checkpointing functionality to recover from node failures (but hopefully, your CPU cores won't fail).
So in the end, map-reduce is actually neither about the map nor about the reduce function. It's about the framework around them, which will give you good performance even with bad "map" and "reduce" functions (unless you lose control over your data set size in the shuffle step!).
The gains to be obtained from doing a multi-threaded map-reduce on a single node are fairly low, and most likely there are much better ways of parallelizing your work for shared-memory architectures than map-reduce...
Unfortunately, there is a lot of hype (and too little understanding) surrounding MapReduce these days. If you look up the original paper, it goes into detail about "Backup Tasks", "Machine Failures" and "locality optimization" (none of which makes sense for an in-memory single-host use case).
Just because it has a "map" and a "reduce" doesn't make it a "mapreduce" yet.
It's only a MapReduce if it has an optimized shuffle, node crash and straggler recovery.
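To show where the shuffle sits between the two functions, here is a toy single-process sketch. The function names `mapPhase`, `shufflePhase` and `reducePhase` are made up for illustration. It has no distribution, no checkpointing and no straggler recovery, so by the definition above it is not a MapReduce; it only shows the data flow.

```javascript
// Map: apply the mapper to every record and collect the [key, value] pairs.
function mapPhase(records, mapper) {
    var pairs = [];
    records.forEach(function (record) {
        mapper(record).forEach(function (pair) { pairs.push(pair); });
    });
    return pairs;
}

// Shuffle: group all emitted values by key. In a real framework this is
// the step that moves data between machines, hence its cost.
function shufflePhase(pairs) {
    var groups = new Map();
    pairs.forEach(function (pair) {
        var key = pair[0], value = pair[1];
        if (!groups.has(key)) groups.set(key, []);
        groups.get(key).push(value);
    });
    return groups;
}

// Reduce: collapse each key's list of values into a single result.
function reducePhase(groups, reducer) {
    var out = {};
    groups.forEach(function (values, key) {
        out[key] = reducer(key, values);
    });
    return out;
}

// Word count over two input lines.
var lines = ['a b a', 'b c'];
var counts = reducePhase(
    shufflePhase(mapPhase(lines, function (line) {
        return line.split(' ').map(function (word) { return [word, 1]; });
    })),
    function (key, values) {
        return values.reduce(function (total, n) { return total + n; }, 0);
    }
);
console.log(counts); // { a: 2, b: 2, c: 1 }
```

Note that the mapper and reducer here are trivial; everything that would need to scale lives in the grouping step.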
1)
there's no side-effect with map() and reduce()
Well, you very well can implement map and reduce callbacks that have side effects. Nothing prevents it, and in the current state of JavaScript it's not even considered bad practice.
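As a minimal illustration (variable names are made up for this example), both callbacks below mutate state outside themselves and the engine accepts it without complaint:

```javascript
var seen = [];      // external state mutated by the map callback
var callCount = 0;  // external state mutated by the reduce callback

var doubled = [1, 2, 3].map(function (x) {
    seen.push(x);   // side effect: writes to an outer array
    return x * 2;
});

var total = doubled.reduce(function (acc, x) {
    callCount++;    // side effect: increments an outer counter
    return acc + x;
}, 0);

console.log(doubled);   // [ 2, 4, 6 ]
console.log(seen);      // [ 1, 2, 3 ]
console.log(callCount); // 3
console.log(total);     // 12
```

The contents of `seen` also depend on the callback being invoked in index order, which is the kind of dependency a side-effect-free callback would not have.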
2)
there's only 1 thread for js to execute on most of the browsers
There's only one thread in all of today's JS engines, even when they run server-side (in fact there can be more, but in isolation, not accessing the same array).
So the absence of side effects wouldn't make array operations parallelisable at all. No JS engine can do anything other than call the callback sequentially on standard arrays.
Note: as pointed out by zirak, there's the non-standard Mozilla ParallelArray, which could help with parallel execution. I don't know if there's something similar in V8.