Given an array of arrays, what would be the efficient way of identifying the duplicate item?
var array = [
[
11.31866455078125,
44.53836644772605
Here's an approach that uses uniqWith(), and difference():
_.indexOf(array, _.head(_.difference(array, _.uniqWith(array, _.isEqual))));
The basic idea is:
uniqWith()
to remove the duplicates from array
.difference()
to compare array
against the duplicate-free version. This gets us an array of the duplicates.1
.However, if you need the index of the original, and not it's duplicate, we have to make some adjustments:
var duplicate = _.head(_.difference(array, _.uniqWith(array, _.isEqual)));
_.findIndex(array, _.unary(_.partial(_.isEqual, duplicate)));
We're still using uniqWith()
, and difference()
to find the duplicate
. But now, we're using findIndex() to get the index. The reason is that we need to use isEqual() to find the first position of the duplicate, not the second. We construct the predicate using partial() and unary(). The result this time is 0
.
You can just use plain javascript to do that, it's not that hard, here is my implementation
for (let i = 0; i < array.length; i++) {
for (let j = i + 1; j < array.length; j++) {
// quick elimination by comparing sub-array lengths
if (array[i].length !== array[j].length) {
continue;
}
// look for dupes
var dupe = true;
for (var k = 0; k < array[i].length; k++) {
if (array[i][k] !== array[j][k]) {
dupe = false;
break;
}
}
// if a dupe then print
if (dupe) {
console.debug("%d is a dupe", j);
}
}
}
The nice part about this implementation is that it will print you multiple times that an array at an index is a dupe for multiple dupes, you can use that fact to count your dupes in each index!
This is actually a very efficient way to do this because the inner for
loop (j
) always runs from the next position of the outer loop (i
). so you half your check count.
And here is a plunk
I don't know how to do this other than to just write the algorithm yourself. Both this answer and the other posted ones aren't very efficient but should be fine:
function findIndex(array, startingIndex, value) {
var predicate = _.partial(_.isEqual, value);
var arraySubset = array.slice(startingIndex+1);
var index = arraySubset.findIndex(predicate);
return index === -1 ? index : index+startingIndex+1;
}
function findDuplicates(array) {
return array.map((value, index) => {
return {
value,
index: findIndex(array, index, value)
};
}).filter(info => info.index !== -1);
}
findDuplicates([1, 2, 3, 4, 1, [ 3 ], [ 4 ], [ 3 ] ]);
// [ { value: 1, index: 4 }, { value: [ 3 ], index: 7 } ] // [ { value: 1, index: 4 }, { value: [ 3 ], index: 7 } ]
This basically creates a map of the array, calling .findIndex() on the remainder of the array, noting down the index of any duplicates, returning information about every item that has a duplicate and what the index of the duplicate is.
One nice thing about this is that it will work for triplicates or any amount of occurrences of a value.
I believe constructing a LUT is one of the most efficient ways when it comes to making comparisons. The following method constructs a LUT by utilizing Array.prototype.reduce()
and eventually mutates the original array by removing not only one but all duplicate elements regardless of how many there are.
var arr = [
[
11.31866455078125,
44.53836644772605
],
[
11.31866455078125,
44.53836644772605
],
[
11.371536254882812,
44.53836644772605
],
[
11.371536254882812,
44.50140292110874
]
];
arr.reduce((p,c,i)=> { var prop = c[0]+"" + c[1]+"";
p[prop] === void 0 ? p[prop] = i : p.dups.push(i);
return p;
},{dups:[]}).dups.reverse().forEach( i => arr.splice(i,1))
document.write('<pre>' + JSON.stringify(arr, 0, 2) + '</pre>');
However if you would like to have a new array by keeping the original then obviously it would be much faster procedure.
Lodash gives a lot of useful functions to achieve finding the first duplicate index.
Using the _.findIndex() and _.isEqual() the following code will find the first duplicate index:
var duplicateIndex = _.findIndex(array, function(value, index, collection) {
var equal = _.isEqual.bind(undefined, value);
return _.findIndex(collection.slice(0, index), equal) !== -1;
});
or a bit faster but more verbose:
var duplicateIndex = _.findIndex(array, function(value, index, collection) {
var equal = _.isEqual.bind(undefined, value);
return _.findIndex(collection, function(val, ind) {
return ind < index && equal(val);
}) !== -1;
});
Notice that if no duplicate exists, -1
will be returned.
In a few words the algorithm iterates through array and looks back if the current element does not exist already. If it does, just return the current iteration index.
Please check the working demo.