I'm trying to figure out how to filter out duplicates in a comma-separated string with a regular expression. I'd like to do this in JavaScript, but I…
With a JavaScript regex:
var x = "1,1,1,2,2,3,3,3,3,4,4,4,5";
// Repeat until no adjacent pair of identical digits is left
while (/(\d),\1/.test(x))
    x = x.replace(/(\d),\1/g, "$1");
// x is now "1,2,3,4,5"
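x = "a,b,b,said,said, t, u, ugly, ugly";
// Same idea for arbitrary items; (?=,|$) makes sure the repeated item ends
// at a comma or at the end of the string
while (/\s*([^,]+),\s*\1(?=,|$)/.test(x))
    x = x.replace(/\s*([^,]+),\s*\1(?=,|$)/g, "$1");
// x is now "a,b,said, t, u,ugly"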
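Not well tested; let me know if there are any issues. One known limitation: neither pattern is anchored to the start of an item, so the second one, for example, reduces "ab,b" to "ab".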
Why use a regex when you can do it in plain JavaScript code? Here is some sample code (messy though):
var input = 'a,b,b,said,said, t, u, ugly, ugly';
var splitted = input.split(',');
var collector = {};
var i, key;
for (i = 0; i < splitted.length; i++) {
    // Trim leading and trailing whitespace from each token
    key = splitted[i].replace(/^\s+|\s+$/g, '');
    // Using the token as a property name discards repeats
    collector[key] = true;
}
var out = [];
for (key in collector) {
    if (collector.hasOwnProperty(key)) {
        out.push(key);
    }
}
var output = out.join(','); // output will be 'a,b,said,t,u,ugly'
P.S. The regex inside the for loop only trims the tokens; it does not make them unique.
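On a runtime that supports ES2015, the same split-and-collect idea can be written more compactly with a Set. This is only a sketch: the helper name uniqueCsv is made up for illustration, and it assumes that Set, Array.from and String.prototype.trim are available.

```javascript
// Hypothetical helper (not from the answer above); assumes an ES2015+ runtime.
function uniqueCsv(input) {
  // Split on commas and trim the whitespace around every token
  const tokens = input.split(",").map((token) => token.trim());
  // A Set keeps only the first occurrence of each value, in insertion order
  return Array.from(new Set(tokens)).join(",");
}

console.log(uniqueCsv("a,b,b,said,said, t, u, ugly, ugly")); // a,b,said,t,u,ugly
```

A Set also keeps numeric-looking tokens in their original order, so "3,1,2,1" becomes "3,1,2", whereas current engines enumerate integer-like keys of a plain object in ascending numeric order.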
If you insist on RegExp, here's an example in JavaScript:
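"1,1,1,2,2,3,3,3,3,4,4,4,5".replace (
    // (?=,|$) checks for the next delimiter without consuming it, so the
    // following item can still match its own leading comma
    /(^|,)([^,]+)(?:,\2)+(?=,|$)/ig,
    function ($0, $1, $2)
    {
        return $1 + $2;
    }
); // "1,2,3,4,5"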
To handle trimming of whitespace, modify slightly:
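"1,1,1,2,2,3,3,3,3,4,4,4,5".replace (
    // [^,]+? is lazy so that the \s* around it can absorb the padding;
    // (?=,|$) leaves the next delimiter in place for the following item
    /(^|,)\s*([^,]+?)\s*(?:,\s*\2\s*)+(?=,|$)/ig,
    function ($0, $1, $2)
    {
        return $1 + $2;
    }
);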
That said, it seems better to tokenise the string via split and then handle the duplicates.
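As a rough sketch of that split-based alternative, the function below removes only adjacent repeats, which is what the regular expressions in this answer do. The name dropAdjacentDuplicates is hypothetical, and the sketch assumes an ES5 runtime (Array.prototype.map and filter, String.prototype.trim).

```javascript
// Hypothetical helper: split, trim, then drop items equal to their predecessor.
function dropAdjacentDuplicates(csv) {
  return csv
    .split(",")
    .map(function (item) { return item.trim(); })
    .filter(function (item, index, items) {
      // Keep the first item, and any item that differs from the one before it
      return index === 0 || item !== items[index - 1];
    })
    .join(",");
}

console.log(dropAdjacentDuplicates("1,1,1,2,2,3,3,3,3,4,4,4,5")); // 1,2,3,4,5
```

Because only neighbours are compared, "a,b,a" is left unchanged; collecting items in a lookup structure instead would remove repeats regardless of position.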
I don't use Regular Expressions for that.
Here's the function I use. It accepts a string containing comma-separated values and returns an array of unique values regardless of their position in the original string.
Note: if you pass a CSV string containing quoted values, split will not treat commas inside the quoted values any differently. So if you want to handle real CSV, you are better off using a third-party CSV parser.
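To illustrate that caveat, here is a small sketch with a made-up input in which the second field is quoted and contains a comma:

```javascript
// A quoted field that contains a comma is cut in two by a plain split.
var fields = 'a,"b,c",a'.split(",");
console.log(fields.length); // 4
// The four entries are: a | "b | c" | a
```

The quoted field comes back as two broken pieces, so neither piece would be recognised as the field the author of the CSV intended.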
function GetUniqueItems(s)
{
    var items = s.split(",");
    var uniqueItems = {};
    for (var i = 0; i < items.length; i++)
    {
        var key = items[i];
        var val = items[i];
        uniqueItems[key] = val;
    }
    var result = [];
    for (key in uniqueItems)
    {
        // Assign to output result field using hasOwnProperty so we only get
        // relevant items
        if (uniqueItems.hasOwnProperty(key))
        {
            result[result.length] = uniqueItems[key];
        }
    }
    return result;
}
Here's an example:
s/,([^,]+),\1/,$1/g;
This is a Perl regex substitution, but it should be convertible to JavaScript by anyone who knows the syntax.
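One possible JavaScript rendering is sketched below. The function name collapseRepeats is hypothetical, and the sketch makes three assumptions beyond a literal translation:

- The pattern needs a leading comma, so one is temporarily prefixed to cover the first item.
- A single global pass handles items pairwise, so the replacement is repeated until the string stops changing.
- A (?=,|$) lookahead is added so that "1" is not matched against the start of "11".

```javascript
// Sketch of a JavaScript version of the Perl substitution s/,([^,]+),\1/,$1/g.
// The lookahead (?=,|$) is an addition, not part of the Perl original.
function collapseRepeats(csv) {
  var current = "," + csv; // give the first item a leading comma
  var previous;
  do {
    previous = current;
    current = previous.replace(/,([^,]+),\1(?=,|$)/g, ",$1");
  } while (current !== previous); // stop once a pass changes nothing
  return current.slice(1); // drop the comma added above
}

console.log(collapseRepeats("1,1,1,2,2,3,3,3,3,4,4,4,5")); // 1,2,3,4,5
```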