I want to be able to convert a CSV to JSON. The CSV comes in as free text like this (with the newlines):
name,age,booktitle
John,2,Hello World
Mary,3,\"\"Ala
My first guess is to use a regular expression. You can try this one I've just whipped up (regex101 link):
/(?:[\t ]?)+("+)?(.*?)\1(?:[\t ]?)+(?:,|$)/gm
This can be used to extract fields, so you can grab the headers with it as well. The first capture group is an optional quote-grabber that the backreference matches again at the end of the field, so the actual data is in capture group 2 of each match returned by matchAll(regex). A filter cuts off the last match in all cases, since allowing for blank fields (f1,,f3) puts a zero-width match at the end of every line. This was easier to get rid of in JavaScript than in the regex.
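To see the field extraction on its own, here is a minimal sketch that runs the same pattern against single lines. The names fieldRegex and extractFields are hypothetical and only used for illustration; the sketch assumes the default quote character (") and delimiter (,).

```javascript
// Same pattern as above, hard-coded for " as the quote and , as the delimiter.
const fieldRegex = /(?:[\t ]?)+("+)?(.*?)\1(?:[\t ]?)+(?:,|$)/gm;

// Hypothetical helper: returns the field values of one CSV line.
const extractFields = (line) => {
  const matches = [...line.matchAll(fieldRegex)];
  // Capture group 2 holds the field text; the final match is the
  // zero-width one at the end of the line, so it is dropped.
  return matches.slice(0, -1).map((m) => m[2]);
};

console.log(extractFields('John,,Hello World'));
// → [ 'John', '', 'Hello World' ]
console.log(extractFields('"Donaldson Jones" , six, "Hello, friend!"'));
// → [ 'Donaldson Jones', 'six', 'Hello, friend!' ]
```

Blank fields come back as empty strings here; turning them into null or numbers is a separate step.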
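const csvToJson = (str, headerList, quotechar = '"', delimiter = ',') => {
  // Drops the final match: the pattern always leaves a zero-width match at the end of a line.
  const cutlast = (_, i, a) => i < a.length - 1;
  // const regex = /(?:[\t ]?)+("+)?(.*?)\1(?:[\t ]?)+(?:,|$)/gm; // no variable chars
  // quotechar and delimiter are inserted unescaped, so regex metacharacters (e.g. |) must be escaped by the caller.
  const regex = new RegExp(`(?:[\\t ]?)+(${quotechar}+)?(.*?)\\1(?:[\\t ]?)+(?:${delimiter}|$)`, 'gm');
  const lines = str.split('\n');
  // Take capture group 2 for the header names; match() with the g flag would
  // return the whole matches, trailing delimiter included.
  const headers = headerList || [...lines.splice(0, 1)[0].matchAll(regex)].filter(cutlast).map((m) => m[2]);
  const list = [];
  for (const line of lines) {
    const val = {};
    for (const [i, m] of [...line.matchAll(regex)].filter(cutlast).entries()) {
      // Convert to Number where possible (including 0), and use null if blank
      const num = Number(m[2]);
      val[headers[i]] = (m[2].length > 0) ? (Number.isNaN(num) ? m[2] : num) : null;
    }
    list.push(val);
  }
  return list;
}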
const testString = `name,age,booktitle
John,,Hello World
Mary,3,""Alas, What Can I do?""
Joseph,5,"Waiting, waiting, waiting"
"Donaldson Jones" , six, "Hello, friend!"`;
console.log(csvToJson(testString));
console.log(csvToJson(testString, ['foo', 'bar', 'baz']));
As a bonus, the function accepts an optional list of strings to use as the headers instead, since I know firsthand that not all CSV files have a header row. Note that when a header list is passed, the first line of the input is treated as data rather than skipped.
PS: If you don't like my regex then you can check out this much more complex one that adheres to the CSV standard, instead of just grabbing everything.
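If a regex feels too loose, another option is a small character-by-character parser. The sketch below is not the regex referred to above; it is a minimal illustration that assumes RFC 4180-style quoting, where a field may be wrapped in double quotes and a doubled quote inside a quoted field stands for one literal quote. The names parseCsv and rowsToObjects are hypothetical. Unlike the regex version, it does not trim whitespace around quoted fields and leaves every value as a string.

```javascript
// Hypothetical helper: splits CSV text into rows of string fields.
const parseCsv = (text, delimiter = ',') => {
  const rows = [];
  let row = [];
  let field = '';
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // doubled quote inside a quoted field: one literal quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // delimiters and newlines are literal inside quotes
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === delimiter) {
      row.push(field);
      field = '';
    } else if (ch === '\n') {
      row.push(field);
      rows.push(row);
      row = [];
      field = '';
    } else if (ch !== '\r') {
      field += ch; // carriage returns outside quotes are skipped
    }
  }
  // Flush the last row when the text does not end with a newline.
  if (field.length > 0 || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
};

// Hypothetical helper: treats the first row as headers and zips the rest into objects.
const rowsToObjects = ([headers, ...rows]) =>
  rows.map((r) => Object.fromEntries(headers.map((h, i) => [h, r[i] ?? null])));

const sample = 'name,age,booktitle\nJoseph,5,"Waiting, waiting, waiting"\nMary,3,"She said ""hi"""';
console.log(rowsToObjects(parseCsv(sample)));
```

For the sample above, the quoted comma stays inside the booktitle field and the doubled quotes collapse to single ones.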