I'm adding a function to my personal toolkit library to do simple CSV-to-HTML-table conversion.
I would like the smallest possible piece of code to do this in C#, and it needs to handle CSV files in excess of ~500 MB.
So far my two contenders are:
- splitting the CSV into arrays by delimiter and building the HTML output
- search-and-replace of the delimiters with table th/tr/td tags
Assume that the file/read/disk operations are already handled, i.e., I'm passing a string containing the contents of the CSV into this function. The output will be plain, style-free HTML markup, and yes, the data may contain stray commas and line breaks.
Update: some folks asked. 100% of the CSV I deal with comes straight out of Excel, if that helps.
Example string:
a1,b1,c1\r\n a2,b2,c2\r\n
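Read all lines into memory:
    // Loads the whole file into memory; simple, but costly for ~500 MB inputs.
    var lines = File.ReadAllLines(args[0]);
    // CreateText overwrites the output; AppendText would add to an existing file.
    using (var outfs = File.CreateText(args[1]))
    {
        outfs.Write("<html><body><table>");
        foreach (var line in lines)
            outfs.Write("<tr><td>" + string.Join("</td><td>", line.Split(',')) + "</td></tr>");
        outfs.Write("</table></body></html>");
    }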
Or read one line at a time:
    using (var inFs = File.OpenText(args[0]))
    // CreateText overwrites the output; AppendText would add to an existing file.
    using (var outfs = File.CreateText(args[1]))
    {
        outfs.Write("<html><body><table>");
        // Only one line is held in memory at a time.
        while (!inFs.EndOfStream)
            outfs.Write("<tr><td>" + string.Join("</td><td>", inFs.ReadLine().Split(',')) + "</td></tr>");
        outfs.Write("</table></body></html>");
    }
@Jimmy: I created an extended version using LINQ. Here is the highlight, with lazy evaluation for line reading (Load() and Write() are custom extension methods from that version, not shown here):
    using (var lp = args[0].Load())
        lp.Select(l => "<tr><td>" + string.Join("</td><td>", l.Split(',')) + "</td></tr>")
          .Write("<html><body><table>", "</table></body></html>", args[1]);
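Since the Load() and Write() helpers are not shown, here is a minimal sketch of the same lazy idea using only framework calls. The class and method names (LazyCsv, ToHtmlLines, ConvertFile) are made up for illustration, and it assumes .NET 4 or later for File.ReadLines. Like the snippet above, it splits naively on commas and does not handle quoted fields.

```csharp
using System.Collections.Generic;
using System.IO;

public static class LazyCsv
{
    // Lazily maps each CSV line to one table row. Nothing is read from
    // csvLines until the returned sequence is enumerated.
    public static IEnumerable<string> ToHtmlLines(IEnumerable<string> csvLines)
    {
        yield return "<html><body><table>";
        foreach (var l in csvLines)
            yield return "<tr><td>" + string.Join("</td><td>", l.Split(',')) + "</td></tr>";
        yield return "</table></body></html>";
    }

    // Hypothetical file-to-file wrapper: File.ReadLines yields lines on demand,
    // and File.WriteAllLines writes them out as they are produced.
    public static void ConvertFile(string inPath, string outPath)
    {
        File.WriteAllLines(outPath, ToHtmlLines(File.ReadLines(inPath)));
    }
}
```

Calling LazyCsv.ConvertFile(args[0], args[1]) would then replace the using block above, with only one line held in memory at a time.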
You probably can't get much shorter than this, but remember that any real solution would handle quotes, commas inside quotes, and conversion to HTML entities.
    return "<table><tr><td>" + s
        .Replace("\n", "</td></tr><tr><td>")
        .Replace(",", "</td><td>") + "</td></tr></table>";
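EDIT: here's a (largely untested) addition of HtmlEncode and quote matching. I HTML-encode first, then turn every comma outside quotes into '<' (which can't collide, because any existing '<' has already been encoded). Since HtmlEncode also turns '"' into &quot;, the quotes are restored before the matching pass.
    // Needs: using System.Linq; using System.Web;
    bool q = false; // true while inside a quoted field
    return "<table><tr><td>"
        + new string(HttpUtility.HtmlEncode(s)
            .Replace("&quot;", "\"") // undo quote encoding so the toggle below can see quotes
            .Select(c => c == '"' ? ((q = !q) ? c : c) : (c == ',' && !q) ? '<' : c).ToArray())
            .Replace("<", "</td><td>")
            .Replace("\n", "</td></tr><tr><td>")
        + "</td></tr></table>";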
Here's a fun version using lambda expressions. It's not as short as replacing commas with "</td><td>", but it has its own special charm:
    var r = new StringBuilder("<table>");
    // string.Concat joins the cells; appending the Select result directly would append its type name.
    s.Split('\n').ToList().ForEach(t => r.Append("<tr>").Append(string.Concat(t.Split(',').Select(u => "<td>" + u + "</td>"))).Append("</tr>"));
    return r.Append("</table>").ToString();
If I were to write this for production, I'd use a state machine to track nested quotes, newlines, and commas, because Excel can put newlines in the middle of a column. IIRC you can also specify a different delimiter entirely.
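As a rough illustration of that state-machine idea, here is a minimal sketch rather than a production parser. It assumes Excel-style quoting, where a field may be wrapped in double quotes and a doubled quote inside a quoted field stands for a literal quote. The names CsvToHtml, WriteTable, and WriteCell are invented for this example, and WebUtility.HtmlEncode assumes .NET 4 or later.

```csharp
using System.IO;
using System.Net;
using System.Text;

public static class CsvToHtml
{
    // Reads CSV one character at a time and writes an HTML table.
    // The only state is whether we are currently inside a quoted field.
    public static void WriteTable(TextReader input, TextWriter output, char delimiter = ',')
    {
        var cell = new StringBuilder();
        bool inQuotes = false; // inside a quoted field?
        bool rowOpen = false;  // has <tr> been written for the current row?
        int ch;

        output.Write("<table>");
        while ((ch = input.Read()) != -1)
        {
            char c = (char)ch;
            if (inQuotes)
            {
                if (c != '"') cell.Append(c); // delimiters and newlines are data here
                else if (input.Peek() == '"') { input.Read(); cell.Append('"'); } // "" is a literal quote
                else inQuotes = false; // closing quote
            }
            else if (c == '"') inQuotes = true;
            else if (c == delimiter) WriteCell(output, cell, ref rowOpen);
            else if (c == '\n')
            {
                WriteCell(output, cell, ref rowOpen);
                output.Write("</tr>");
                rowOpen = false;
            }
            else if (c != '\r') cell.Append(c); // drop the CR of a CRLF line ending
        }

        // Flush a final row that was not terminated by a newline.
        if (rowOpen || cell.Length > 0)
        {
            WriteCell(output, cell, ref rowOpen);
            output.Write("</tr>");
        }
        output.Write("</table>");
    }

    // Writes the buffered cell text, HTML-encoded, and opens the row if needed.
    private static void WriteCell(TextWriter output, StringBuilder cell, ref bool rowOpen)
    {
        if (!rowOpen) { output.Write("<tr>"); rowOpen = true; }
        output.Write("<td>");
        output.Write(WebUtility.HtmlEncode(cell.ToString()));
        output.Write("</td>");
        cell.Clear();
    }
}
```

Because it works on a TextReader and TextWriter, the same method can take a StringReader over an in-memory string or a StreamReader/StreamWriter pair for large files, and the delimiter parameter covers the case of a non-comma separator.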
Source: https://stackoverflow.com/questions/966889/codegolf-convert-csv-to-html-table-with-smallest-amount-of-code-in-c-sharp