Read an HTML file into a string variable in memory

Submitted by 我的未来我决定 on 2020-03-13 04:39:11


If I have an HTML file on disk, how can I read it all at once into a string variable at run time? I then need to do some processing on that string.

The HTML file looks something like this:

    <table cellspacing="0" cellpadding="0" rules="all" border="1" style="border-width:1px;border-style:solid;width:274px;border-collapse:collapse;">
        <COLGROUP><col width=35px><col width=60px><col width=60px><col width=60px><col width=59px></COLGROUP>
        <tr style="height:20px;">
            <th style="background-color:#A9C4E9;"></th><th align="center" valign="middle" style="color:buttontext;background-color:#D3DCE9;">A</th><th align="center" valign="middle" style="color:buttontext;background-color:#D3DCE9;">B</th><th align="center" valign="middle" style="color:buttontext;background-color:#D3DCE9;">C</th><th align="center" valign="middle" style="color:buttontext;background-color:#D3DCE9;">D</th>
        </tr><tr style="height:20px;">
            <th align="center" valign="middle" style="color:buttontext;background-color:#E4ECF7;">1</th><td align="left" valign="top" style="color:windowtext;background-color:window;">Hi</td><td align="left" valign="top" style="color:windowtext;background-color:window;">Cell Two</td><td align="left" valign="top" style="color:windowtext;background-color:window;">Actually a longer text</td><td align="left" valign="top" style="color:windowtext;background-color:window;">Final Word</td>


Use File.ReadAllText, passing the file location as the argument.

However, if your real goal is to parse the HTML, I would recommend using Html Agility Pack.


Use System.IO.File.ReadAllText(fileName)


string html = File.ReadAllText(path);
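
For completeness, here is a minimal sketch of that one-liner as a full console program. The class name, the helper ReadHtml, and the temp-file name sample.html are made up for illustration; the explicit Encoding.UTF8 argument is an assumption about how the file was saved and can be dropped to let the reader detect the encoding.

```csharp
using System;
using System.IO;
using System.Text;

class ReadHtmlExample
{
    // Hypothetical helper: reads the whole file into one string.
    public static string ReadHtml(string path)
    {
        // The encoding argument is optional; UTF-8 is assumed here.
        return File.ReadAllText(path, Encoding.UTF8);
    }

    static void Main()
    {
        // Hypothetical file written to the temp folder so the sketch runs anywhere.
        string path = Path.Combine(Path.GetTempPath(), "sample.html");
        File.WriteAllText(path, "<table><tr><td>Hi</td></tr></table>", Encoding.UTF8);

        string html = ReadHtml(path);

        // From here on it is ordinary string processing.
        Console.WriteLine(html.Contains("<td>Hi</td>"));
    }
}
```

File.ReadAllText opens the file, reads it to the end, and closes it again, so there is nothing to dispose afterwards.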


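This is mostly covered already, but one addition, as I ran into an issue with the previous code samples. In an ASP.NET application, a virtual path such as ~/folder/filename.html needs to be mapped to a physical path with Server.MapPath before it is passed to File.ReadAllText (VB.NET):

Dim strHTML As String = System.IO.File.ReadAllText(HttpContext.Current.Server.MapPath("~/folder/filename.html"))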


Use File.ReadAllText(path_to_file) to read the file.


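What kind of processing are you trying to do? If the file is well-formed XML (for example XHTML), you can do XmlDocument doc = new XmlDocument(); followed by doc.Load(filename), and the document can then be parsed in memory. Note that ordinary HTML such as the sample in the question, with unquoted attribute values and unclosed tags, is not well-formed XML, so doc.Load will throw an XmlException on it.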

Read here for more information on XmlDocument:

  • MSDN
  • C# Corner tutorial
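
As a rough illustration of when the XmlDocument route works, the sketch below tries to parse two markup strings. The class name and the TryLoad helper are hypothetical, and LoadXml (which takes a string) is used instead of Load(filename) only so the example needs no file on disk.

```csharp
using System;
using System.Xml;

class XmlLoadExample
{
    // Hypothetical helper: true when the markup parses as XML, false otherwise.
    public static bool TryLoad(string markup, out XmlDocument doc)
    {
        doc = new XmlDocument();
        try
        {
            doc.LoadXml(markup);
            return true;
        }
        catch (XmlException)
        {
            // Raised for markup that is not well-formed XML.
            return false;
        }
    }

    static void Main()
    {
        // Well-formed: attribute values quoted, every tag closed.
        Console.WriteLine(TryLoad("<table><tr><td align=\"left\">Hi</td></tr></table>", out _));

        // Typical HTML: unquoted attribute value and an unclosed <col>.
        Console.WriteLine(TryLoad("<table><col width=35px><tr><td>Hi</td></tr></table>", out _));
    }
}
```

The first call succeeds and the second fails, which is why an HTML-aware parser is usually the safer choice for files like the one in the question.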


You can do it the simple way:

string pathToHTMLFile = @"C:\temp\someFile.html";
string htmlString = File.ReadAllText(pathToHTMLFile);

Or you could read it through a FileStream/StreamReader:

using (FileStream fs = File.Open(pathToHTMLFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (StreamReader sr = new StreamReader(fs))
{
    htmlString = sr.ReadToEnd();
}

Passing FileShare.ReadWrite lets you open the file while still permitting others to perform read/write operations on it; without that argument, File.Open opens the file with no sharing. I can't imagine an HTML file being very big, and ReadToEnd still returns the whole content as one string, but a StreamReader also gives you the option of processing the file piece by piece instead of capturing it as one large chunk like the first method.
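
If the file should really be streamed rather than held in memory, one possible shape is to read it line by line. The sketch below is only an illustration: the class name, the CountCellLines helper, the "<td" check, and the temp-file name are all made up, and the FileShare.ReadWrite argument is what allows other processes to keep reading and writing the file while it is open.

```csharp
using System;
using System.IO;

class StreamHtmlExample
{
    // Hypothetical helper: counts lines containing "<td" one line at a time.
    public static int CountCellLines(string path)
    {
        int count = 0;
        // FileShare.ReadWrite permits concurrent readers and writers.
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (StreamReader sr = new StreamReader(fs))
        {
            string line;
            // ReadLine returns null at end of file.
            while ((line = sr.ReadLine()) != null)
            {
                if (line.Contains("<td"))
                {
                    count++;
                }
            }
        }
        return count;
    }

    static void Main()
    {
        // Hypothetical file written to the temp folder so the sketch runs anywhere.
        string path = Path.Combine(Path.GetTempPath(), "stream-sample.html");
        File.WriteAllLines(path, new[] { "<table>", "<tr><td>Hi</td></tr>", "</table>" });

        Console.WriteLine(CountCellLines(path));
    }
}
```

Only one line is held in memory at a time, at the cost of not having the complete HTML available as a single string.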

