Question
I am using HtmlAgilityPack to parse and manipulate HTML text. However, DocumentNode.OuterHtml seems to drop closing tags.
To isolate the issue, I now do nothing but parse the HTML and read back OuterHtml (no manipulation):
var document = new HtmlDocument();
document.LoadHtml(myHtml);
var result = document.DocumentNode.OuterHtml;
Original: (myHtml)
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="X-UA-Compatible" content="IE=Edge" /><title>
MyTitle
</title>
Output: (result) Notice that the meta element is no longer closed
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta http-equiv="X-UA-Compatible" content="IE=Edge"><title>
MyTitle
</title>
Similarly, all input and img elements are left open. (Please do not answer that this should not be a problem. It should not be, but it is.) Chrome cannot render the page correctly. Keep reading.
Even stranger:
Original: (myHtml)
<option value="10">Afrikaans</option>
<option value="11">Albanian</option>
<option value="12">Arabic</option>
<option value="13">Armenian</option>
<option value="14">Azerbaijani</option>
<option value="15">Basque</option>
Output: (result) Notice that the explicit closing tags are missing entirely
<option value="10">Afrikaans
<option value="11">Albanian
<option value="12">Arabic
<option value="13">Armenian
I am using the latest HtmlAgilityPack NuGet package: id="HtmlAgilityPack" version="1.4.9"
Answer 1:
There are several options that you can set when you are loading the document.
OptionAutoCloseOnEnd
Defines if closing for non closed nodes must be done at the end or directly in the document. Setting this to true can actually change how browsers render the page.
// Set the option before calling LoadHtml so it applies while parsing.
var document = new HtmlDocument();
document.OptionAutoCloseOnEnd = true;
document.LoadHtml(content);
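For the two symptoms in the question specifically, two other settings are commonly suggested, and the sketch below combines them. It relies on OptionWriteEmptyNodes, which should serialize empty elements such as meta, img and input in self-closed form, and on removing the "option" entry from the static HtmlNode.ElementsFlags table so that option is parsed as a normal container and keeps its end tag. The HtmlRoundTrip class and its Serialize method are hypothetical names used only for illustration, and I have not verified the behavior against version 1.4.9 specifically.

```csharp
using HtmlAgilityPack;

public static class HtmlRoundTrip
{
    // Hypothetical helper: parse the markup and serialize it back unchanged.
    public static string Serialize(string html)
    {
        // ElementsFlags is static and process-wide, so change it before parsing.
        // In 1.4.x "option" is flagged as an empty element, which is why
        // its explicit </option> end tag gets dropped.
        HtmlNode.ElementsFlags.Remove("option");

        var document = new HtmlDocument();
        // Write empty elements (meta, img, input, ...) as <tag ... /> on output.
        document.OptionWriteEmptyNodes = true;
        document.LoadHtml(html);
        return document.DocumentNode.OuterHtml;
    }
}
```

Because ElementsFlags is shared by every HtmlDocument in the process, removing the entry affects all later parsing, not just this one document.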
Related sources worth reading:
HtmlAgilityPack Drops Option End Tags
Image tag not closing with HTMLAgilityPack
Source: https://stackoverflow.com/questions/35179687/htmlagilitypack-produces-missing-closing-tags-in-outerhtml