Question
Context:
I am trying to parse the "Cities" combo box from a page on telelistas.net. I have already managed to simulate the request that loads the data for this combo box, which is an Ajax call.
Fiddler request:
POST http://www.telelistas.net/AjaxHandler.ashx HTTP/1.1
Host: www.telelistas.net
Connection: keep-alive
Content-Length: 106
Origin: http://www.telelistas.net
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.11 (KHTML, like Gecko) Chrome/23.0.1271.97 Safari/537.11
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
Accept: */*
Referer: http://www.telelistas.net/
Accept-Encoding: gzip,deflate,sdch
Accept-Language: pt-BR,pt;q=0.8,en-US;q=0.6,en;q=0.4
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: cert_Origin=directo; email=bdc.testes@gmail.com; auto=automatico=0; searchparameters=bottom=0&btnsite=0&email=&uf=rj&origem=0&nome=&pagina=1&codlogradouro=&predio=213&tiquete=0&localidadeendmap=&codbairro=0&pcount=25&estacionamento=0&letra=&top=&entrega=0&pchave=&info=&logradouro=rua+da+lapa&codtitulo=-1&chave=&zoom=&comercial=0&ddd=0&comib=0&btnemail=0&pgresultado=&localidade=&telefone=&manobrista=0&codlocalidade=21000&site=&cartoes=0&atividade=&bairro=&reserva=0&residencial=0; perfil=logged=1&iduser=2563063&email=bdc.testes@gmail.com&usertype=2&specialsearch=3&siteusernome=BigDataCorp&siteuserdatanasc=15/01/1988&siteusersexo=M&siteuserlocalidade=21000&siteuseruf=RJ&siteuserddd=21&siteusertelefone=94118439&siteuserprofissao=4&siteuserrenda=5000&siteuserformacao=4&siteusernovidades=0&siteusernovidadesrevista=&siteusernovidadesparceiros=0&siteusercpf=10541308769&siteuseracesso=brasil&siteusercep=22631000&siteuseridade=24&siteuserparceiro=telelistas&siteuserconhecimento=2&siteuseroperadora=oi&siteuserurlorigem=http://www.telelistas.net/&siteuserdatacadastro=13/12/2012 11:45:00; __utma=70879631.392027796.1355939587.1356014801.1356021821.5; __utmb=70879631.1.10.1356021821; __utmc=70879631; __utmz=70879631.1355939587.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)
Post data: state=rj&style=busca_interna&selectedCity=21000&clientId=pch_localidade_select&method=GetSearchCitiesNamed
Issue:
Here is a fragment of the string returned by this request:
<select name='pch_localidade_select' class='busca_interna' id='pch_localidade_select' tabindex="4"><option value="">Selecione</option><option selected value="21000">Rio de Janeiro</option><option value="21380">Abraão</option><option value="21001">Afonso Arinos</option><option value="21002">Agência Luterback</option><option value="21847">Agriões de Dentro</option>
What I am trying to do is read the InnerText of the option tags ("Rio de Janeiro", "Abraão", ...), but for some reason the InnerText is always empty for every option node found.
Here is the code fragment that fails:
// Iterating over nodes to build the dictionary
foreach (HtmlNode city in citiesNodes)
{
    string key = city.InnerText;
    string value = city.Attributes["value"].Value;
    citiesHash.AddCity(key, value);
}
Technology in Place:
I am using HtmlAgilityPack (which supports XPath syntax for node selection) from C#, and Fiddler2 for web debugging.
Thanks in advance
Answer 1:
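Just call HtmlAgilityPack.HtmlNode.ElementsFlags.Remove("option"); before loading the HTML. HtmlAgilityPack (at least in the releases of that era) registers option as an empty element by default, so the text after the opening tag is parsed as a sibling node rather than as the option's content. Removing the flag makes the parser treat option as a normal element with children.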
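// Requires: using System.Linq;
// ElementsFlags is static, so remove the flag before parsing.
HtmlAgilityPack.HtmlNode.ElementsFlags.Remove("option");

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

// Skip(1) drops the first option, the "Selecione" placeholder.
var options = doc.DocumentNode.Descendants("option").Skip(1)
    .Select(n => new
    {
        Value = n.Attributes["value"].Value,
        Text = n.InnerText
    })
    .ToList();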
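For reference, the same idea can be sketched as a small, self-contained console program. This is only an illustration under a few assumptions: the HTML is a shortened copy of the fragment from the question (with a closing </select> added), the HtmlAgilityPack package is referenced by the project, and the class name and the choice to key the dictionary by city code are mine, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

static class CityOptionsDemo
{
    static void Main()
    {
        // Shortened sample based on the response fragment in the question.
        string html =
            "<select name='pch_localidade_select' id='pch_localidade_select'>" +
            "<option value=\"\">Selecione</option>" +
            "<option selected value=\"21000\">Rio de Janeiro</option>" +
            "<option value=\"21380\">Abraão</option>" +
            "</select>";

        // Must happen before LoadHtml: ElementsFlags is a static table
        // shared by every document parsed in the process.
        HtmlNode.ElementsFlags.Remove("option");

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumption: city codes are unique, so they are used as keys.
        // The placeholder option has an empty value and is filtered out.
        Dictionary<string, string> cities = doc.DocumentNode
            .Descendants("option")
            .Where(n => n.GetAttributeValue("value", "") != "")
            .ToDictionary(
                n => n.GetAttributeValue("value", ""),
                n => HtmlEntity.DeEntitize(n.InnerText).Trim());

        foreach (KeyValuePair<string, string> city in cities)
        {
            Console.WriteLine(city.Key + " = " + city.Value);
        }
    }
}
```

Filtering on the empty value attribute instead of using Skip(1) avoids depending on the placeholder always being the first option. Calling Remove on a key that is not present simply returns false, so the call is harmless if the flag is not registered.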
Answer 2:
For some reason, HtmlAgilityPack does not handle those tags correctly, so this is what solved my problem:
// Iterating over nodes to build the dictionary
foreach (HtmlNode city in citiesNodes)
{
    if (city.NextSibling != null)
    {
        string key = city.NextSibling.InnerText;
        string value = city.Attributes["value"].Value;
        citiesHash.AddCity(key, value);
    }
}
Instead of reading the text from the option node itself, I get it from the node's NextSibling, which is the text node that follows each option.
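A slightly more defensive variant of this workaround might check InnerText first and fall back to the following text node only when it is empty, so the same loop behaves sensibly whether or not the parser treats option as an empty element. The sketch below assumes the HtmlAgilityPack package is referenced. The helper name OptionText and the sample HTML are hypothetical and not part of the original answer.

```csharp
using System;
using HtmlAgilityPack;

static class OptionTextDemo
{
    // Hypothetical helper: returns the visible text of an <option> node.
    static string OptionText(HtmlNode option)
    {
        string text = option.InnerText;
        HtmlNode next = option.NextSibling;

        // When <option> is parsed as an empty element, its label ends up
        // in the text node that immediately follows it.
        if (string.IsNullOrWhiteSpace(text)
            && next != null
            && next.NodeType == HtmlNodeType.Text)
        {
            text = next.InnerText;
        }

        return HtmlEntity.DeEntitize(text).Trim();
    }

    static void Main()
    {
        // Shortened sample based on the response fragment in the question.
        string html =
            "<select id='pch_localidade_select'>" +
            "<option value=\"21000\">Rio de Janeiro</option>" +
            "<option value=\"21001\">Afonso Arinos</option>" +
            "</select>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        foreach (HtmlNode city in doc.DocumentNode.Descendants("option"))
        {
            string code = city.GetAttributeValue("value", "");
            Console.WriteLine(code + " = " + OptionText(city));
        }
    }
}
```

Checking the sibling's NodeType guards against picking up the next option element's text when the parser already nests the label inside the option.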
Source: https://stackoverflow.com/questions/13977243/how-can-i-parse-innertext-of-option-tag-with-htmlagilitypack