Question
I am using HtmlAgilityPack to scrape information from Google Translate for a translation program. I have downloaded the HtmlAgilityPack DLL and successfully referenced it in my program. I am using Assembly in Unity. Below is my code for the two classes:
using UnityEngine;
using System.Collections;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
using HtmlAgilityPack;

public class GUIScript : MonoBehaviour {
    private string textField = "";
    private string input;
    public Texture2D icon;
    Dictionary search;
    Encoding code;

    // Use this for initialization
    void Start () {
        search = new Dictionary();
        input = " ";
        code = Encoding.UTF8;
        //This is what is run to translate
        print (search.Translate("Hola","es|en",code));
    }

    // Update is called once per frame
    void Update () {
    }

    void OnGUI(){
        textField = GUI.TextField(new Rect(0, Screen.height -50, Screen.width-80, 40), textField);
        if(GUI.Button(new Rect(Screen.width-80, Screen.height -50, 80,40), icon)){
            input = textField;
            textField = "";
        }
        //GUI.Label(new Rect(0,Screen.height -70, Screen.width-80,20), search.Translate("Hola","es|en",code));
        //print (search.Translate("Hola","es|en",code));
    }
}
This is the code that references my Dictionary class, shown below:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using UnityEngine;
using System.Collections;
using System.Net;
using HtmlAgilityPack;

public class Dictionary{
    string[] formatParams;
    HtmlDocument doc;

    public Dictionary(){
        formatParams = new string[2];
        doc = new HtmlDocument();
    }

    public string Translate(String input, String languagePair, Encoding encoding)
    {
        formatParams[0]= input;
        formatParams[1]= languagePair;
        string url = String.Format("http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}", formatParams);
        string result = String.Empty;
        using (WebClient webClient = new WebClient())
        {
            webClient.Encoding = encoding;
            result = webClient.DownloadString(url);
        }
        doc.LoadHtml(result);
        return doc.DocumentNode.SelectSingleNode("//span[@title=input]").InnerText;
    }

    // Use this for initialization
    void Start () {
    }
}
When running this, I receive the error:
NullReferenceException: Object reference not set to an instance of an object
Dictionary.Translate (System.String input, System.String languagePair,System.Text.Encoding encoding) (at Assets/Dictionary.cs:32)
GUIScript.Start () (at Assets/GUIScript.cs:22)
I have tried changing the code, looking up solutions, reading the API for HtmlDocument, and researching how to fix NullReferenceExceptions, but I cannot figure out why I am getting a NullReferenceException. This problem has been holding me back for a week or two now and I need to move on with my project. Any help would be greatly appreciated!
Answer 1:
If I've counted correctly, this is line 32:
return doc.DocumentNode.SelectSingleNode("//span[@title=input]").InnerText
That means either doc.DocumentNode is null or DocumentNode.SelectSingleNode("//span[@title=input]") is returning null.
If it is the former, check that you are actually receiving a document back. Your URL may not be encoded correctly. See also the question "why HTML Agility Pack HtmlDocument.DocumentNode is null?"
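To rule out the encoding problem mentioned above, one option is to escape each query parameter before building the URL. The following is only a minimal sketch: the TranslateUrl class and its Build method are hypothetical names, not part of the original code, and it assumes that percent-encoding the parameters with Uri.EscapeDataString is acceptable to the server.

```csharp
using System;

// Hypothetical helper (not in the original code): builds the request URL
// with each query parameter percent-encoded.
public static class TranslateUrl
{
    public static string Build(string text, string languagePair)
    {
        // Uri.EscapeDataString encodes reserved characters such as '|',
        // spaces, and non-ASCII letters (as UTF-8 bytes).
        return String.Format(
            "http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}",
            Uri.EscapeDataString(text),
            Uri.EscapeDataString(languagePair));
    }
}
```

With this sketch, TranslateUrl.Build("Hola", "es|en") would replace the String.Format call in Translate, and the '|' in the language pair would be sent as %7C.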
If it is the latter, something odd could be happening with XPath. I don't know how relevant this is, since DocumentNode should be the root of the document, but the discussion at http://htmlagilitypack.codeplex.com/discussions/249129 could apply. According to it, '//' searches from the root of the document, so you may have to try doc.DocumentNode.SelectSingleNode(".//span[@title=input]") instead (adding a . to the beginning of the string).
Debugging the method and inspecting the values these calls return will tell you which of the two cases applies.
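If stepping through in a debugger is inconvenient, the two cases can also be separated in code by checking each intermediate value before dereferencing it. This is a rough sketch, not the original author's method: NodeLookup and TryGetInnerText are hypothetical names, and Console.WriteLine stands in for whatever logging is available (in Unity that would probably be Debug.Log).

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper (not in the original code): reports which step
// produced null instead of throwing a NullReferenceException.
public static class NodeLookup
{
    // Returns the matched node's InnerText, or null if nothing matched.
    public static string TryGetInnerText(string html, string xpath)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Case 1: the document itself has no root node.
        if (doc.DocumentNode == null)
        {
            Console.WriteLine("DocumentNode is null");
            return null;
        }

        // Case 2: SelectSingleNode returns null when the XPath matches nothing.
        HtmlNode node = doc.DocumentNode.SelectSingleNode(xpath);
        if (node == null)
        {
            Console.WriteLine("No node matched: " + xpath);
            return null;
        }

        return node.InnerText;
    }
}
```

Calling NodeLookup.TryGetInnerText(result, "//span[@title=input]") in place of the return statement in Translate would then print which of the two checks failed.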
Answer 2:
What are you trying to retrieve? I opened the URL you are using, did a simple find for title=input, and it returned nothing. I am guessing that you are looking for the translation of Hola, which is Hello?
If so, I did this in a console app. Hope this helps.
using System;
using HtmlAgilityPack;

class Program
{
    static void Main(string[] args)
    {
        string Input = "Hola";
        HtmlWeb web = new HtmlWeb();
        HtmlDocument doc = web.Load("http://www.google.com/translate_t?hl=en&ie=UTF8&text=Hola&langpair=es|en");
        // Insert the value of Input into the XPath, so it matches title='Hola'
        string definition = doc.DocumentNode.SelectSingleNode(string.Format("//span[@title='{0}']",Input)).InnerText;
        Console.WriteLine(definition);
        Console.ReadKey();
    }
}
EDIT: Just realised you weren't looking for title=input but title=Hola. As you see in my code, try String.Format("//span[@title='{0}']", Input). That will insert the text of the variable Input into the string.
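One caveat with inserting a variable into the XPath this way is that a value containing an apostrophe would break the quoted literal. A possible way around it, sketched below under the assumption that the value never contains both quote types, is to pick the quote character based on the value. XPathBuilder, Quote, and SpanByTitle are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper (not in the original answer): builds the span lookup
// XPath with the title value wrapped in a valid XPath string literal.
public static class XPathBuilder
{
    // XPath 1.0 string literals may use single or double quotes,
    // so choose whichever one the value does not contain.
    public static string Quote(string value)
    {
        if (!value.Contains("'")) return "'" + value + "'";
        if (!value.Contains("\"")) return "\"" + value + "\"";
        throw new ArgumentException(
            "Value contains both quote types; a concat() expression would be needed.");
    }

    public static string SpanByTitle(string title)
    {
        return "//span[@title=" + Quote(title) + "]";
    }
}
```

For the example in this answer, XPathBuilder.SpanByTitle(Input) produces the same string as the String.Format call.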
Source: https://stackoverflow.com/questions/13031757/nullreferenceexception-with-htmldocument-reference-in-c-sharp