NullReferenceException with HtmlDocument reference in C#

Posted by 浪尽此生 on 2019-12-13 16:22:07

Question


I am using HtmlAgilityPack to scrape information from Google Translate for a translation program. I have downloaded the HtmlAgilityPack DLL and successfully referenced it in my project. I am using Assembly in Unity. Below is the code for my two classes:

using UnityEngine;
using System.Collections;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Net;
using HtmlAgilityPack;

public class GUIScript : MonoBehaviour {
    private string textField = "";
    private string input;
    public Texture2D icon;
    Dictionary search;
    Encoding code;
    // Use this for initialization
    void Start () { 
        search = new Dictionary();
        input = " ";
        code = Encoding.UTF8;
        //This is what is run to translate
        print (search.Translate("Hola","es|en",code));
    }

    // Update is called once per frame
    void Update () {

    }
    void OnGUI(){
        textField = GUI.TextField(new Rect(0, Screen.height -50, Screen.width-80, 40), textField);
        if(GUI.Button(new Rect(Screen.width-80, Screen.height -50, 80,40), icon)){
            input = textField;
            textField = "";

        }
        //GUI.Label(new Rect(0,Screen.height -70, Screen.width-80,20), search.Translate("Hola","es|en",code));
        //print (search.Translate("Hola","es|en",code));
    }
}

This is the code that references my Dictionary class shown below:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using UnityEngine;
using System.Collections;
using System.Net;
using HtmlAgilityPack;


public class Dictionary{
    string[] formatParams;
    HtmlDocument doc;
    public Dictionary(){
        formatParams = new string[2];
        doc = new HtmlDocument();
    }
    public string Translate(String input, String languagePair, Encoding encoding)
     {
        formatParams[0]= input;
        formatParams[1]= languagePair;
        string url = String.Format("http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}", formatParams);

        string result = String.Empty;

        using (WebClient webClient = new WebClient())
        {
            webClient.Encoding = encoding;
            result = webClient.DownloadString(url);
        }       
        doc.LoadHtml(result);
        return doc.DocumentNode.SelectSingleNode("//span[@title=input]").InnerText;
    }
    // Use this for initialization
    void Start () {

    }
}

When running this, I receive the error:

NullReferenceException: Object reference not set to an instance of an object
Dictionary.Translate (System.String input, System.String languagePair,System.Text.Encoding encoding) (at Assets/Dictionary.cs:32)
GUIScript.Start () (at Assets/GUIScript.cs:22)

I have tried changing code, looking up solutions, the API for HtmlDocument, and how to fix NullReferenceExceptions, but for some reason I cannot figure out why I am getting a NullReferenceException. This problem has been holding me back for a week or two now and I need to move on with my project. Any help would be greatly appreciated!


Answer 1:


If I've counted correctly, this is line 32:

return doc.DocumentNode.SelectSingleNode("//span[@title=input]").InnerText

That means either doc.DocumentNode is null or DocumentNode.SelectSingleNode("//span[@title=input]") is returning null.

If it is the former, check that you are actually receiving a document back. Your URL may not be encoded correctly. See also the question "why HTML Agility Pack HtmlDocument.DocumentNode is null?"
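To illustrate the encoding point, here is a minimal sketch, assuming the query values should be percent-encoded with Uri.EscapeDataString before they are placed in the URL. The TranslateUrl class and its Build method are hypothetical names used only for illustration; the URL template is the one from the question, and the sketch does not check how that endpoint responds.

```csharp
using System;

public static class TranslateUrl
{
    // Builds the request URL from the question, percent-encoding both values
    // so characters such as '|', '&' or spaces cannot corrupt the query string.
    public static string Build(string text, string languagePair)
    {
        return string.Format(
            "http://www.google.com/translate_t?hl=en&ie=UTF8&text={0}&langpair={1}",
            Uri.EscapeDataString(text),
            Uri.EscapeDataString(languagePair));
    }

    public static void Main()
    {
        // The '|' in the language pair is emitted as %7C.
        Console.WriteLine(Build("Hola", "es|en"));
        Console.WriteLine(Build("buenos dias & adios", "es|en"));
    }
}
```

Printing the built URL before downloading it is also a quick way to confirm that the request is the one you intended.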

If it is the latter, something odd could be happening with the XPath. I don't know how relevant this is, since DocumentNode should be the root of the document, but the discussion at http://htmlagilitypack.codeplex.com/discussions/249129 could apply. According to it, '//' searches from the root of the document, so you may have to try doc.DocumentNode.SelectSingleNode(".//span[@title=input]") instead (adding a . to the beginning of the string).

Stepping through the method in a debugger and inspecting the value each of these calls returns will pinpoint which one is null.
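If attaching a debugger is awkward, the same check can be done in code. The following is a rough sketch, assuming the HtmlAgilityPack assembly is referenced; SafeSelect and InnerTextOrNull are hypothetical names, and the HTML string is a made-up stand-in for the downloaded page. It splits the chained call so that each intermediate value is tested before it is dereferenced.

```csharp
using System;
using HtmlAgilityPack;

public static class SafeSelect
{
    // Returns the inner text of the first node matching xpath, or null when
    // either step yields nothing, instead of throwing NullReferenceException.
    public static string InnerTextOrNull(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // First candidate from the answer above: the document root.
        HtmlNode root = doc.DocumentNode;
        if (root == null)
        {
            Console.WriteLine("DocumentNode is null");
            return null;
        }

        // Second candidate: the XPath query matched nothing.
        HtmlNode node = root.SelectSingleNode(xpath);
        if (node == null)
        {
            Console.WriteLine("No node matched: " + xpath);
            return null;
        }

        return node.InnerText;
    }

    public static void Main()
    {
        // Made-up sample markup, not the real Google Translate response.
        string html = "<html><body><span title=\"Hola\">Hello</span></body></html>";
        Console.WriteLine(InnerTextOrNull(html, "//span[@title='Hola']"));
        Console.WriteLine(InnerTextOrNull(html, "//span[@title=input]"));
    }
}
```

Whichever message appears tells you which of the two calls produced the null.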




Answer 2:


What are you trying to retrieve? I opened the URL you are using and did a simple find for title=input, and it returned nothing. I am guessing that you are looking for the translation of Hola, which is Hello?

If so, I did this in a console app. Hope this helps.

    using System;
    using HtmlAgilityPack;

    class Program
    {
        static void Main(string[] args)
        {
            string Input = "Hola";
            HtmlWeb web = new HtmlWeb();
            HtmlDocument doc = web.Load("http://www.google.com/translate_t?hl=en&ie=UTF8&text=Hola&langpair=es|en");

            // Insert the value of Input into the XPath as a quoted string literal.
            string definition = doc.DocumentNode.SelectSingleNode(string.Format("//span[@title='{0}']", Input)).InnerText;
            Console.WriteLine(definition);
            Console.ReadKey();
        }
    }

EDIT: Just realised you weren't looking for title=input but title=Hola. As you can see in my code, try String.Format("//span[@title='{0}']", Input). That will insert the text of the variable Input into the XPath string.
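To see why the quotes matter, here is a small sketch that uses System.Xml from the standard library as a stand-in for HtmlAgilityPack, on the assumption that both evaluate XPath 1.0 the same way for this expression. XPathQuoting and CountMatches are hypothetical names, and the XML snippet is made up. In XPath, an unquoted input is not a string: it selects child elements named input, so @title=input is false whenever the span has no such child.

```csharp
using System;
using System.Xml;

public static class XPathQuoting
{
    // Counts the nodes that an XPath expression selects in an XML snippet.
    public static int CountMatches(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes(xpath).Count;
    }

    public static void Main()
    {
        // Made-up sample markup for illustration only.
        string xml = "<div><span title=\"Hola\">Hello</span></div>";
        string input = "Hola";

        // Unquoted: compares @title with child <input> elements; there are none.
        Console.WriteLine(CountMatches(xml, "//span[@title=input]")); // prints 0

        // Quoted: the C# variable's value becomes an XPath string literal.
        Console.WriteLine(CountMatches(xml, string.Format("//span[@title='{0}']", input))); // prints 1
    }
}
```

Note that a value containing an apostrophe would break a literal built this way, so this simple formatting is only suitable for inputs you control.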



Source: https://stackoverflow.com/questions/13031757/nullreferenceexception-with-htmldocument-reference-in-c-sharp
