C# WebClient returning error 404

Submitted by 試著忘記壹切 on 2019-12-25 09:18:33

Question


I'm using the script below to retrieve HTML from a URL.

    string webURL = @"https://nl.wiktionary.org/wiki/" + word.ToLower();
    using (WebClient client = new WebClient())
    {
        string htmlCode = client.DownloadString(webURL);
    }

The variable word can be any word. If there is no wiki page for the word being retrieved, the code fails with a 404 error, while opening the same URL in a browser shows a wiki page saying there is no page for this item yet.

What I want is for the code to always get the HTML, even when the wiki page says there is no info yet. I do not want to work around the 404 error with a try/catch.

Does anyone have an idea why this is not working with a WebClient?


Answer 1:


Try this. You can read the body of the 404 response from the WebException inside a try/catch block.

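        // Needs: using System; using System.IO; using System.Net;
        var word = Console.ReadLine();
        string webURL = @"https://nl.wiktionary.org/wiki/" + word.ToLower();
        using (WebClient client = new WebClient())
        {
            try
            {
                // Succeeds only when the server answers with a success status.
                string htmlCode = client.DownloadString(webURL);
            }
            catch (WebException exception)
            {
                // On a 404 the server still sends a body; it is attached to the exception.
                string responseText = string.Empty;

                // Response is null when no HTTP response was received (e.g. DNS failure).
                var responseStream = exception.Response?.GetResponseStream();

                if (responseStream != null)
                {
                    using (var reader = new StreamReader(responseStream))
                    {
                        responseText = reader.ReadToEnd();
                    }
                }

                Console.WriteLine(responseText);
            }
        }

        // Keep the console window open.
        Console.ReadLine();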
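
If a try/catch is not wanted at all, one alternative (not part of the answer above, and a different class than WebClient) is HttpClient: its GetAsync call returns the response for a 404 instead of throwing, so the body can be read like any other. The following is a minimal sketch under that assumption; the names WikiFetcher and GetHtmlAsync and the User-Agent string are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class WikiFetcher
{
    // Hypothetical helper: returns the response body regardless of the HTTP status code.
    public static async Task<string> GetHtmlAsync(HttpClient client, string url)
    {
        // GetAsync throws only for transport-level failures (DNS, timeout, ...).
        // A 404 arrives as an ordinary HttpResponseMessage with its own content.
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            return await response.Content.ReadAsStringAsync();
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        using (var client = new HttpClient())
        {
            // Assumption: the server may expect a User-Agent header; this value is made up.
            client.DefaultRequestHeaders.UserAgent.ParseAdd("WikiFetcherSample/1.0");

            string word = Console.ReadLine() ?? string.Empty;
            string html = await WikiFetcher.GetHtmlAsync(
                client, "https://nl.wiktionary.org/wiki/" + word);
            Console.WriteLine(html);
        }
    }
}
```

Network errors such as an unreachable host still raise HttpRequestException, so this only removes the need to catch the 404 itself.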



Answer 2:


Since this wiki server uses case-sensitive URL mapping, just don't change the case of the URL you want to fetch (remove ".ToLower()" from your code).

Example, lower case:
https://nl.wiktionary.org/wiki/categorie:onderwerpen_in_het_nynorsk
Result: HTTP 404 (Not Found)

Normal (unmodified) case:
https://nl.wiktionary.org/wiki/Categorie:Onderwerpen_in_het_Nynorsk
Result: HTTP 200 (OK)

Also, keep in mind that most (if not all) wiki servers (including this one) generate custom 404 pages, so in a browser they look like "normal" pages, but they are nevertheless served with an HTTP 404 status code.
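
The advice about case can be sketched as a small URL builder that leaves the word's casing untouched. This is only an illustration: the class name WikiUrl and the method Build are hypothetical, replacing spaces with underscores follows the usual wiki title convention, and it assumes the server decodes percent-encoded titles (for example "%3A" for ":").

```csharp
using System;

public static class WikiUrl
{
    private const string BaseUrl = "https://nl.wiktionary.org/wiki/";

    // Hypothetical helper: builds the page URL without changing the case of the title.
    public static string Build(string title)
    {
        // Wiki page titles conventionally use underscores instead of spaces.
        string normalized = title.Trim().Replace(' ', '_');

        // Percent-encode everything except unreserved characters; no ToLower() here.
        return BaseUrl + Uri.EscapeDataString(normalized);
    }
}

public static class Program
{
    public static void Main()
    {
        // Differently cased titles give different URLs, hence possibly different pages.
        Console.WriteLine(WikiUrl.Build("Categorie:Onderwerpen_in_het_Nynorsk"));
        Console.WriteLine(WikiUrl.Build("categorie:onderwerpen_in_het_nynorsk"));
    }
}
```

Whether a given casing exists is still decided by the server, so a 404 remains possible for words that have no page.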



Source: https://stackoverflow.com/questions/44971530/c-sharp-webclient-returning-error-404
