I\'m trying to parse a wikispaces page but I\'m not able to get the actual page. I\'m not sure if its a HttpClient bug or some missing configuration.
This is my code:
The reason the HttpClient isn't redirecting properly is because the site is redirecting you to HTTPS and then back to HTTP. A quick fix is to GET https://medivia.wikispaces.com/Monsters
instead, which is a better idea anyhow:
HttpResponseMessage response = await _httpClient.GetAsync("https://medivia.wikispaces.com/Monsters");
// Works fine! :)
I was curious why it didn't work the first way, so I dug a little deeper. If you watch the network trace in a browser or a network client, this is what happens:
GET http://medivia.wikispaces.com/Monsters
302 https://session.wikispaces.com/1/auth/auth?authToken=token
302 http://medivia.wikispaces.com/Monsters?redirectToken=token
That 302 from an encrypted (HTTPS) connection to an unencrypted one is causing the HttpClientHandler to stop automatically following. From what I can tell, this is a security quirk of the Windows implementation of HttpClientHandler, because the Unix one didn't seem to care about the HTTPS->HTTP redirect in my informal testing.