Winsock 2 Reading text from a URL

Posted by 爱⌒轻易说出口 on 2019-12-08 09:36:52

Question


For example, this is what I want to do:

    if (http->Connect("http://pastebin.com/raw/9uL16CyN"))
    {
        YString data = "";
        if (http->ReceiveData(data))
        {
            std::cout << "Networked data: " << std::endl;
            std::cout << data << std::endl;
        }
        else
            std::cout << "Failed to connect to internet.\n";
    }

The page I'm trying to read from is raw ASCII text (http://pastebin.com/raw/9uL16CyN).

I was hoping this would work easily, but apparently it does not. I keep getting the WSA error WSAHOST_NOT_FOUND (11001).

My Connect function:

bool Http::Connect(YString addr)
{
    _socket = Network::CreateConnectSocket(addr, 53); // 53 is the port
    return _socket != INVALID_SOCKET;
}

CreateConnectSocket function:

int iResult;
SOCKET ConnectSocket = INVALID_SOCKET;

// holds address info for socket to connect to
struct addrinfo *result = NULL,
    *ptr = NULL,
    hints;

ZeroMemory(&hints, sizeof(hints));
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;
hints.ai_protocol = IPPROTO_TCP;  //TCP connection!!!

                                  //resolve server address and port
iResult = getaddrinfo(addr.c_str(), std::to_string(port).c_str(), &hints, &result);
if (iResult != 0)
{
    printf("Network::CreateSocket failed with %s as addr, and %i as port.\nError code: %i.\n", (char*)addr.c_str(), port, iResult);
    return INVALID_SOCKET;
}

for (ptr = result; ptr != NULL; ptr = ptr->ai_next) {

    // Create a SOCKET for connecting to server
    ConnectSocket = socket(ptr->ai_family, ptr->ai_socktype, ptr->ai_protocol);

    if (ConnectSocket == INVALID_SOCKET) {
        printf("Network::CreateSocket failed with error: %ld\n", WSAGetLastError());
        return INVALID_SOCKET;
    }

    // Connect to server.
    iResult = connect(ConnectSocket, ptr->ai_addr, (int)ptr->ai_addrlen);

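    if (iResult == SOCKET_ERROR)
    {
        closesocket(ConnectSocket);
        ConnectSocket = INVALID_SOCKET;
        printf("Network::CreateSocket failed the server is down... did not connect.\n");
        continue; // try the next address returned by getaddrinfo()
    }

    break; // connected; stop so this socket is not overwritten by the next iteration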
}

freeaddrinfo(result);

if (ConnectSocket == INVALID_SOCKET)
{
    printf("Network::CreateSocket failed.\n");
    return INVALID_SOCKET;
}

u_long iMode = 1;
iResult = ioctlsocket(ConnectSocket, FIONBIO, &iMode);
if (iResult == SOCKET_ERROR)
{
    printf("Network::CreateSocket ioctlsocket failed with error: %d\n", WSAGetLastError());
    closesocket(ConnectSocket);
    return INVALID_SOCKET;
}
char value = 1;
setsockopt(ConnectSocket, IPPROTO_TCP, TCP_NODELAY, &value, sizeof(value));
return ConnectSocket;

Most of it is taken from existing sources.


Answer 1:


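Your call to Connect() is wrong. You cannot pass the full URL to getaddrinfo(). It expects the host name and the port (service) as separate arguments, so the whole URL string gets treated as a host name that does not exist, hence WSAHOST_NOT_FOUND. You need to pass only the domain name and port number by themselves. BTW, the HTTP port is 80, not 53 (53 is DNS).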
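If you still want to start from a full URL, you can split it into host, port, and path before calling getaddrinfo(). The following is only a minimal sketch: UrlParts and SplitHttpUrl are hypothetical names that are not in the original code, std::string stands in for YString, and only plain "http://" URLs are handled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type for illustration; not part of the original code.
struct UrlParts
{
    std::string host;  // e.g. "pastebin.com"
    std::string port;  // kept as text, since getaddrinfo() takes the service as a string
    std::string path;  // e.g. "/raw/9uL16CyN"
};

// Splits a plain "http://host[:port][/path]" URL into its parts.
// Sketch only: no https, user info, or IPv6 literals are handled.
bool SplitHttpUrl(const std::string& url, UrlParts& out)
{
    const std::string scheme = "http://";
    if (url.compare(0, scheme.size(), scheme) != 0)
        return false;  // not a plain HTTP URL

    const std::size_t hostStart = scheme.size();
    const std::size_t pathStart = url.find('/', hostStart);

    std::string authority;  // the "host[:port]" part
    if (pathStart == std::string::npos)
    {
        authority = url.substr(hostStart);
        out.path = "/";  // no path given, so request the root
    }
    else
    {
        authority = url.substr(hostStart, pathStart - hostStart);
        out.path = url.substr(pathStart);
    }

    const std::size_t colon = authority.find(':');
    if (colon == std::string::npos)
    {
        out.host = authority;
        out.port = "80";  // default HTTP port
    }
    else
    {
        out.host = authority.substr(0, colon);
        out.port = authority.substr(colon + 1);
    }
    return !out.host.empty() && !out.port.empty();
}
```

The host and port would then go to getaddrinfo(), and the path would go into the request line of the GET request.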

Also, you are not sending an HTTP GET request to the server asking it to send you the text document. An HTTP server will not send a response until you send a request first.

You need something more like this instead:

bool Http::Connect(YString addr, int port)
{
    _socket = Network::CreateConnectSocket(addr, port);
    return _socket != INVALID_SOCKET;
}

if (http->Connect("pastebin.com", 80))
{
    YString request = "GET /raw/9uL16CyN HTTP/1.1\r\n"
                      "Host: pastebin.com\r\n"
                      "Connection: close\r\n"
                      "\r\n";

    if (http->SendData(request))
    {
        YString response = "";
        if (http->ReceiveData(response))
        {
            std::cout << "Networked data: " << std::endl;
            std::cout << response << std::endl;
        }
        else
            std::cout << "Failed to receive data from internet.\n";
    }
    else
        std::cout << "Failed to send request to Pastebin.\n";
}
else
    std::cout << "Failed to connect to Pastebin.\n";
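Because the request asks for "Connection: close", the server is expected to close the connection once the response is complete, so one way to collect the whole response is to keep reading until the receive call reports 0 bytes. The sketch below is an assumption about what ReceiveData() could do, not its actual implementation: ReceiveUntilClose is a hypothetical name, and recvFn stands in for a call such as recv(_socket, buf, len, 0). It also assumes a blocking socket. CreateConnectSocket in the question switches the socket to non-blocking mode with FIONBIO, in which case recv() can fail with WSAEWOULDBLOCK before any data has arrived, and that case would need separate handling.

```cpp
#include <cstddef>
#include <string>

// recvFn is a stand-in for recv(): it takes a buffer and its length and
// returns the number of bytes read, 0 when the peer has closed the
// connection, or a negative value on error.
template <typename RecvFn>
bool ReceiveUntilClose(RecvFn recvFn, std::string& out)
{
    char buffer[4096];
    for (;;)
    {
        const int received = recvFn(buffer, static_cast<int>(sizeof(buffer)));
        if (received > 0)
            out.append(buffer, static_cast<std::size_t>(received));  // keep what arrived
        else if (received == 0)
            return true;   // orderly close: the response is complete
        else
            return false;  // receive error
    }
}
```

With Winsock this could be called with a lambda such as [&](char* buf, int len) { return recv(_socket, buf, len, 0); }.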

That being said, you need to take into account that the server is going to frame the response data with headers, eg:

GET /raw/9uL16CyN HTTP/1.1
Host: pastebin.com

HTTP/1.1 200 OK
Date: Wed, 23 Dec 2015 00:00:01 GMT
Content-Type: text/plain; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=db6ba4b037d673b67757500aca4e2227b1450828801; expires=Thu, 22-Dec-16 00:00:01 GMT; path=/; domain=.pastebin.com; HttpOnly
X-Powered-By: PHP/5.5.5
Cache-Control: public, max-age=1801
Vary: Accept-Encoding
CF-Cache-Status: HIT
Expires: Wed, 23 Dec 2015 00:30:02 GMT
Server: cloudflare-nginx
CF-RAY: 258fc8a8168a2276-LAX

2a
Text, text, text, text! Some more text! :D
0

So, assuming ReceiveData() just returns whatever it receives, you will have to strip those headers off and undo the chunked encoding before you can use the content of the text file by itself. Please read RFC 2616 (or its successors, RFCs 7230-7235), which describe the HTTP protocol in detail.
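As a rough illustration of those two steps, the sketch below splits a complete response at the blank line that ends the headers and then decodes a chunked body. SplitHeaders and DecodeChunked are hypothetical names, and std::string stands in for YString. It assumes the whole response is already in memory and that the headers really announce "Transfer-Encoding: chunked"; chunk extensions and trailers are skipped rather than interpreted.

```cpp
#include <cstddef>
#include <string>

// Splits a raw response at the first blank line (CRLF CRLF).
bool SplitHeaders(const std::string& response, std::string& headers, std::string& body)
{
    const std::size_t end = response.find("\r\n\r\n");
    if (end == std::string::npos)
        return false;  // header block is incomplete
    headers = response.substr(0, end);
    body = response.substr(end + 4);
    return true;
}

// Decodes a chunked body: each chunk is "<hex size>\r\n<data>\r\n",
// and a chunk of size 0 marks the end.
bool DecodeChunked(const std::string& body, std::string& out)
{
    std::size_t pos = 0;
    for (;;)
    {
        const std::size_t lineEnd = body.find("\r\n", pos);
        if (lineEnd == std::string::npos)
            return false;  // size line is missing or truncated

        // Parse the hexadecimal chunk size; stop at anything else (e.g. ";ext").
        std::size_t size = 0;
        std::size_t digits = 0;
        for (std::size_t i = pos; i < lineEnd; ++i)
        {
            const char c = body[i];
            int value;
            if (c >= '0' && c <= '9')      value = c - '0';
            else if (c >= 'a' && c <= 'f') value = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') value = c - 'A' + 10;
            else break;
            size = size * 16 + static_cast<std::size_t>(value);
            ++digits;
        }
        if (digits == 0)
            return false;  // no size found

        pos = lineEnd + 2;
        if (size == 0)
            return true;   // last chunk; any trailers are ignored

        if (size > body.size() - pos)
            return false;  // chunk data is truncated
        out.append(body, pos, size);
        pos += size;

        if (body.compare(pos, 2, "\r\n") != 0)
            return false;  // every chunk must end with CRLF
        pos += 2;
    }
}
```

Applied to the sample response above, the chunk size 2a is hexadecimal for 42, which is the length of the single line of text.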

That being said, you should stop trying to implement HTTP manually (it is more complex than you realize) and use a pre-existing library instead, such as libcurl, or even Microsoft's own WinInet or WinHTTP APIs. Let them do the heavy work for you.



Source: https://stackoverflow.com/questions/34387033/winsock-2-reading-text-from-a-url
