Best ways of parsing a URL using C?

死守一世寂寞 2020-11-27 15:36

I have a URL like this:

http://192.168.0.1:8080/servlet/rece

I want to parse the URL to get the values:

IP: 192.168.0.1
Por         


        
10 answers
  • 2020-11-27 16:00

    This one is small and worked very well for me: http://draft.scyphus.co.jp/lang/c/url_parser.html . It is just two files (*.c, *.h).
    I had to adapt the code slightly [1].

    [1] Change every call to http_parsed_url_free(purl) into parsed_url_free(purl):

       //Rename the function called
       //http_parsed_url_free(purl);
       parsed_url_free(purl);
    
  • 2020-11-27 16:02

    Use a regular expression if you want the easy way. Otherwise, write a proper grammar with Flex/Bison.

    You could also use a URI parsing library.
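
    To make the regular-expression route concrete, here is a minimal sketch using the POSIX <regex.h> API (regcomp/regexec). The function name split_url, the struct url_parts, and the buffer sizes are hypothetical, chosen only for illustration. The pattern accepts scheme://host[:port][/path], treats a bracketed IPv6 literal as the host, and deliberately rejects userinfo such as user:pass@host, so treat it as a starting point rather than a full RFC 3986 parser.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container; sizes are arbitrary illustration values. */
struct url_parts {
    char host[128];
    char port[8];
    char path[256];
};

/* Copy one capture group into dst, truncating to cap-1 characters.
   An unmatched optional group (rm_so == -1) yields an empty string. */
static void copy_group(const char *src, const regmatch_t *m, char *dst, size_t cap)
{
    size_t len = 0;
    assert(cap > 0);
    if (m->rm_so >= 0) {
        len = (size_t)(m->rm_eo - m->rm_so);
        if (len >= cap)
            len = cap - 1;
        memcpy(dst, src + m->rm_so, len);
    }
    dst[len] = '\0';
}

/* Returns 0 on success, -1 if the URL does not match the pattern. */
int split_url(const char *url, struct url_parts *out)
{
    /* Groups: 1 = host (name, IPv4 or [IPv6]), 3 = port, 4 = path. */
    static const char pattern[] =
        "^[a-zA-Z][a-zA-Z0-9+.-]*://(\\[[^]]+]|[^:/]+)(:([0-9]+))?(/.*)?$";
    regex_t re;
    regmatch_t m[5];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, url, 5, m, 0);
    regfree(&re);
    if (rc != 0)
        return -1;

    copy_group(url, &m[1], out->host, sizeof out->host);
    copy_group(url, &m[3], out->port, sizeof out->port);
    copy_group(url, &m[4], out->path, sizeof out->path);
    return 0;
}
```

    Calling split_url on the URL from the question fills host with 192.168.0.1, port with 8080 and path with /servlet/rece. The port is kept as a string here; convert it with strtol if you need a number.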

  • 2020-11-27 16:07

    Personally, I steal the HTParse.c module from the W3C (it is used in the lynx Web browser, for instance). Then, you can do things like:

     strncpy(hostname, HTParse(url, "", PARSE_HOST), size);
    

    The important thing about using a well-established and debugged library is that you do not fall into the typical traps of URL parsing (many regexps fail when the host is an IP address, for instance, especially an IPv6 one).
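
    To illustrate the IPv6 trap, here is a small hand-written sketch. It is not HTParse code; split_authority is a hypothetical helper that splits the "host[:port]" part of a URL. Splitting at the first colon works for names and IPv4 addresses, but an IPv6 literal contains colons itself, so it has to be recognised by its surrounding brackets first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "host[:port]" into host and port strings.
   Returns 0 on success, -1 on malformed input or too-small buffers. */
int split_authority(const char *auth, char *host, size_t hcap,
                    char *port, size_t pcap)
{
    const char *host_start = auth;
    const char *host_end;
    const char *port_start = NULL;
    size_t hlen, plen;

    assert(auth != NULL && host != NULL && port != NULL);

    if (auth[0] == '[') {
        /* IPv6 literal: the host is everything between the brackets. */
        host_start = auth + 1;
        host_end = strchr(host_start, ']');
        if (host_end == NULL)
            return -1;
        if (host_end[1] == ':')
            port_start = host_end + 2;
        else if (host_end[1] != '\0')
            return -1;
    } else {
        /* Name or IPv4 address: the first ':' (if any) starts the port. */
        host_end = strchr(auth, ':');
        if (host_end != NULL)
            port_start = host_end + 1;
        else
            host_end = auth + strlen(auth);
    }

    hlen = (size_t)(host_end - host_start);
    plen = port_start != NULL ? strlen(port_start) : 0;
    if (hlen == 0 || hlen >= hcap || plen >= pcap)
        return -1;

    memcpy(host, host_start, hlen);
    host[hlen] = '\0';
    memcpy(port, port_start != NULL ? port_start : "", plen);
    port[plen] = '\0';
    return 0;
}
```

    With this helper, "[2001:db8::1]:8080" yields the host 2001:db8::1 and the port 8080, whereas a naive split at the first colon would return "[2001" as the host.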

  • 2020-11-27 16:07

    This C gist could be useful. It implements a pure C solution with sscanf.

    https://github.com/luismartingil/per.scripts/tree/master/c_parse_http_url

    It uses

    // Parsing the tmp_source char*
    if (sscanf(tmp_source, "http://%99[^:]:%i/%199[^\n]", ip, &port, page) == 3) { succ_parsing = 1;}
    else if (sscanf(tmp_source, "http://%99[^/]/%199[^\n]", ip, page) == 2) { succ_parsing = 1;}
    else if (sscanf(tmp_source, "http://%99[^:]:%i[^\n]", ip, &port) == 2) { succ_parsing = 1;}
    else if (sscanf(tmp_source, "http://%99[^\n]", ip) == 1) { succ_parsing = 1;}
    (...)
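
    For reference, the same sscanf idea can be packaged as a self-contained function. This is a sketch and not the code from the linked repository: the names parse_http_url and struct http_url, the buffer sizes, and the default port of 80 are my own assumptions. It differs from the excerpt in three small ways: the host scanset excludes '/' as well as ':' so that a colon inside the path is not mistaken for a port separator, %5d replaces %i so that a port with a leading zero is not read as octal, and the port range is checked. It does not handle https, userinfo, or IPv6 literals.

```c
#include <assert.h>
#include <stdio.h>

#define HOST_MAX 100
#define PAGE_MAX 200

/* Hypothetical container for the parsed pieces. */
struct http_url {
    char host[HOST_MAX];
    int  port;
    char page[PAGE_MAX];
};

/* Returns 1 on success, 0 if src is not a plain http:// URL. */
int parse_http_url(const char *src, struct http_url *out)
{
    assert(src != NULL && out != NULL);

    out->host[0] = '\0';
    out->page[0] = '\0';
    out->port = 80; /* assumed default when the URL carries no port */

    /* Try the most specific shape first. The field widths (99, 199)
       leave room for the terminating NUL; %5d caps the port digits. */
    if (sscanf(src, "http://%99[^:/]:%5d/%199[^\n]",
               out->host, &out->port, out->page) == 3 ||  /* host:port/page */
        sscanf(src, "http://%99[^:/]/%199[^\n]",
               out->host, out->page) == 2 ||              /* host/page */
        sscanf(src, "http://%99[^:/]:%5d",
               out->host, &out->port) == 2 ||             /* host:port */
        sscanf(src, "http://%99[^:/\n]",
               out->host) == 1)                           /* host only */
        return out->port > 0 && out->port <= 65535;

    return 0;
}
```

    For the URL in the question this gives host 192.168.0.1, port 8080 and page servlet/rece (without the leading slash, as in the original patterns).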
    