Reading data from URL

生来不讨喜 2021-02-19 00:59

Is there a reasonably easy way to read data from a URL? I tried the most obvious version, but it does not work:

readcsv("https://dl.dropboxusercontent.com/u/.../test


        
5 answers
  • 2021-02-19 01:14

    The Requests package seems to work pretty well. There are others (see the entire package list) but Requests is actively maintained.

    Obtaining it

    julia> Pkg.add("Requests")
    
    julia> using Requests
    

    Using it

    You can use one of the exported functions that correspond to the HTTP verbs (get, post, etc.); each returns a Response.

    julia> res = get("http://julialang.org")
    Response(200 OK, 21 Headers, 20913 Bytes in Body)
    
    julia> typeof(res)
    Response (constructor with 8 methods)
    

    And then, for example, you can print the data using @printf

    julia> @printf("%s",res.data);
    <!DOCTYPE html>
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-us" lang="en-us">
    <head>
      <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    ...
    
  • 2021-02-19 01:15

    If you want to read a CSV from a URL, you can use the Requests package as @waTeim shows and then read the data through an IOBuffer. See example below.

    Or, as @Colin T Bowers comments, you could use the currently (December 2017) more actively maintained HTTP.jl package like this:

    julia> using HTTP
    
    julia> res = HTTP.get("https://www.ferc.gov/docs-filing/eqr/q2-2013/soft-tools/sample-csv/transaction.txt");
    
    julia> mycsv = readcsv(res.body);
    
    julia> for (colnum, myheader) in enumerate(mycsv[1,:])
               println(colnum, '\t', myheader)
           end
    1   transaction_unique_identifier
    2   seller_company_name
    3   customer_company_name
    4   customer_duns_number
    5   tariff_reference
    6   contract_service_agreement
    7   trans_id
    8   transaction_begin_date
    9   transaction_end_date
    10  time_zone
    11  point_of_delivery_control_area
    12  specific location
    13  class_name
    14  term_name
    15  increment_name
    16  increment_peaking_name
    17  product_name
    18  transaction_quantity
    19  price
    20  units
    21  total_transmission_charge
    22  transaction_charge
    

    Using the Requests.jl package:

    julia> using Requests
    
    julia> res = get("https://www.ferc.gov/docs-filing/eqr/q2-2013/soft-tools/sample-csv/transaction.txt");
    
    julia> mycsv = readcsv(IOBuffer(res.data));
    
    julia> for (colnum, myheader) in enumerate(mycsv[1,:])
             println(colnum, '\t', myheader)
           end
    1   transaction_unique_identifier
    2   seller_company_name
    3   customer_company_name
    4   customer_duns_number
    5   tariff_reference
    6   contract_service_agreement
    7   trans_id
    8   transaction_begin_date
    9   transaction_end_date
    10  time_zone
    11  point_of_delivery_control_area
    12  specific location
    13  class_name
    14  term_name
    15  increment_name
    16  increment_peaking_name
    17  product_name
    18  transaction_quantity
    19  price
    20  units
    21  total_transmission_charge
    22  transaction_charge
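
The IOBuffer step above can be tried without a network connection. The following is a minimal sketch, not the answerer's code: the sample bytes are made up for illustration, and it assumes a Julia 1.x installation, where readdlm from the DelimitedFiles standard library takes the place of readcsv.

```julia
using DelimitedFiles  # standard library; provides readdlm

# Stand-in for the raw bytes of an HTTP response body (hypothetical sample data).
body = Vector{UInt8}("id,name,price\n1,widget,2.5\n2,gadget,4.0\n")

# Wrap the bytes in an IOBuffer so they can be parsed like an open file.
table = readdlm(IOBuffer(body), ',')

# The first row holds the column names, as in the loop above.
for (colnum, header) in enumerate(table[1, :])
    println(colnum, '\t', header)
end
```

With a real response, the bytes returned by the HTTP client would be passed to IOBuffer in place of the sample vector.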
    
  • 2021-02-19 01:23

    If it is directly a csv file, something like this should work:

    A = readdlm(download(url),';')
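
On Julia 1.6 and later, the same idea might be written with the Downloads standard library. This is only a sketch: read_remote_table is a hypothetical helper name, and the ';' default simply mirrors the delimiter used in the one-liner above.

```julia
using Downloads       # standard library since Julia 1.6
using DelimitedFiles  # standard library; provides readdlm

# Hypothetical helper: fetch `url` into a temporary file, then parse it
# as delimiter-separated text. The default ';' matches the call above.
function read_remote_table(url::AbstractString; delim::Char=';')
    path = Downloads.download(url)   # returns the path of the downloaded file
    try
        return readdlm(path, delim)
    finally
        rm(path; force=true)         # remove the temporary file afterwards
    end
end
```

Calling read_remote_table(url) with a reachable URL would return the parsed matrix; the temporary file is removed whether or not parsing succeeds.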
    
  • 2021-02-19 01:26

    Nowadays you can also use UrlDownload.jl, which is pure Julia, takes care of the download details, processes data in memory, and can also handle compressed files.

    Usage is straightforward:

    using UrlDownload
    
    A = urldownload("https://data.ok.gov/sites/default/files/unspsc%20codes_3.csv")
    
  • 2021-02-19 01:37

    If you are looking to read into a dataframe, this will also work in Julia:

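    using CSV, DataFrames

    # Recent CSV.jl releases require the sink type (here DataFrame) as the second argument.
    dataset = CSV.read(download("https://mywebsite.edu/ml/machine-learning-databases/my.data"), DataFrame)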
    