Is there a reasonably easy way to get data from some url? I tried the most obvious version, does not work:
readcsv(\"https://dl.dropboxusercontent.com/u/.../test
The Requests package seems to work pretty well. There are others (see the entire package list) but Requests is actively maintained.
julia> Pkg.add("Requests")
julia> using Requests
You can use one of the exported functions that correspond to the various HTTP verbs get, post, etc which returns a Response type
julia> res = get("http://julialang.org")
Response(200 OK, 21 Headers, 20913 Bytes in Body)
julia> typeof(res)
Response (constructor with 8 methods)
And then, for example, you can print the data using @printf
julia> @printf("%s",res.data);
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-us" lang="en-us">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
...
If you want to read a CSV from a URL, you can use the Requests package as @waTeim shows and then read the data through an IOBuffer. See example below.
Or, as @Colin T Bowers comments, you could use the currently (December 2017) more actively maintained HTTP.jl package like this:
julia> using HTTP
julia> res = HTTP.get("https://www.ferc.gov/docs-filing/eqr/q2-2013/soft-tools/sample-csv/transaction.txt");
julia> mycsv = readcsv(res.body);
julia> for (colnum, myheader) in enumerate(mycsv[1,:])
println(colnum, '\t', myheader)
end
1 transaction_unique_identifier
2 seller_company_name
3 customer_company_name
4 customer_duns_number
5 tariff_reference
6 contract_service_agreement
7 trans_id
8 transaction_begin_date
9 transaction_end_date
10 time_zone
11 point_of_delivery_control_area
12 specific location
13 class_name
14 term_name
15 increment_name
16 increment_peaking_name
17 product_name
18 transaction_quantity
19 price
20 units
21 total_transmission_charge
22 transaction_charge
Using the Requests.jl
package:
julia> using Requests
julia> res = get("https://www.ferc.gov/docs-filing/eqr/q2-2013/soft-tools/sample-csv/transaction.txt");
julia> mycsv = readcsv(IOBuffer(res.data));
julia> for (colnum, myheader) in enumerate(mycsv[1,:])
println(colnum, '\t', myheader)
end
1 transaction_unique_identifier
2 seller_company_name
3 customer_company_name
4 customer_duns_number
5 tariff_reference
6 contract_service_agreement
7 trans_id
8 transaction_begin_date
9 transaction_end_date
10 time_zone
11 point_of_delivery_control_area
12 specific location
13 class_name
14 term_name
15 increment_name
16 increment_peaking_name
17 product_name
18 transaction_quantity
19 price
20 units
21 total_transmission_charge
22 transaction_charge
If it is directly a csv file, something like this should work:
A = readdlm(download(url),';')
Nowadays you can also use UrlDownload.jl which is pure Julia, take care of download details, process data in-memory and can also work with compressed files.
Usage is straightforward
using UrlDownload
A = urldownload("https://data.ok.gov/sites/default/files/unspsc%20codes_3.csv")
If you are looking to read into a dataframe, this will also work in Julia:
using CSV
dataset = CSV.read(download("https://mywebsite.edu/ml/machine-learning-databases/my.data"))