Question
FAOCropsLivestock.csv contains more than 14 million rows. In my .fs file I have declared
type FAO = CsvProvider<"c:\FAOCropsLivestock.csv">
and tried to work with the following code
FAO.GetSample().Rows.Where(fun x -> x.Country = country) |> ....
FAO.GetSample().Filter(fun x -> x.Country = country) |> ....
In both cases, an OutOfMemoryException was thrown.
I have also tried the following code after loading the CSV file into MS SQL Server
type Schema = SqlDataConnection<conStr>
let db = Schema.GetDataContext()
db.FAOCropsLivestock.Where(fun x-> x.Country = country) |> ....
This works. It also works if I issue the query using an OleDb connection, but it is slow.
How can I get a sequence out of it using CsvProvider?
Answer 1:
If you refer to the bottom of the CSV Type Provider documentation, you will see a section on handling large datasets. As explained there, you can set CacheRows = false, which stops the provider from caching every row in memory as it reads the file.
type FAO = CsvProvider<"c:\FAOCropsLivestock.csv", CacheRows = false>
You can then apply standard sequence operations to the rows without loading the entire file into memory, e.g.
FAO.GetSample().Rows |> Seq.filter (fun x -> x.Country = country) |> ....
You should, however, take care to only enumerate the contents once.
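One way to respect the enumerate-once constraint is to make a single pass and keep only the filtered rows. The following is a minimal sketch, not tested against the actual file: the helper name rowsForCountry and the country value are hypothetical, and it assumes the Country column is inferred as a string and that the matching subset is small enough to fit in memory.

```fsharp
open FSharp.Data

// Same declaration as above: CacheRows = false streams rows instead of caching them.
type FAO = CsvProvider<"c:\FAOCropsLivestock.csv", CacheRows = false>

// Hypothetical helper: read the file once and keep only the matching rows.
let rowsForCountry (country: string) =
    FAO.GetSample().Rows
    |> Seq.filter (fun x -> x.Country = country)
    |> Seq.toArray   // forces the single pass; only the filtered rows stay in memory

// Placeholder country name, used for illustration only.
let rows = rowsForCountry "Bangladesh"
printfn "%d matching rows" rows.Length
```

The resulting array can be enumerated as many times as needed, since the file itself is read only once.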
Source: https://stackoverflow.com/questions/40852191/csvprovider-throws-outofmemoryexception