CsvProvider throws OutOfMemoryException

Posted by 佐手、 on 2019-12-11 00:19:12

Question


FAOCropsLivestock.csv contains more than 14 million rows. In my .fs file I have declared

type FAO = CsvProvider<"c:\FAOCropsLivestock.csv">

and tried to work with the following code:

FAO.GetSample().Rows.Where(fun x -> x.Country = country) |> ....
FAO.GetSample().Filter(fun x -> x.Country = country) |> ....

In both cases, an OutOfMemoryException was thrown.

I have also tried the following code after loading the CSV file into MS SQL Server:

type Schema = SqlDataConnection<conStr>
let db = Schema.GetDataContext()
db.FAOCropsLivestock.Where(fun x-> x.Country = country) |> ....

It works. It also works if I issue the query using an OleDb connection, but it is slow.

How can I get a sequence out of it using CsvProvider?


Answer 1:


If you refer to the bottom of the CSV Type Provider documentation, you will see a section on handling large datasets. As explained there, you can set CacheRows = false, which stops the provider from caching the rows in memory and so makes large files manageable.

type FAO = CsvProvider<"c:\FAOCropsLivestock.csv", CacheRows = false>

You can then use standard sequence operations over the rows of the CSV as a sequence without loading the entire file into memory. e.g.

FAO.GetSample().Rows |> Seq.filter (fun x -> x.Country = country) |> ....

You should, however, take care to only enumerate the contents once.
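One way to live with the single-enumeration restriction is to filter while streaming and keep only the matching rows. The following is a minimal sketch written as an F# script, not a definitive implementation: the `#r "nuget: ..."` reference, the value bound to `country`, and the assumption that the rows for one country fit comfortably in memory are illustrative and not taken from the question.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Path taken from the question; CacheRows = false streams rows instead of caching them.
type FAO = CsvProvider<"c:\FAOCropsLivestock.csv", CacheRows = false>

// Hypothetical value, for illustration only.
let country = "Japan"

// Single pass over the file: filter while streaming, then keep only the matches.
// Assumes the filtered subset is small enough to hold in memory.
let rowsForCountry =
    FAO.GetSample().Rows
    |> Seq.filter (fun x -> x.Country = country)
    |> Seq.toArray

// The array can be traversed as often as needed without touching the file again.
printfn "%d rows for %s" rowsForCountry.Length country
```

If the filtered subset is itself too large to hold, the same pipeline can end in a single aggregation (for example Seq.length or Seq.fold) instead of Seq.toArray, so the file is still read only once.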



Source: https://stackoverflow.com/questions/40852191/csvprovider-throws-outofmemoryexception
