fread skip and autostart issue

Posted by 天涯浪子 on 2019-12-24 08:26:43

Question


I have the following code:

raw_test <- fread("avito_test.tsv", nrows = intNrows, skip = intSkip)

This produces the following error:

Error in fread("avito_test.tsv", nrows = intNrows, skip = intSkip, autostart = (intSkip +  : 
  Expected sep (',') but new line, EOF (or other non printing character) ends field 14 on line 1003 when detecting types: 10066652  ТранÑпорт   Ðвтомобили Ñ Ð¿Ñ€Ð¾Ð±ÐµÐ³Ð¾Ð¼  Nissan R Nessa, 1998    Ð¢Ð°Ñ€Ð°Ð½Ñ‚Ð°Ñ Ð² отличном ÑоÑтоÑнии. на прошлой неделе возили на тех. ОбÑлуживание. Ð’ дорожных неприÑтноÑÑ‚ÑÑ… не был учаÑтником. Детали кузова без коцок и терок. ПредназначалаÑÑŒ Ð´Ð»Ñ Ð¿Ð¾ÐµÐ·Ð´Ð¾Ðº на природу, Отдам только в добрые руки. Ð’ Ñалон не поÑтавлю не звоните    "{""Марка"":""Nissan"", ""Модель"":""R Nessa"", ""Год выпуÑка"":""1998"", ""Пробег"":""180 000 - 189 999"", ""Тип кузова"":""МинивÑн"", ""Цвет"":""Оранжевый"", ""Объём двигателÑ"":""2.4"", ""Коробка передач"":""МеханичеÑкаÑ

I have tried changing it to this:

raw_test <- fread("avito_test.tsv", nrows = intNrows, skip = intSkip, autostart = (intSkip + 2))

This is based on what I read in a similar question, "skip and autostart in fread".

However, it produces a similar error.

How can I skip the first 1000 rows and read the next 1000? My expected output is 1000 rows in total: the second thousand rows of my file, with the first thousand skipped.

Note: Reading the file with raw_test <- fread("avito_test.tsv", nrows = 1000, skip = -1) works for getting the first thousand rows, but I am trying to get only the second thousand.
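
To make the goal concrete, here is a minimal sketch of the intended read, run against a small temporary file rather than avito_test.tsv. The demo data, the window sizes, and the explicit sep = "\t" are assumptions added for illustration. The sketch has not been tried on the Avito data, and it assumes every record sits on a single line, which may not hold if a quoted field contains a line break.

```r
library(data.table)

# Hypothetical stand-in for avito_test.tsv: a tab-separated file
# with one header line followed by 30 data rows.
tmp <- tempfile(fileext = ".tsv")
demo <- data.frame(id = 1:30, title = paste("item", 1:30),
                   stringsAsFactors = FALSE)
write.table(demo, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

intSkip  <- 10L  # data rows to skip (1000 in the question)
intNrows <- 10L  # data rows to read (1000 in the question)

# Skipping lines also skips the header, so read the column names separately.
header <- strsplit(readLines(tmp, n = 1), "\t", fixed = TRUE)[[1]]

# skip counts lines, so add 1 for the header line. The separator is given
# explicitly instead of being guessed from a line in the middle of the file.
raw_test <- fread(tmp, sep = "\t", skip = intSkip + 1L,
                  nrows = intNrows, header = FALSE)
setnames(raw_test, header)

print(raw_test)
```

With the question's values, intSkip and intNrows would both be 1000 and tmp would be "avito_test.tsv".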

Edit: The data is publicly available at http://www.kaggle.com/c/avito-prohibited-content/data

Edit: Environment and package info:

> packageVersion("data.table")
[1] ‘1.9.3’
> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

Source: https://stackoverflow.com/questions/24759346/fread-skip-and-autostart-issue
