read.fwf

Reading fixed width format data into R with entries exceeding column width

大兔子大兔子 submitted on 2020-06-28 08:24:25
Question: I need to use the Annual Building Permits by Metropolitan Area data distributed by the US Census Bureau, which are downloadable here as fixed width format text files. Here is an excerpt of the file (I've stripped the column names, as they aren't in a nice format and can be replaced after reading the file into a data frame):

999 10180 Abilene, TX 306 298 8 0 0 0
184 10420 Akron, OH 909 905 0 4 0 0
999 13980 Blacksburg-Christiansburg-Radford, VA 543 455 0 4 84 3
145 14010 Bloomington, IL 342 214
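One way to cope with a name field that overflows its nominal width is to stop relying on fixed offsets and anchor on the numeric fields instead. The sketch below is a minimal Python illustration of that idea, not the original poster's solution. It assumes every record has two leading integer codes, one free-text name, and a run of whitespace-separated integers at the end; the function name `parse_permit_line` and the field names are hypothetical.

```python
import re

# Assumed record layout (not confirmed by the source): two integer codes,
# a free-text area name, then one or more whitespace-separated integers.
LINE_RE = re.compile(r"^\s*(\d+)\s+(\d+)\s+(.+?)((?:\s+\d+)+)\s*$")


def parse_permit_line(line):
    """Split one record into its codes, name, and trailing integer counts."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognised record: {line!r}")
    code_a, code_b, name, counts = match.groups()
    return {
        "code_a": int(code_a),
        "code_b": int(code_b),
        "name": name.strip(),
        "counts": [int(token) for token in counts.split()],
    }


# Two rows copied from the excerpt above, one short name and one long name.
sample = [
    "999 10180 Abilene, TX 306 298 8 0 0 0",
    "999 13980 Blacksburg-Christiansburg-Radford, VA 543 455 0 4 84 3",
]
records = [parse_permit_line(line) for line in sample]
```

Because the name is matched lazily and the counts greedily, the split does not depend on how wide the name column was meant to be. It would break if an area name itself ended in a standalone number, so that assumption is worth checking against the real file.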

Finding bogus data in a pandas dataframe read with read_fwf()

孤人 submitted on 2020-01-02 05:24:11
Question: I'm trying to analyse the weather records for New York, using the daily data taken from here: http://cdiac.ornl.gov/epubs/ndp/ushcn/daily_doc.html

I'm loading the data with:

tf = pandas.read_fwf(io.open('state30_NY.txt'), widths=widths, names=names, na_values=['-9999'])

Where:

>>> widths
[6, 4, 2, 4, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5,
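Before looking for bad values in the parsed frame, it can help to confirm that the raw lines agree with the declared widths, since a short or long line shifts every later field. The following is a minimal standard-library sketch, not part of the original question. The helper names `split_fixed` and `find_suspect_lines`, the shortened width list, the choice of which field holds a numeric value, and the sample lines are all made up for illustration.

```python
from itertools import accumulate

# Toy layout: only the first eight widths from the question's list.
WIDTHS = [6, 4, 2, 4, 5, 1, 1, 1]
VALUE_FIELDS = [4]  # assumption: field 4 holds an integer reading


def split_fixed(line, widths):
    """Slice one line into fields of the given widths, stripping padding."""
    ends = list(accumulate(widths))
    starts = [0] + ends[:-1]
    return [line[s:e].strip() for s, e in zip(starts, ends)]


def is_int(text):
    try:
        int(text)
        return True
    except ValueError:
        return False


def find_suspect_lines(lines, widths, value_fields):
    """Return (line_number, reason) pairs for lines that look malformed."""
    expected = sum(widths)
    suspects = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if len(line) != expected:
            suspects.append((number, f"length {len(line)}, expected {expected}"))
            continue
        fields = split_fixed(line, widths)
        bad = [i for i in value_fields if not is_int(fields[i])]
        if bad:
            suspects.append((number, f"non-integer value in fields {bad}"))
    return suspects


# Invented sample lines, built piecewise so the padding is explicit.
sample = [
    "300023187602TMAX" + "35".rjust(5) + "  0",   # well-formed
    "300023187602TMIN" + "-9999" + "   ",         # well-formed, missing marker
    "300023187602PRCP" + "12".rjust(3) + "  0",   # too short
    "300023187602SNOW" + "1x2".rjust(5) + "  0",  # right length, bad value
]
suspects = find_suspect_lines(sample, WIDTHS, VALUE_FIELDS)
```

On the invented sample, lines 3 and 4 are reported. The missing-value marker `-9999` passes the integer check on purpose, since `na_values` already handles it at load time.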

R readr::read_fwf ignore characters using fwf_widths

淺唱寂寞╮ submitted on 2019-12-12 01:44:17
Question: I would like to know if there is an easy way to skip characters using read_fwf from the readr package in R. For example, modifying one of the examples in the documentation:

library(readr)
fwf_sample <- system.file("extdata/fwf-sample.txt", package = "readr")
read_fwf(fwf_sample, fwf_widths(c(2, -3, 2, 3)))

throws the error:

Error: Begin offset (2) must be smaller than end offset (-1)

Using the base read.fwf function works just fine, however:

read.fwf(fwf_sample, widths = c(2, -3, 2, 3))
# V1 V2
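In base read.fwf, a negative width means "skip this many characters". The same layout can be written as explicit start and end offsets, which is the form a position-based specification (for example readr's fwf_positions) expects. The sketch below is a plain Python illustration of that conversion, not readr code; the helper names `widths_to_spans` and `read_fixed` are hypothetical and the sample line is made up.

```python
def widths_to_spans(widths):
    """Convert widths (negative = skip) into 0-based, end-exclusive spans."""
    spans = []
    offset = 0
    for width in widths:
        if width == 0:
            raise ValueError("widths must be non-zero")
        if width > 0:
            # Positive width: keep this column.
            spans.append((offset, offset + width))
        # Positive or negative, the cursor advances by the absolute width.
        offset += abs(width)
    return spans


def read_fixed(line, widths):
    """Extract only the kept columns from one line."""
    return [line[start:end] for start, end in widths_to_spans(widths)]


spans = widths_to_spans([2, -3, 2, 3])
fields = read_fixed("ABxxxCDEFG", [2, -3, 2, 3])
```

For the widths in the question, the kept columns are characters 1-2, 6-7, and 8-10 when counted from 1 with inclusive ends, which is how R counts positions.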

blank.lines.skip = TRUE fails with read.fwf?

耗尽温柔 submitted on 2019-12-11 10:55:16
Question: There are four blank lines at the end of my file.

> data=read.fwf("test2",head=F,widths=c(3,1,-3,4,-1,4),blank.lines.skip = TRUE)
> data

When I run this code, the blank.lines.skip argument is ignored. I still get blank lines in my output. The file is:

x1 F 1890 1962
x2 1857 1936
x3 1900 1978
x4 1902 1994
x5 F 1878 1939

and four blank lines at the end.

Answer 1: It looks like you're right that blank.lines.skip does not apply to read.fwf; I would have to dig in the code to figure out why, but read

Finding bogus data in a pandas dataframe read with read_fwf()

一笑奈何 submitted on 2019-12-05 08:41:33
I'm trying to analyse the weather records for New York, using the daily data taken from here: http://cdiac.ornl.gov/epubs/ndp/ushcn/daily_doc.html I'm loading the data with: tf = pandas.read_fwf(io.open('state30_NY.txt'), widths=widths, names=names, na_values=['-9999']) Where: >>> widths [6, 4, 2, 4, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1, 5, 1, 1, 1