readr | 易学教程

big integers when reading file with readr in r

Question: I want to use the readr package because I will be working with larger files in the future. One column, Intensity, holds some very large values (e.g. 5493500000). The first such value does not appear until line 2200, by which point readr has already typed the column as integer instead of numeric, and it produces a buffer overflow. Is there a way to give the read_tsv function a single column type only, since I don't want to provide all (about) 40 columns the
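
One way to override a single column is sketched below. It assumes a small hypothetical tab-separated file (the file contents and the `Sample` column are invented for illustration). Only `Intensity` is named inside `cols()`, so readr still guesses every other column.

```r
library(readr)

# Hypothetical sample file: one small and one very large Intensity value.
path <- tempfile(fileext = ".tsv")
writeLines(c("Sample\tIntensity", "a\t12", "b\t5493500000"), path)

# Name only the column to override; columns not named here are still guessed.
dat <- read_tsv(path, col_types = cols(Intensity = col_double()))
```

Because the override is keyed by column name, the remaining columns need not be listed.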

write_csv read_csv with scientific notation after 1000th row

Writing a data frame that mixes small integer entries (values below 1000) and "large" ones (values of 1000 or more) to a CSV file with write_csv() produces a mix of scientific and non-scientific notation. If the first 1000 rows hold small values but a large value appears after them, read_csv() seems to get confused by this mix and returns NA for the entries in scientific notation:

    library(readr)
    library(tibble)

    # Build a one-column tibble of small values, overwrite one position
    # with a large value, then write it out and read it back.
    test_write_read <- function(small_value, n_fills, position, large_value) {
      tib <- tibble(a = rep(small_value, n_fills))
      tib$a[position] <- large_value
      write_csv(tib, "tib.csv")
      tib <- read_csv("tib.csv")
    }

The following lines do not make
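
A possible workaround is sketched below: declare the column as double up front, so the type no longer depends on what the first 1000 rows look like. The row count, the position of the large value and the temporary file path are assumptions chosen for illustration.

```r
library(readr)
library(tibble)

# Hypothetical data: 1500 small values with one large value after row 1000.
path <- tempfile(fileext = ".csv")
tib <- tibble(a = rep(1, 1500))
tib$a[1200] <- 1e6
write_csv(tib, path)

# col_double() accepts both plain and scientific notation, so the column
# parses the same way whichever form write_csv() chose for each value.
back <- read_csv(path, col_types = cols(a = col_double()))
```

Raising `guess_max` above the row of the first large value is another option, at the cost of a slower type-guessing pass.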

What are permissible column objects of the form “col_*()” used in readr?

Question: readr::read_csv is misreading some column types in a file I am loading, so I want to use cols to set them manually. ?read_csv says the col_types argument should be "One of ‘NULL’, a ‘cols()’ specification, or a string. See ‘vignette("column-types")’ for more details". However, vignette("column-types") returns "vignette("column-types") not found", so I tried ?cols. It says it accepts "column objects created by ‘col_*()’ or their abbreviated character names". What are the acceptable functions
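
As an illustration of the two forms that ?cols describes, the sketch below reads the same hypothetical file twice: once with collector objects (`col_integer()`, `col_double()`, `col_date()`, `col_logical()`) and once with the compact string of their abbreviations (`i`, `d`, `D`, `l`). The file and its column names are invented for the example.

```r
library(readr)

# Hypothetical sample file with four differently typed columns.
path <- tempfile(fileext = ".csv")
writeLines(c("id,score,when,flag",
             "1,2.5,2020-01-31,TRUE",
             "2,3.5,2020-02-01,FALSE"), path)

# Long form: one collector object per column name.
spec_long <- cols(
  id    = col_integer(),
  score = col_double(),
  when  = col_date(),
  flag  = col_logical()
)
a <- read_csv(path, col_types = spec_long)

# Compact form: one letter per column, in column order.
b <- read_csv(path, col_types = "idDl")
```

Other collectors in readr include `col_character()`, `col_factor()`, `col_datetime()`, `col_time()`, `col_number()`, `col_skip()` and `col_guess()`.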

passing named list to cols_only() [closed]

When I try to do something like this:

    data <- read_csv("blah.csv",
      n_max = 100,
      col_types = cols_only(list(files = "c"))
    )
    Error: Some `col_types` are not S3 collector objects: 1

So the question is whether it is possible to pass a named list to cols_only().

Answer: Sure, just use do.call to pass the list elements as the arguments of the function, e.g.

    library(readr)

    read_csv(system.file('extdata', 'mtcars.csv', package = 'readr'),    # sample data from readr
             col_types = do.call(cols_only, list(cyl = 'i')))
    #> # A tibble: 32 × 1
    #>      cyl
    #>    <int>
    #> 1      6
    #> 2      6
    #> 3      4
    #> 4      6
    #> 5      8
    #> 6      6
    #> 7      8
    #> 8      4
    #> 9      4
    #> 10

How to use “cols()” and “col_double” with respect to comma as decimal mark

I would like to parse my columns to the right types with the readr package while reading. The difficulty: the fields are separated by a semicolon (;), while the comma (,) is used as the decimal mark.

    library(readr)

    # Test data:
    T <- "Date;Time;Var1;Var2
    01.01.2011;11:11;2,4;5,6
    02.01.2011;12:11;2,5;5,5
    03.01.2011;13:11;2,6;5,4
    04:01.2011;14:11;2,7;5,3"

    read_delim(T, ";")
    # A tibble: 4 × 4
    #         Date     Time  Var1  Var2
    #        <chr>   <time> <dbl> <dbl>
    # 1 01.01.2011 11:11:00    24    56
    # 2 02.01.2011 12:11:00    25    55
    # 3 03.01.2011 13:11:00    26    54
    # 4 04:01.2011 14:11:00    27    53

So, I thought the parsing thing would work like
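
One approach that may fit here is to pass a `locale()` with `decimal_mark = ","`, so that `col_double()` reads `2,4` as 2.4. The sketch below is written under assumptions: it uses a shortened copy of the test data written to a temporary file, and it leaves out the row with the malformed date `04:01.2011`.

```r
library(readr)

# Shortened copy of the test data (well-formed rows only).
path <- tempfile(fileext = ".csv")
writeLines(c("Date;Time;Var1;Var2",
             "01.01.2011;11:11;2,4;5,6",
             "02.01.2011;12:11;2,5;5,5"), path)

dat <- read_delim(
  path,
  delim = ";",
  # The comma becomes the decimal mark for every numeric column.
  locale = locale(decimal_mark = ","),
  col_types = cols(
    Date = col_date(format = "%d.%m.%Y"),
    Var1 = col_double(),
    Var2 = col_double()
  )
)
```

`read_csv2()` and `read_delim()` share the same locale mechanism, so the locale argument carries over if the reader function changes.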

readr::read_csv issue: Chinese Character becomes messy codes

I'm trying to import a dataset into RStudio, but I am stuck on Chinese characters, which turn into garbled text. Here is the code:

    library(tidyverse)
    df <- read_csv("中文,英文\n英文,德文")
    df
    # A tibble: 1 x 2
      `\xd6\xd0\xce\xc4` `Ӣ\xce\xc4`
      <chr>              <chr>
    1 "<U+04E2>\xce\xc4" "<U+00B5>\xc2\xce\xc4"

When I use the base function read.csv, it works well. I guess I am doing something wrong with the encoding, but read_csv has no encoding option. How can I do this?

Answer: This is because the characters are marked as UTF-8 whereas the actual encoding is the system default (you can get by stringi::stri_enc
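
For data stored in a file, readr takes the encoding through the `locale` argument rather than a dedicated encoding parameter. The sketch below is an assumption-laden illustration: it writes a small GBK-encoded file (GBK is inferred from the byte values in the garbled output above) and reads it back. It only works where the system's iconv supports the name "GBK".

```r
library(readr)

# The sample text from the question, written with \u escapes so that this
# script does not depend on the encoding of the source file.
txt <- "\u4e2d\u6587,\u82f1\u6587\n\u82f1\u6587,\u5fb7\u6587\n"

# Write the text as GBK bytes to a temporary file.
path <- tempfile(fileext = ".csv")
writeBin(iconv(txt, from = "UTF-8", to = "GBK", toRaw = TRUE)[[1]], path)

# Tell readr the file encoding; values come back as UTF-8 strings.
df <- read_csv(
  path,
  locale = locale(encoding = "GBK"),
  col_types = cols(.default = col_character())
)
```

This covers files on disk; the literal-string case in the question additionally depends on how the R session encodes string literals.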

Suppress reader parse problems in r

I am currently reading in a file using the readr package. The idea is to use read_delim to read the file row by row and find the maximum number of columns in my unstructured data file. The code reports parsing problems. I know about these and will deal with the column types after import. Is there a way to turn off the problems() output, as the usual options(warn) is not working?

    i <- 1
    max_col <- 0
    options(warn = -1)
    while (i != "stop") {
      n_col <- ncol(read_delim("file.txt", n_max = 1, skip = i, delim = "\t"))
      if (n_col > max_col) {
        max_col <- n_col
        print(max_col)
      }
      i <- i + 1
      if (n_col == 0) i <- "stop"
    }
    options(warn = 0)
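
If the goal is only the maximum number of columns, an alternative that avoids parsing (and therefore the parse warnings) is readr's `count_fields()`, which tokenizes each line and returns its field count. The sketch below assumes a hypothetical ragged tab-separated file created just for the example.

```r
library(readr)

# Hypothetical ragged file: rows with 2, 4 and 1 tab-separated fields.
path <- tempfile(fileext = ".txt")
writeLines(c("a\tb", "1\t2\t3\t4", "5"), path)

# One integer per line: the number of fields found by the TSV tokenizer.
fields_per_line <- count_fields(path, tokenizer_tsv())
max_col <- max(fields_per_line)
```

This reads the file once instead of re-opening it for every row. If the read_delim loop is kept, wrapping the call in `suppressWarnings()` is another way to silence the messages.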

How can I write dplyr groups to separate files?

I'm trying to create separate .csv files for each group in a data frame grouped with dplyr's group_by function. So far I have something like

    by_cyl <- group_by(mtcars, cyl)
    do(by_cyl, write_csv(., "test.csv"))

As expected, this writes a single .csv file with only the data from the last group. How can I modify this to write multiple .csv files, each with a filename that includes cyl?

Answer: You can wrap the CSV write process in a custom function as follows. Note that the function has to return a data.frame, otherwise it fails with "Error: Results are not data frames at positions". This will return 3 csv
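
A minimal sketch of one way to get one file per group follows. It swaps the `do()` approach for base R's `split()` plus a loop, so it is not the custom-function answer referred to above. The output directory and the `cyl_<value>.csv` naming scheme are assumptions made for the example.

```r
library(readr)

# Hypothetical output directory inside the session's temp folder.
out_dir <- tempfile("by_cyl_")
dir.create(out_dir)

# split() returns a named list with one data frame per distinct cyl value.
groups <- split(mtcars, mtcars$cyl)

# Write each group to its own file, with the cyl value in the filename.
for (cyl in names(groups)) {
  write_csv(groups[[cyl]], file.path(out_dir, paste0("cyl_", cyl, ".csv")))
}
```

Each filename is built from the group key, so no file is overwritten by a later group.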
