How to avoid: read.table truncates numeric values beginning with 0

Posted by 一世执手 on 2019-11-26 08:26:13

Question


I want to import a table (.txt file) into R with read.table(). One column in my table is an ID with nine digits: some IDs begin with a 0, others with 1 or 2.

R drops the leading 0 (012345678 becomes 12345678), which causes problems when I use this ID to merge with another table.

Can someone give me a hint how to solve the problem?


Answer 1:


As said in Ben's answer, colClasses is the easier way to do it. Here is an example:

read.table(text = 'col1 col2
           0012 0001245',
           header = TRUE,
           colClasses = c('character', 'numeric'))

  col1 col2
1 0012 1245      ## col1 keeps its leading zeros; col2 does not



Answer 2:


A reproducible example would be nice, but: use the colClasses argument to read.table() to specify that you want this column to be read as a character variable, not numeric. Or make them back into character variables after reading them in, using sprintf to pad the numbers with leading zeros. (The former is probably easier.)
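Both options can be sketched side by side. This is only an illustration under stated assumptions: the sample data and the column names `id` and `value` are made up, and the sprintf() route works here only because the intended width (nine digits) is known in advance.

```r
# Hypothetical sample data; the column names `id` and `value` are assumptions.
raw <- "id value
012345678 10
112345678 20"

# Option 1: read only the id column as character, using a named colClasses.
as_char <- read.table(text = raw, header = TRUE,
                      colClasses = c(id = "character"))

# Option 2: read with the defaults (id becomes integer and loses its zero),
# then pad back to nine digits with sprintf().
as_num <- read.table(text = raw, header = TRUE)
as_num$id <- sprintf("%09d", as_num$id)

# Both routes should end with the same character IDs.
identical(as_char$id, as_num$id)
```

Reading the column as character is the safer choice when IDs can have varying lengths, since padding after the fact cannot tell how many zeros were originally there.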




Answer 3:


Here is a for loop to add leading zeros to rows based on a condition. Although this is a post-hoc solution (adding leading 0's after reading the table), it worked for me so thought I'd share:

Let's take the example of a column of zip codes. All values should contain 5 digits (e.g. 01234), but R removes leading zeros (so '01234' becomes '1234'). You can add a leading zero to every value shorter than 5 characters with this code:

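for (i in seq_len(nrow(df))) {
  # Values shorter than 5 characters have lost a leading zero
  if (nchar(df$zipCode[i]) < 5) {
    # Prepend one '0'; this also turns the column into character
    df$zipCode[i] <- paste0('0', df$zipCode[i])
  }
}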
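A vectorized alternative is sketched here, assuming `zipCode` was read in as an integer column (the data frame below is hypothetical). A single sprintf() call pads every value to five digits, which also covers values that are missing more than one zero.

```r
# Hypothetical data frame; zipCode was read as integer, so the zeros are gone.
df <- data.frame(zipCode = c(1234L, 501L, 90210L))

# Pad every value to five digits in one vectorized call.
# The column becomes character afterwards.
df$zipCode <- sprintf("%05d", df$zipCode)

print(df$zipCode)
```

If the column is already character, sprintf("%05d", ...) will fail, so it would need converting with as.integer() first.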


Source: https://stackoverflow.com/questions/14854485/how-to-avoid-read-table-truncates-numeric-values-beginning-with-0
