How to correct the encoding of characters on a data.frame

Deadly 提交于 2019-12-08 04:00:22

问题


I have a data frame make like this:

data.names<-data.frame(DATA=c(1:5))
rownames(data.names)<-c("IV\xc1N","JOS\xc9","LUC\xcdA","RAM\xd3N","TO\xd1O")
data.names
#          DATA
# IV\xc1N     1
# JOS\xc9     2
# LUC\xcdA    3
# RAM\xd3N    4
# TO\xd1O     5

I want the incorrect letters replace by the right ones (Á,É,Í,...). Make clear that I want to use apply because I read that is much more efficient apply than for. My idea is make a function that changes these letters:

letters1<-c("\xc1","\xc9","\xcd","\xd3", "\xd1") #Á,É,Í,Ó,Ñ
letters2<-c("Á","É","Í","Ó","Ñ")
change.names <- function(x){sub(letters1[x], letters2[x],rownames(data.names))}

Now, with a for I haven't any problems:

for(i in 1:5) rownames(data.names)<-change.names(i)
data.names
#       DATA
# IVÁN     1
# JOSÉ     2
# LUCÍA    3
# RAMÓN    4
# TOÑO     5

But I don't have much idea how to do it with apply. I've tried:

apply(matrix(c(1:5),ncol=5),2,change.names)

And the output is a matrix with 5 columns, where each one only changes one letter and I can't know how to assign to rownames(data.names) a "mix" of them, or something that works.


回答1:


You don't even need to use apply, because rownames(data.names) is a vector and vectors may be recycled

> Encoding(rownames(data.names)) <- 'latin1'
> data.names
         DATA
IVÁN        1
JOSÉ        2
LUCÍA       3
RAMÓN       4
TOÑO        5

Please read this answer for more details about the encoding.



来源:https://stackoverflow.com/questions/34317311/how-to-correct-the-encoding-of-characters-on-a-data-frame

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!