Convert string categorical data in data frame to Numeric data

Posted by 倖福魔咒の on 2020-03-05 05:32:06

Question


I have the following values (800) in my data frame:

cat1 <- c("bi", "bt", "ch", "fs", "hc", "lh", "mo", "ms", "nn", "ro", "sc", "si", "so", "ti", "ww", "dt", "3et", "a", "a", "a", "a", "a", "a", "aam", "aao", "ac", "acs", "aeo", "aeq", "afm", "aic", "aio", "akq", "am", "am", "am", "am", "amc", "amc", "aoq", "aoq", "aot", "apm", "apo", "apo", "aqf", "ass", "ata", "ata", "atc", "atf", "atq", "atr", "aun", "bae", "baf", "bai", "bcm", "bcs", "bea", "bee", "bef", "bem", "bem", "bem", "bem", "bem", "beo", "beo", "beq", "beq", "beq", "bhm", "bkr", "bm", "bm", "bme", "bmm", "bmm", "bmo", "bmq", "bmq", "brm", "brm", "brq", "bsm", "bsm", "bsm", "bsm", "bso", "bta", "bwa", "clm", "dd", "dm", "ne", "pp", "pv", "rt", "se", "sw")

I want to replace all string values with numeric values so that I can feed them into a neural network. For example, I want every "am" to be replaced with 5 or 0.5 and every "bem" with 7 or 0.7, that is, according to some consistent rule. I have tried many things but have not been able to achieve anything.


Answer 1:


If you know the replacement rule, you can build a dictionary as a named vector and use the names for lookup.

For instance,

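cat1 <- c("bem", "am", "am", "bem", "am")
# Named vector used as a dictionary: names are the strings, values are the codes
dict <- setNames(c(7, 5), c("bem", "am"))
# Indexing by name looks up the code for each element of cat1
res <- dict[cat1]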

and you will get

> res
bem  am  am bem  am 
  7   5   5   7   5 
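
When the values sit in a data frame column and no replacement rule is fixed in advance, one possible sketch is to derive the codes from the distinct values themselves. The data frame `df` and the column names `cat1_code` and `cat1_scaled` below are assumptions made for illustration and do not come from the question; the integer codes simply follow the sorted order of the distinct strings and carry no meaning beyond identity.

```r
# Hypothetical data frame; only the column name cat1 is taken from the question.
df <- data.frame(cat1 = c("bem", "am", "am", "bem", "a"),
                 stringsAsFactors = FALSE)

# Fix the set of levels explicitly so the mapping is reproducible.
lv <- sort(unique(df$cat1))

# Integer code = position of each value in the sorted levels.
df$cat1_code <- as.integer(factor(df$cat1, levels = lv))

# Optional rescaling into (0, 1], e.g. before feeding a neural network.
df$cat1_scaled <- df$cat1_code / length(lv)

print(df)
```

Integer codes impose an artificial ordering on the categories, so for neural network inputs a one-hot encoding (for example via `model.matrix(~ cat1 - 1, df)`) may be a better fit, depending on the model.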


Source: https://stackoverflow.com/questions/59504814/convert-string-categorical-data-in-data-frame-to-numeric-data
