Keep values in data frame= Na (sodium in chemistry) as is

Submitted by 牧云@^-^@ on 2020-03-23 07:56:00

Question


Original df (clinical chemistry)

Subject Code Test Value Units   Flag
1       NA    NA   147   mmol/L    
2       NA/K  NA/K 10.5  RATIO  
3       K     K    4.7   mmol/L 
4       CK    CK   235   UL
...

Ideal df after cleaning

Subject Code  Test              Value  Units   Flag
1       NA    Sodium            147    mmol/L  NA
2       NA/K  Sodium Potassium  10.5   RATIO   NA
3       K     Potassium         4.7    mmol/L  NA
4       CK    Creatine Kinase   235    UL      NA
...

What I have tried

df <- read.csv(file="clinchemistry.csv", header = TRUE, sep=",", stringsAsFactors = FALSE)

df$df[df8$Test == "NA"] <- "Sodium"

df$df[df8$Code == "NA"] <- "Sodium"

and

df[is.na(lb$Code)]<-"Sodium"

lb[is.na(lb$Code)]<-"Sodium"

RESULTS:

Either all the sodium values disappear or I get an error:

Error in [<-.data.frame(*tmp*, is.na(lb$Tesst), value = "Sodium") : duplicate subscripts for columns

WOULD SOMEONE GUIDE MY THINKING?


Answer 1:


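Use na.strings="". By default, read.csv treats the string "NA" as a missing value; with na.strings="" only empty fields are read as missing, so the sodium code "NA" is kept as text.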

 df <- read.csv(file="clinchemistry.csv", 
     na.strings="", stringsAsFactors = FALSE)

(omitting arguments that are set to their default values)
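
To see the difference without the original file, here is a minimal sketch that reads the question's four rows from an inline string instead of clinchemistry.csv. The names csv_text, default_df and fixed_df are made up for this illustration, and the CSV layout is an assumption based on the table in the question.

```r
# Inline stand-in for clinchemistry.csv (layout assumed from the question)
csv_text <- "Subject,Code,Test,Value,Units,Flag
1,NA,NA,147,mmol/L,
2,NA/K,NA/K,10.5,RATIO,
3,K,K,4.7,mmol/L,
4,CK,CK,235,UL,"

# Default behaviour: the string "NA" is converted to a missing value
default_df <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# With na.strings = "": "NA" stays a string, empty fields become missing
fixed_df <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

print(is.na(default_df$Code[1]))  # sodium code was lost on import
print(fixed_df$Code)              # sodium code kept as the string "NA"
print(is.na(fixed_df$Flag))       # empty Flag fields are now missing
```

With the default settings the sodium code is already gone by the time the data frame exists, which is why comparisons such as Code == "NA" never match afterwards.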




Answer 2:


There are many ways to do what you want, depending on what it is that you want ;)

First, I create a minimal example data.frame:

df <- data.frame(Subject = 1:4,
                 Code = c(NA, "NA/K", "K", "CK"),
                 Test = c(NA, NA ,"K", "CR"))

Now, if for some reason your Sodium values are stored as NA (the missing value, not the string "NA"), you could do this. You really need an excellent reason to replace NAs, because in most cases this amounts to inventing data. But your reason might be a valid one ;)

# Replace missing values NA with string "Sodium"
# 
df$Code_fixed[is.na(df$Code)] <- "Sodium"

Or if you have a string "NA" that you want to change to "Sodium"

# Replace string "NA" with string "Sodium"
# 
df$Code_fixed[df$Code == "NA"] <- "Sodium"

Or if you want to replace the character sequence "NA" wherever it occurs inside a string

# Replace any occurrence of string "NA" with string "Sodium"
# 
df$Code_fixed <- gsub("NA", "Sodium", df$Code)
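
Note that an unanchored pattern also hits "NA" inside longer codes. The sketch below contrasts it with an anchored pattern that only matches elements that are exactly "NA". The vector codes and the "DNA" entry are hypothetical and only there to show the side effect.

```r
# Hypothetical codes; "DNA" is only here to show the side effect
codes <- c("NA", "NA/K", "DNA", NA)

# Unanchored pattern: every "NA" substring is replaced, even inside "DNA"
anywhere <- gsub("NA", "Sodium", codes)

# Anchored pattern: only elements that are exactly "NA" are replaced
whole_only <- gsub("^NA$", "Sodium", codes)

print(anywhere)
print(whole_only)
```

In both cases the missing value NA is left untouched, so it still has to be handled separately with is.na().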

Do all of the above:

# First initialize vector with characters already replaced
df$Code_fixed <- gsub("NA", "Sodium", df$Code)
df$Code_fixed[is.na(df$Code_fixed)] <- "Sodium"
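
If the goal is the full test names from the question's ideal table rather than a string substitution, one option is a named lookup vector. This is only a sketch and not part of the answer above: the name test_names is made up, the mapping is copied from the ideal table in the question, and the codes are assumed to have been read as strings (for example with na.strings = "").

```r
# Hypothetical lookup: names are the codes, values are the test names
# (mapping copied from the "ideal" table in the question)
test_names <- c("NA"   = "Sodium",
                "NA/K" = "Sodium Potassium",
                "K"    = "Potassium",
                "CK"   = "Creatine Kinase")

# Codes as they look when "NA" was kept as a string on import
df <- data.frame(Subject = 1:4,
                 Code = c("NA", "NA/K", "K", "CK"),
                 stringsAsFactors = FALSE)

# Index the lookup vector by code; unname() drops the code labels again
df$Test <- unname(test_names[df$Code])

print(df)
```

Codes that are not in the lookup vector come back as NA, which makes unmapped tests easy to spot.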


Source: https://stackoverflow.com/questions/60064695/keep-values-in-data-frame-na-sodium-in-chemistry-as-is
