Convert data.frame columns from factors to characters

时光取名叫无心 2020-11-22 04:43

I have a data frame. Let's call him bob:

> head(bob)
                 phenotype                         exclusion
GSM399350 3- 4- 8- 25- 44+         


        
18 Answers
  • 2020-11-22 05:03

    To replace only factors:

    i <- sapply(bob, is.factor)
    bob[i] <- lapply(bob[i], as.character)
    

    Version 0.5.0 of the dplyr package introduced a new function, mutate_if:

    library(dplyr)
    bob %>% mutate_if(is.factor, as.character) -> bob
    

    ...and in version 1.0.0 it was superseded by across:

    library(dplyr)
    bob %>% mutate(across(where(is.factor), as.character)) -> bob
    

    Package purrr from RStudio gives another alternative:

    library(purrr)
    bob %>% modify_if(is.factor, as.character) -> bob
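
As a quick check of the base R approach at the top of this answer, here is a minimal sketch on a made-up data frame. The name `toy` and its columns are hypothetical stand-ins, not the asker's `bob`. Only the factor column should change class:

```r
# Hypothetical stand-in for bob: one factor column, one numeric column
toy <- data.frame(
  phenotype = factor(c("3- 4-", "8- 25-", "3- 4-")),
  score     = c(1.5, 2.0, 3.5)
)

# Logical index marking which columns are factors
i <- sapply(toy, is.factor)

# Convert only those columns; everything else is left untouched
toy[i] <- lapply(toy[i], as.character)

# Inspect the resulting column classes
classes <- sapply(toy, class)
print(classes)
```

The dplyr and purrr variants should give the same column classes on this input, assuming those packages are installed.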
    
  • 2020-11-22 05:03

    This function does the trick

    df <- stacomirtools::killfactor(df)
    
  • 2020-11-22 05:05

    This works by converting every column to character, then converting the columns that contain only numeric values back to numeric:

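    makenumcols <- function(df) {
      df <- as.data.frame(df)
      # Convert every column (factors included) to character
      df[] <- lapply(df, as.character)
      # A column counts as numeric if all of its non-NA values parse as numbers
      cond <- vapply(df, function(x) {
        x <- x[!is.na(x)]
        all(!is.na(suppressWarnings(as.numeric(x))))
      }, logical(1))
      numeric_cols <- names(df)[cond]
      # lapply keeps one list element per column, even for a single column or row
      df[numeric_cols] <- lapply(df[numeric_cols], as.numeric)
      df
    }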
    

    Adapted from: Get column types of excel sheet automatically
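
The detour through character in the function above matters because calling as.numeric directly on a factor returns its internal integer codes rather than the values it displays. A small illustration with a made-up factor (`f` is hypothetical):

```r
# A factor whose labels look like numbers
f <- factor(c("10", "20", "10"))

# Direct conversion yields the level codes, not the labels
codes <- as.numeric(f)

# Going through character recovers the actual numbers
values <- as.numeric(as.character(f))

print(codes)
print(values)
```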

  • 2020-11-22 05:09

    Update: Here's an example of something that doesn't work. I thought it would, but the stringsAsFactors argument only affects character vectors; it leaves existing factors alone.

    Try this:

    bob2 <- data.frame(bob, stringsAsFactors = FALSE)
    

    Generally speaking, whenever you're having problems with factors that should be characters, there's a stringsAsFactors setting somewhere to help you (including a global setting).
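
The behaviour described in the update can be checked with a short sketch. The data frames `toy`, `toy2` and `fresh` are made up for illustration:

```r
# A data frame that already contains a factor column
toy <- data.frame(f = factor(c("a", "b")), n = 1:2)

# Re-wrapping it does not convert columns that are already factors
toy2 <- data.frame(toy, stringsAsFactors = FALSE)
still_factor <- is.factor(toy2$f)

# The argument only matters when character vectors are first put in
fresh <- data.frame(s = c("a", "b"), stringsAsFactors = FALSE)
is_char <- is.character(fresh$s)

print(c(still_factor = still_factor, is_char = is_char))
```

So the setting helps when the data frame is being created or read in, but existing factor columns still need an explicit as.character.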

  • 2020-11-22 05:11

    The global option

    stringsAsFactors: The default setting for arguments of data.frame and read.table.

    may be something you want to set to FALSE in your startup files (e.g. ~/.Rprofile). Please see help(options).

  • 2020-11-22 05:11

    Maybe a newer option?

    library("tidyverse")
    
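    # mutate_if converts the factor columns in place;
    # group_by_if would additionally group bob by those columns
    bob <- bob %>% mutate_if(is.factor, as.character)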
    