Using R to parse out SurveyMonkey CSV files

说谎 2021-02-05 12:01

I'm trying to analyse a large survey created with SurveyMonkey which has hundreds of columns in the CSV file, and the output format is difficult to use as the headers run over t

7 Answers
  •  臣服心动
    2021-02-05 12:54

    I have to deal with this pretty frequently, and having the headers spread over two rows is a bit painful. This function fixes that issue so that you only have a one-row header to deal with. It also joins the multipunch questions so the names follow a "top - bottom" pattern.

    #' @param x The path to a surveymonkey csv file
    fix_names <- function(x) {
      rs <- read.csv(
        x,
        nrows = 2,
        stringsAsFactors = FALSE,
        header = FALSE,
        check.names = FALSE, 
        na.strings = "",
        encoding = "UTF-8"
      )
    
      rs[rs == ""] <- NA
      rs[rs == "NA"] <- "Not applicable"
      rs[rs == "Response"] <- NA
      rs[rs == "Open-Ended Response"] <- NA
    
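      # Preallocate the result; `pre` holds the question text of the current group.
      nms <- rep(NA_character_, ncol(rs))
      pre <- NULL
    
      for(i in seq_len(ncol(rs))) {
    
        current_top <- rs[1,i]
        current_bottom <- rs[2,i]
    
        # Look one column ahead; past the last column there is nothing to read.
        if(i < ncol(rs)) {
          coming_top <- rs[1, i+1]
          coming_bottom <- rs[2, i+1]
        } else {
          coming_top <- NA
          coming_bottom <- NA
        }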
    
        if(is.na(coming_top) & !is.na(current_top) & (!is.na(current_bottom) | grepl("^Other", coming_bottom)))
          pre <- current_top
    
        if((is.na(current_top) & !is.na(current_bottom)) | (!is.na(current_top) & !is.na(current_bottom)))
          nms[i] <- paste0(c(pre, current_bottom), collapse = " - ")
    
        if(!is.na(current_top) & is.na(current_bottom))
          nms[i] <- current_top
    
      }
    
    
      nms
    }
    

    Note that it returns only the names. I typically read the data with read.csv(..., skip = 2, header = FALSE), save the result to a variable, and overwrite its names. It also helps a lot to set na.strings and stringsAsFactors = FALSE.

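    nms <- fix_names("path/to/csv")
    # skip = 2 drops both header rows; the names come from fix_names() instead
    d <- read.csv("path/to/csv", skip = 2, header = FALSE,
                  stringsAsFactors = FALSE, na.strings = "")
    names(d) <- nms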
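
    To try the read-then-rename workflow without a real export, here is a minimal self-contained sketch. The sample CSV and the helper collapse_header() are hypothetical, made up for illustration. The helper is a simplified alternative to fix_names above, not a copy of it: it forward-fills the question text from the first header row and appends the option label from the second row wherever one exists, so it assumes the very first header cell is not blank.

```r
# Hypothetical sample: a tiny export using a two-row header layout.
csv_lines <- c(
  "RespondentID,Which fruits do you like?,,,How old are you?",
  ",Apple,Banana,Other (please specify),Response",
  "1001,Apple,,Kiwi,34",
  "1002,,Banana,,27"
)
path <- tempfile(fileext = ".csv")
writeLines(csv_lines, path)

# Hypothetical helper (not the answer's fix_names): forward-fill the question
# text in row 1, then append the option label from row 2 where one exists.
collapse_header <- function(path) {
  hdr <- read.csv(path, nrows = 2, header = FALSE, stringsAsFactors = FALSE,
                  na.strings = "", colClasses = "character")
  top <- unlist(hdr[1, ], use.names = FALSE)
  bottom <- unlist(hdr[2, ], use.names = FALSE)

  # Generic placeholders carry no information, so treat them as missing.
  bottom[bottom %in% c("Response", "Open-Ended Response")] <- NA

  # Forward-fill: each blank top cell inherits the last question seen.
  filled <- top
  for (i in seq_along(filled)[-1]) {
    if (is.na(filled[i])) filled[i] <- filled[i - 1]
  }

  ifelse(is.na(bottom), filled, paste(filled, bottom, sep = " - "))
}

nms <- collapse_header(path)

# Read the data rows only, then attach the collapsed names.
d <- read.csv(path, skip = 2, header = FALSE, stringsAsFactors = FALSE,
              na.strings = "")
names(d) <- nms
print(names(d))
```

    Unlike fix_names, this sketch prefixes every labelled column with its question text, including a single-column question whose second-row label is not a placeholder. Whether that naming suits your export is a judgment call.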
    
