Replace all occurrences of a string in a data frame

面向向阳花 2020-11-27 02:54

I'm working on a data frame that has non-detects which are coded with '<'. Sometimes there is a space after the '<' and sometimes not, e.g. '<2' or '< 2'

6 Answers
  • 2020-11-27 03:15

    Here is a dplyr solution

    library(dplyr)
    library(stringr)
    
    Censor_consistently <-  function(x){
      str_replace(x, '^\\s*([<>])\\s*(\\d+)', '\\1\\2')
    }
    
    
    test_df <- tibble(x = c('0.001', '<0.002', ' < 0.003', ' >  100'),  y = 4:1)
    
    # funs() is deprecated; across() needs dplyr >= 1.0.0
    mutate(test_df, across(everything(), Censor_consistently))
    
    # A tibble: 4 × 2
    x     y
    <chr> <chr>
    1  0.001     4
    2 <0.002     3
    3 <0.003     2
    4   >100     1
    
  • 2020-11-27 03:19

    If you are only looking to replace all occurrences of "< " (with space) with "<" (no space), then you can do an lapply over the data frame, with a gsub for replacement:

    > data <- data.frame(lapply(data, function(x) {
    +                  gsub("< ", "<", x)
    +              }))
    > data
      name var1 var2
    1    a   <2   <3
    2    a   <2   <3
    3    a   <2   <3
    4    b   <2   <3
    5    b   <2   <3
    6    b   <2   <3
    7    c   <2   <3
    8    c   <2   <3
    9    c   <2   <3
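
    If the data frame also has numeric columns, note that running gsub over every column returns all of them as character. A minimal sketch that only touches character columns and also copes with more than one space after the "<" (the example data and the helper name is_chr are made up for illustration):

    ```r
    # Hypothetical example data; column names are assumptions for illustration.
    data <- data.frame(
      name = c("a", "b", "c"),
      var1 = c("< 2", "<2", "<  2"),
      conc = c(1.5, 2.5, 3.5),
      stringsAsFactors = FALSE
    )

    # Flag the character columns so numeric columns are left alone.
    is_chr <- vapply(data, is.character, logical(1))

    # "<\\s+" matches "<" followed by one or more whitespace characters.
    data[is_chr] <- lapply(data[is_chr], function(x) gsub("<\\s+", "<", x))

    data$var1
    ```

    Assigning into data[is_chr] keeps the result a data frame and leaves conc numeric.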
    
  • 2020-11-27 03:19

    This is equivalent to "find and replace", so don't overthink it.

    Try it with one column first:

    library(tidyverse)
    df <- data.frame(name = rep(letters[1:3], each = 3), var1 = rep('< 2', 9), var2 = rep('<3', 9))
    
    df %>% 
      mutate(var1 = str_replace(var1, " ", ""))
    #>   name var1 var2
    #> 1    a   <2   <3
    #> 2    a   <2   <3
    #> 3    a   <2   <3
    #> 4    b   <2   <3
    #> 5    b   <2   <3
    #> 6    b   <2   <3
    #> 7    c   <2   <3
    #> 8    c   <2   <3
    #> 9    c   <2   <3
    

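    Then apply it to all columns (funs() is deprecated, so this uses across(), which needs dplyr >= 1.0.0). Note that str_replace only removes the first space in each string; use str_replace_all if there can be more than one.

    df %>%
      mutate(across(everything(), ~ str_replace(.x, " ", "")))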
    #>   name var1 var2
    #> 1    a   <2   <3
    #> 2    a   <2   <3
    #> 3    a   <2   <3
    #> 4    b   <2   <3
    #> 5    b   <2   <3
    #> 6    b   <2   <3
    #> 7    c   <2   <3
    #> 8    c   <2   <3
    #> 9    c   <2   <3
    

    If the extra space was produced by uniting columns, think about making str_trim part of your workflow.

    Created on 2018-03-11 by the reprex package (v0.2.0).

  • 2020-11-27 03:21

    I had a similar problem: I had to replace "Not Available" with NA. My solution goes like this:

    # data[] <- lapply(...) keeps the data frame; sapply would return a character matrix
    data[] <- lapply(data, function(x) gsub("Not Available", NA, x))
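
    One thing to watch with gsub here: it matches anywhere inside a string, so a cell that merely contains "Not Available" is turned into NA too, and every column comes back as character. If only exact matches should become NA, a comparison-based replacement is an alternative. A small sketch, with made-up example data:

    ```r
    # Hypothetical example data; column names are assumptions for illustration.
    data <- data.frame(
      site  = c("A", "B", "C"),
      value = c("1.2", "Not Available", "3.4"),
      stringsAsFactors = FALSE
    )

    # data == "Not Available" is a logical matrix marking exact matches only.
    data[data == "Not Available"] <- NA

    # With the placeholder gone, the column can be converted to numeric.
    data$value <- as.numeric(data$value)
    ```

    This assumes the data frame has no NA values before the replacement; if it does, the comparison itself yields NA and needs extra handling.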
    
  • 2020-11-27 03:30

    To remove all spaces in every column, you can use

    data[] <- lapply(data, gsub, pattern = " ", replacement = "", fixed = TRUE)
    

    or, to restrict this to just the second and third columns (i.e. every column except the first),

    data[-1] <- lapply(data[-1], gsub, pattern = " ", replacement = "", fixed = TRUE)
    
  • 2020-11-27 03:34

    Late to the party, but if you only want to get rid of leading/trailing whitespace, base R has the function trimws.

    For example:

    # Wrapped in as.data.frame() directly so no pipe package is required
    data <- as.data.frame(apply(X = data, MARGIN = 2, FUN = trimws))
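
    Keep in mind that apply goes through a matrix, so every column comes back as character. A sketch that trims only the character columns and keeps the other column types (example data and the name is_chr are made up for illustration):

    ```r
    # Hypothetical example data; column names are assumptions for illustration.
    data <- data.frame(
      name = c(" a", "b ", " c "),
      n    = c(1L, 2L, 3L),
      stringsAsFactors = FALSE
    )

    # Trim leading/trailing whitespace in character columns only.
    is_chr <- vapply(data, is.character, logical(1))
    data[is_chr] <- lapply(data[is_chr], trimws)

    str(data)
    ```

    trimws only touches the ends of each string, so a space in the middle (as in '< 2') is left alone.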
    