Generate column values with multiple conditions in R


I have a dataframe z and I want to create the new column based on the values of two old columns of z. Following is the process:

&g


                      
8 Answers

暖寄归人, 2020-12-29 00:51

Generate a multiplier vector:

tt <- rep(1, max(z$x))
tt[2] <- 2
tt[4] <- 4
tt[7] <- 3


And here is your new column:

> z$t * tt[z$x]
 [1] 21 44 23 96 25 26 81 28 29 30

> z$q <- z$t * tt[z$x]
> z
    x  y  t  q
1   1 11 21 21
2   2 12 22 44
3   3 13 23 23
4   4 14 24 96
5   5 15 25 25
6   6 16 26 26
7   7 17 27 81
8   8 18 28 28
9   9 19 29 29
10 10 20 30 30


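This only works when z$x contains positive integers, because its values are used as positions in tt. It will not work if there are zero, negative, or non-integer values in z$x.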
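
When the keys are not guaranteed to be positive integers, one option is to look the multiplier up by value rather than by position. The following is a minimal sketch under that assumption: it rebuilds z from the printed output above and uses base R's match(). The names keys, mult and idx are illustrative and do not come from the original answer.

```r
# z as it appears in the printed output above
z <- data.frame(x = 1:10, y = 11:20, t = 21:30)

# Illustrative lookup table: key -> multiplier
keys <- c(2, 4, 7)
mult <- c(2, 4, 3)

# match() gives the position of each z$x in keys, or NA when there is no match
idx <- match(z$x, keys)

# Rows without a match fall back to a multiplier of 1
z$q <- z$t * ifelse(is.na(idx), 1, mult[idx])
```

Because match() compares values instead of indexing by them, the same code would also accept negative or character keys.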

Edited

Here is a generalization of the above. Instead of building the multiplier vector by hand, we write a function that takes the mapping as parameters and returns a lookup function.

We want to transform the following values:

2 -> 2
4 -> 4
7 -> 3


Otherwise a default of 1 is taken. 

Here is a function which generates the desired function:

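f <- function(default, x, y) {
  # Build a dense lookup table covering min(x)..max(x), filled with the default
  x.min <- min(x)
  x.max <- max(x)
  y.vals <- rep(default, x.max-x.min+1)
  # Overwrite the slots that correspond to the supplied keys
  y.vals[x-x.min+1] <- y

  # Return a closure that maps each element of z through the table
  function(z) {
    result <- rep(default, length(z))
    # Only values inside [x.min, x.max] can index the table; the rest keep the default
    tmp <- z>=x.min & z<=x.max
    result[tmp] <- y.vals[z[tmp]-x.min+1]
    result
  }
}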


Here is how we use it:

x <- c(2,4,7)
y <- c(2,4,3)

g <- f(1, x, y)


g is the function that we want.  It should be clear that any mapping can be supplied via the x and y parameters to f.

g(z$x)
## [1] 1 2 1 4 1 1 3 1 1 1

g(z$x)*z$t
## [1] 21 44 23 96 25 26 81 28 29 30


Because the values are used as positions in a lookup vector, this only works for integer values.
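
If non-integer keys matter, a similar function factory can be sketched with match() in place of positional indexing. This is an assumption-laden variant, not the answer's own method: make_lookup and g2 are hypothetical names, and match() compares values exactly, so floating-point keys that come out of a computation may fail to match.

```r
# make_lookup is a hypothetical helper, not part of the original answer
make_lookup <- function(default, x, y) {
  function(z) {
    idx <- match(z, x)          # position of each z in x, NA if absent
    out <- y[idx]
    out[is.na(idx)] <- default  # anything not found in x gets the default
    out
  }
}

g2 <- make_lookup(1, c(2, 4, 7), c(2, 4, 3))
g2(c(1, 2, 2.5, -4, 7))
## [1] 1 2 1 1 3
```

The returned closure keeps x, y and default in its enclosing environment, so it can be reused on any vector, in the same way as g above.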
                                                                        
                                                        
            
            
              
                
感动是毒, 2020-12-29 00:56

Here's a version of SQL's decode for character vectors in R (untested with factors) that operates just like the SQL version: it takes an arbitrary number of target/replacement pairs and an optional last argument that acts as a default value (note that the default won't overwrite NAs).

I can see it being pretty useful in conjunction with dplyr's mutate operation.

> x <- c("apple","apple","orange","pear","pear",NA)

> decode(x, apple, banana)
[1] "banana" "banana" "orange" "pear"   "pear"   NA      

> decode(x, apple, banana, fruit)
[1] "banana" "banana" "fruit"  "fruit"  "fruit"  NA      

> decode(x, apple, banana, pear, passionfruit)
[1] "banana"       "banana"       "orange"       "passionfruit" "passionfruit" NA            

> decode(x, apple, banana, pear, passionfruit, fruit)
[1] "banana"       "banana"       "fruit"        "passionfruit" "passionfruit" NA  


Here's the code I'm using, with a gist I'll keep up to date here (link).

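decode <- function(x, ...) {

  # Capture the arguments unevaluated so bare names such as apple work like strings
  args <- as.character(eval(substitute(alist(...))))

  # Even positions are replacements, odd positions are their targets
  replacements <- args[seq_along(args) %% 2 == 0]
  targets      <- args[seq_along(args) %% 2 == 1][seq_along(replacements)]

  # An unpaired last argument is the default for non-NA values matching no target
  if(length(args) %% 2 == 1)
    x[! x %in% targets & ! is.na(x)] <- tail(args,1)

  for(i in seq_along(targets))
    x <- ifelse(x == targets[i], replacements[i], x)

  return(x)

}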
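
One thing to be aware of: decode() applies its replacements one after another, so a replacement that happens to equal a later target would be replaced a second time. A possible single-pass variant is sketched below using match(). It is only an illustration: decode_once is a hypothetical name, and it takes targets and replacements as ordinary character vectors instead of bare names.

```r
# decode_once is a hypothetical helper; each element of x is replaced at most once
decode_once <- function(x, targets, replacements, default = NULL) {
  stopifnot(length(targets) == length(replacements))
  idx <- match(x, targets)             # NA where x matches no target
  hit <- !is.na(idx)
  out <- x
  out[hit] <- replacements[idx[hit]]
  if (!is.null(default))
    out[!hit & !is.na(x)] <- default   # like decode(), NAs are left alone
  out
}

x <- c("apple", "apple", "orange", "pear", "pear", NA)
decode_once(x, c("apple", "pear"), c("pear", "plum"))
## [1] "pear"   "pear"   "orange" "plum"   "plum"   NA
```

With the loop-based decode(), the same mapping would turn the apples into "plum", since the first pass produces "pear" and the second pass replaces it again.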

                                                                        
                                                        
            
            
              
                