Filling NA row values with nearest right side row value in R

I want to convert the given dataframe from

             c1     c2   c3   c4    c5
    VEG PUFF     12     78.43
CHICKEN PUFF &l


                      
2 answers

一生所求, 2020-12-21 19:38

Update

As there was a lot of confusion about the expected output, I am updating the answer, as suggested by @DavidArenburg, with a tidyverse solution

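library(dplyr)
library(tidyr)
library(tibble)
df %>%
  rownames_to_column("rowname") %>%      # add_rownames() is deprecated in dplyr
  gather(variable, value, -rowname) %>%  # reshape from wide to long
  filter(!is.na(value)) %>%              # drop the NA cells
  group_by(rowname) %>%
  mutate(indx = row_number()) %>%        # position of each remaining value within its row
  select(-variable) %>%
  spread(indx, value)                    # back to wide, one column per position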

#        rowname   `1`   `2`
#*        <chr> <dbl> <dbl>
#1 BAKERY_Total    28 84.04
#2 CHICKEN_PUFF    16 88.24
#3     VEG_PUFF    12 78.43




Another solution could be 

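library(data.table)
# build a one-row data.frame from the non-NA values of each row
temp <- apply(df, 1, function(x) data.frame(matrix(x[!is.na(x)], nrow = 1)))
# stack the rows; fill = TRUE pads shorter rows with NA
rbindlist(temp, fill = TRUE)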




Previous Answer

If I have understood you correctly, you are trying to replace each NA value in a row with the nearest non-NA value to its right in the same row

We can use na.locf from the zoo package with fromLast set to TRUE

library(zoo)
t(apply(df, 1, function(x) na.locf(x, fromLast = TRUE, na.rm = FALSE)))


#             c1 c2    c3    c4    c5
#VEG_PUFF     12 12 78.43 78.43 78.43
#CHICKEN_PUFF 16 16 88.24 88.24    NA
#BAKERY_Total 28 28 28.00 84.04 84.04
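
For readers without zoo installed, the same right-to-left fill can be sketched in base R. This is only an illustration, not part of the original answer: fill_from_right is a hypothetical helper name, and the data frame is an assumption reconstructed from the printed output above.

```r
# Assumed input, reconstructed from the printed result above (not given in the post)
df <- data.frame(
  c1 = c(NA_real_, NA_real_, NA_real_),
  c2 = c(12, 16, NA),
  c3 = c(NA, NA, 28),
  c4 = c(NA, 88.24, NA),
  c5 = c(78.43, NA, 84.04),
  row.names = c("VEG_PUFF", "CHICKEN_PUFF", "BAKERY_Total")
)

# Hypothetical helper: replace each NA with the nearest non-NA value to its
# right; trailing NAs have nothing to their right and stay NA.
fill_from_right <- function(x) {
  for (i in rev(seq_along(x))[-1]) {   # walk from the second-to-last element to the first
    if (is.na(x[i])) x[i] <- x[i + 1]
  }
  x
}

# apply() returns one column per input row, so transpose back
filled <- t(apply(as.matrix(df), 1, fill_from_right))
filled
```

Under these assumptions the result should match the na.locf output, including the NA that remains in c5 for CHICKEN_PUFF.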

                                                                        
                                                        
            
            
              
                
太阳男子, 2020-12-21 19:43

We can use na.omit

t(apply(df, 1, na.omit))
#             [,1]  [,2]
#VEG PUFF       12 78.43
#CHICKEN PUFF   16 88.24
#BAKERY Total   28 84.04


Update

Based on the Excel data shown

lst <- apply(df, 1, na.omit)
df2 <- do.call(rbind, lapply(lst, `length<-`, max(lengths(lst))))
row.names(df2) <- row.names(df)




Or another option is melt/dcast from data.table

library(data.table)
dcast(melt(setDT(df1, keep.rownames=TRUE), id.var = 'rn', 
         na.rm = TRUE), rn~ paste0("c", rowid(rn)), value.var = "value")
#             rn c1    c2  c3
#1: BAKERY Total 28 84.04  NA
#2: CHICKEN PUFF 16 88.24 143
#3:     VEG PUFF 12 78.43  NA


Here is a reproducible example:

df1 <- structure(list(c1 = c(NA, NA, NA), c2 = c(12L, 16L, NA), c3 = c(NA, 
NA, 28L), c4 = c(NA, 88.24, NA), c5 = c(78.43, 143, 84.04)), .Names = c("c1", 
"c2", "c3", "c4", "c5"), class = "data.frame", row.names = c("VEG PUFF", 
"CHICKEN PUFF", "BAKERY Total"))

lst <- lapply(seq_len(nrow(df1)), function(i) {
               x1 <- unlist(df1[i,])
               x1[complete.cases(x1)]})
df2 <- do.call(rbind, lapply(lst, `length<-`, max(lengths(lst))))
row.names(df2) <- row.names(df1)


The above approach is similar to the apply method, except that we can always be sure that it outputs a list. With apply, the result can vary: when every row has the same number of elements left after removing the NAs, it returns a matrix; otherwise it returns a list. So we loop over the sequence of rows, remove the NA elements, pad each list element with NA at the end so that all have the same length, and then rbind them.
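
A small sketch of that difference, using two made-up matrices (m_equal, m_ragged and the drop_na helper are illustrative names, not from the original post):

```r
# Toy inputs: in m_equal every row keeps 2 values, in m_ragged the rows keep 2 and 3
m_equal  <- matrix(c(NA, 1, 2,
                     3, NA, 4), nrow = 2, byrow = TRUE)
m_ragged <- matrix(c(NA, 1, 2,
                     3, 4, 5), nrow = 2, byrow = TRUE)

drop_na <- function(x) x[!is.na(x)]

res_equal  <- apply(m_equal, 1, drop_na)   # equal lengths: simplified to a matrix
res_ragged <- apply(m_ragged, 1, drop_na)  # unequal lengths: returned as a list

is.matrix(res_equal)
is.list(res_ragged)

# lapply always returns a list, so the padding step works in both cases
lst <- lapply(seq_len(nrow(m_ragged)), function(i) drop_na(m_ragged[i, ]))
padded <- do.call(rbind, lapply(lst, `length<-`, max(lengths(lst))))
padded
```

Here `length<-` extends the shorter vector with NA, which is why the first row of padded ends in NA.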



Or another option is which with arr.ind=TRUE

ind <- which(!is.na(df), arr.ind=TRUE)
matrix(df[ind[order(ind[,1]),]], ncol=2, byrow=TRUE, 
            dimnames = list(row.names(df), paste0("c", 1:2)))
#             c1    c2
#VEG PUFF     12 78.43
#CHICKEN PUFF 16 88.24
#BAKERY Total 28 84.04
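
To see why the indices are reordered, here is a minimal base R sketch on a matrix. The values are an assumption reconstructed from the outputs shown in this thread, and the approach only works when every row has exactly two non-NA values.

```r
# Assumed data: exactly two non-NA values per row
m <- matrix(c(NA, 12, NA, NA,    78.43,
              NA, 16, NA, 88.24, NA,
              NA, NA, 28, NA,    84.04),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("VEG PUFF", "CHICKEN PUFF", "BAKERY Total"),
                            paste0("c", 1:5)))

# which() scans column by column, so the (row, col) pairs come back in column order
ind <- which(!is.na(m), arr.ind = TRUE)

# sort by row, then by column, so each row's values sit together, left to right
ind <- ind[order(ind[, 1], ind[, 2]), ]

# a two-column index matrix extracts one element per (row, col) pair
out <- matrix(m[ind], ncol = 2, byrow = TRUE,
              dimnames = list(rownames(m), paste0("c", 1:2)))
out
```

If some rows held a different number of non-NA values, the fixed ncol = 2 would misalign the rows, so the padding approach shown earlier is the safer choice there.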

                                                                        
                                                        
            
            
              
                