R: Replace only some table values with values from an alternate table

Submitted by 梦想与她 on 2019-12-20 04:20:57

Question


This is not a "vlookup-and-fill-down" question.

My source data delivers all the data I need, just not in a usable form. Recent increases in volume mean that manual fixes are no longer feasible.

I have an inventory table and a services table. The inventory report does not contain purchase order data for services or non-inventory items. The services table (naturally) does. They are of course different shapes.

In pseudocode, it would be something like: for every inventory$Item that also appears in services$Item, replace inventory$onPO with services$onPO.

Sample Data

inv <- structure(list(Item = c("10100200", "10100201", "10100202", "10100203", 
"10100204", "10100205-A", "10100206", "10100207", "10100208", 
"10100209", "10100210"), onHand = c(600L, NA, 39L, 0L, NA, NA, 
40L, 0L, 0L, 0L, 0L), demand = c(3300L, NA, 40L, 40L, NA, NA, 
70L, 126L, 10L, 10L, 250L), onPO = c(2700L, NA, 1L, 40L, NA, 
NA, 30L, 126L, 10L, 10L, 250L)), .Names = c("Item", "onHand", 
"demand", "onPO"), row.names = c(NA, -11L), class = c("data.table", 
"data.frame"))

svc <- structure(list(Item = c("10100201", "10100204", "10100205-A"), 
    `Rcv'd` = c(0L, 0L, 44L), Backordered = c(20L, 100L, 18L)), .Names = c("Item", 
"Rcv'd", "Backordered"), row.names = c(NA, -3L), class = c("data.table", 
"data.frame"))

Answer 1:


Assuming you want to replace the NAs in onPO with values from Backordered, here is a solution using dplyr::left_join:

library(dplyr)
# Join on Item, fill missing onPO from Backordered, then drop the helper columns
left_join(inv, svc, by = "Item") %>%
    mutate(onPO = ifelse(is.na(onPO), Backordered, onPO)) %>%
    select(-Backordered, -`Rcv'd`)
#         Item onHand demand onPO
#1    10100200    600   3300 2700
#2    10100201     NA     NA   20
#3    10100202     39     40    1
#4    10100203      0     40   40
#5    10100204     NA     NA  100
#6  10100205-A     NA     NA   18
#7    10100206     40     70   30
#8    10100207      0    126  126
#9    10100208      0     10   10
#10   10100209      0     10   10
#11   10100210      0    250  250

Or a solution in base R using merge:

# merge() sorts its result by the key, so this assumes inv is already ordered by Item
inv$onPO <- with(merge(inv, svc, by = "Item", all.x = TRUE), ifelse(is.na(onPO), Backordered, onPO))
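
Because merge() sorts its result by the key by default, assigning the merged column back into inv relies on inv already being in that order. A lookup with match() is one way to avoid that dependency. The following is a minimal base R sketch on a trimmed-down, deliberately unsorted copy of the sample data; the names inv_small, svc_small and idx are illustrative and not part of the question:

```r
# Trimmed-down copies of the sample tables (illustrative subset, deliberately unsorted)
inv_small <- data.frame(
  Item = c("10100202", "10100201", "10100205-A", "10100200"),
  onPO = c(1L, NA, NA, 2700L),
  stringsAsFactors = FALSE
)
svc_small <- data.frame(
  Item = c("10100201", "10100205-A"),
  Backordered = c(20L, 18L),
  stringsAsFactors = FALSE
)

# For each inventory row, the position of its Item in svc_small (NA if absent)
idx <- match(inv_small$Item, svc_small$Item)

# Fill only the missing onPO values; existing values are left untouched
inv_small$onPO <- ifelse(is.na(inv_small$onPO),
                         svc_small$Backordered[idx],
                         inv_small$onPO)
```

Rows keep their original order, and items without a match in svc_small simply keep whatever onPO they had.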

Or using coalesce instead of ifelse (thanks to @thelatemail):

library(dplyr)
# coalesce() returns the first non-missing value at each position
left_join(inv, svc, by = "Item") %>%
    mutate(onPO = coalesce(onPO, Backordered)) %>%
    select(-Backordered, -`Rcv'd`)
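
If dplyr 1.0.0 or later is available, rows_patch() may express the same "fill only the NAs" idea without an explicit join. This is a hedged sketch rather than part of the original answer; it assumes every Item in the service table also exists in the inventory table (otherwise rows_patch() raises an error by default), and inv_small, svc_small, patch and result are illustrative names:

```r
library(dplyr)

# Trimmed-down copies of the sample tables (illustrative subset)
inv_small <- data.frame(
  Item = c("10100200", "10100201", "10100202"),
  onPO = c(2700L, NA, 1L),
  stringsAsFactors = FALSE
)
svc_small <- data.frame(
  Item = "10100201",
  `Rcv'd` = 0L,
  Backordered = 20L,
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# rows_patch() requires that the patch table only has columns present in the
# target, so keep the key and rename Backordered to onPO
patch <- select(svc_small, Item, onPO = Backordered)

# Overwrite only NA values in inv_small, matching rows on Item
result <- rows_patch(inv_small, patch, by = "Item")
```

Here result keeps the existing onPO values and fills the NA for item 10100201 from the patch table.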



Answer 2:


In data.table world, this is an "update-join". Join on "Item" and then update the values in the original set with the values from the new set:

library(data.table)
setDT(inv)
setDT(svc)

inv[svc, on="Item", c("onPO","onHand") := .(i.Backordered, `i.Rcv'd`)]

#inv   original table
#svc   update table
#on=   match on specified variable
# :=   overwrite  onPO    with  Backordered
#                 onHand  with  Rcv'd


#          Item onHand demand onPO
# 1:   10100200    600   3300 2700
# 2:   10100201      0     NA   20
# 3:   10100202     39     40    1
# 4:   10100203      0     40   40
# 5:   10100204      0     NA  100
# 6: 10100205-A     44     NA   18
# 7:   10100206     40     70   30
# 8:   10100207      0    126  126
# 9:   10100208      0     10   10
#10:   10100209      0     10   10
#11:   10100210      0    250  250
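
Note that the update-join above overwrites onPO and onHand for every matched item, whether or not the existing value was NA. If only missing values should be filled, one possible variant combines the update-join with data.table's fcoalesce(). This is a sketch on a trimmed-down copy of the sample data; inv_small and svc_small are illustrative names, and the Backordered value 99 is hypothetical, added only to show that an existing onPO is preserved:

```r
library(data.table)

# Trimmed-down copies of the sample tables (illustrative subset)
inv_small <- data.table(
  Item = c("10100200", "10100201", "10100202"),
  onPO = c(2700L, NA, 1L)
)
svc_small <- data.table(
  Item = c("10100201", "10100202"),
  Backordered = c(20L, 99L)  # 99 is a hypothetical value, not from the question
)

# Update-join that fills only missing onPO values:
# fcoalesce() returns the first non-NA of its arguments, element by element
inv_small[svc_small, on = "Item", onPO := fcoalesce(onPO, i.Backordered)]
```

Item 10100201 receives 20 from svc_small, while item 10100202 keeps its existing value of 1.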



Answer 3:


Starting with the tables:

> inv
          Item OnHand Demand OnPO
 1:   10100200    600   3300 2700
 2:   10100201     NA     NA   NA
 3:   10100202     39     40    1
 4:   10100203      0     40   40
 5:   10100204     NA     NA   NA
 6: 10100205-A     NA     NA   NA
 7:   10100206     40     70   30
 8:   10100207      0    126  126
 9:   10100208      0     10   10
10:   10100209      0     10   10
11:   10100210      0    250  250

> svc
         Item Rcv'd Backordered
1:   10100201     0          20
2:   10100204     0         100
3: 10100205-A    44          18

After far more cursing than I'd like to admit, the simple solution that works on both the test data above and my live data proved to be:

# Insert OnHand and OnPO data from svc
for (i in seq_len(nrow(inv))) {
  if (inv$Item[i] %in% svc$Item) {
    x <- which(svc$Item == inv$Item[i])  # row of the matching service item
    inv$OnPO[i] <- svc$Backordered[x]
    inv$OnHand[i] <- svc$`Rcv'd`[x]
  }
}
# Cleanup: replace the remaining NAs with 0
inv[is.na(inv)] <- 0

Is there a simpler or more obvious method that I've overlooked?




Answer 4:


We could use eat from my package safejoin, and "patch" the matches from the rhs into the lhs when columns conflict.

We rename Backordered to onPO on the way so the two columns conflict as desired.

# devtools::install_github("moodymudskipper/safejoin")
library(safejoin)
library(dplyr)

eat(inv, svc, onPO = Backordered, .conflict = "patch")
#          Item onHand demand onPO
# 1    10100200    600   3300 2700
# 2    10100201     NA     NA   20
# 3    10100202     39     40    1
# 4    10100203      0     40   40
# 5    10100204     NA     NA  100
# 6  10100205-A     NA     NA   18
# 7    10100206     40     70   30
# 8    10100207      0    126  126
# 9    10100208      0     10   10
# 10   10100209      0     10   10
# 11   10100210      0    250  250


Source: https://stackoverflow.com/questions/49374306/r-replace-only-some-table-values-with-values-from-alternate-table
