My data looks as follows:
DF <- structure(list(No_Adjusted_Gross_Income = c(183454, 241199, 249506
), NoR_from_1_to_5000 = c(1035373, 4272260, 1124098), NoR_f
The OP has asked in a comment for a grouping variable.
Although the accepted answer apparently does what the OP initially asked for, I would like to suggest a completely different approach in which the data is stored and processed in tidy (long) format. IMHO, processing data in long format is much more straightforward and flexible, and this includes aggregation and grouping.
For this, the dataset is reshaped from wide, Excel-style format to long, SQL-style format by
library(data.table)
col <- "NoR"
# patterns() is only supported by data.table's melt method, so coerce DF first
long <- melt(as.data.table(DF), measure.vars = patterns(col), value.name = col, variable.name = "range")
long[, range := stringr::str_remove(range, paste0(col, "_"))]
long
   No_Adjusted_Gross_Income              range     NoR
1:                   183454     from_1_to_5000 1035373
2:                   241199     from_1_to_5000 4272260
3:                   249506     from_1_to_5000 1124098
4:                   183454 from_5000_to_10000  319540
5:                   241199 from_5000_to_10000 4826042
6:                   249506 from_5000_to_10000 1959866
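Since the definition of DF is cut off above, here is a minimal, self-contained sketch of the reshaping step. The values of the third column are an assumption taken from the printed output, and base R's `sub()` is swapped in for `stringr::str_remove()` only to avoid the extra dependency.

```r
library(data.table)

# Toy reconstruction of DF; the third column is copied from the printed
# output above (an assumption, the original definition is truncated)
DF <- data.frame(
  No_Adjusted_Gross_Income = c(183454, 241199, 249506),
  NoR_from_1_to_5000       = c(1035373, 4272260, 1124098),
  NoR_from_5000_to_10000   = c(319540, 4826042, 1959866)
)

col <- "NoR"

# Wide to long: every column whose name matches "NoR" becomes a row
long <- melt(as.data.table(DF), measure.vars = patterns(col),
             value.name = col, variable.name = "range")

# Strip the "NoR_" prefix so that only the range label remains
long[, range := sub(paste0("^", col, "_"), "", range)]
long
```

Each of the three input rows now appears once per range, so `long` has six rows.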
In tidy (long) format there is one row for each observation and one column for each variable (see Chapter 12.2 of Hadley Wickham's book R for Data Science).
The vector of multipliers val also needs to be reshaped from wide to long format:
valDF <- long[, .(range = unique(range), val)]
valDF
                range    val
1:     from_1_to_5000 2500.5
2: from_5000_to_10000 7500.0
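The construction above pairs `unique(range)` with `val` purely by position. If that ordering is a concern, one possible alternative is to name the multipliers explicitly and build the lookup table from the names. This is only a sketch: the named vector is hypothetical, with the values taken from the output shown here.

```r
library(data.table)

# Hypothetical named vector; the names must match the labels in long$range
val <- c(from_1_to_5000 = 2500.5, from_5000_to_10000 = 7500)

# One row per range, built from the names rather than from row order
valDF <- data.table(range = names(val), val = unname(val))
valDF
```

With this variant the mapping between range label and multiplier stays correct even if the rows of `long` are reordered.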
Now, valDF is also in tidy format as there is one row for each range.
Finally, we can add a new column AGI to long by an update join:
long[valDF, on = "range", AGI := val * NoR][]
   No_Adjusted_Gross_Income              range     NoR         AGI
1:                   183454     from_1_to_5000 1035373  2588950187
2:                   241199     from_1_to_5000 4272260 10682786130
3:                   249506     from_1_to_5000 1124098  2810807049
4:                   183454 from_5000_to_10000  319540  2396550000
5:                   241199 from_5000_to_10000 4826042 36195315000
6:                   249506 from_5000_to_10000 1959866 14698995000
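To make the update join less of a black box, the following sketch rebuilds the two tables from the printed output (an assumption, as the original DF is truncated) and checks the result against a plain lookup by name. The `lookup` and `check` objects are only there for illustration.

```r
library(data.table)

# Long-format data and multipliers, copied from the printed output above
long <- data.table(
  No_Adjusted_Gross_Income = rep(c(183454, 241199, 249506), times = 2),
  range = rep(c("from_1_to_5000", "from_5000_to_10000"), each = 3),
  NoR   = c(1035373, 4272260, 1124098, 319540, 4826042, 1959866)
)
valDF <- data.table(range = c("from_1_to_5000", "from_5000_to_10000"),
                    val   = c(2500.5, 7500))

# Update join: rows of long are matched to valDF on "range" and the new
# column AGI is added to long by reference
long[valDF, on = "range", AGI := val * NoR]

# The same multiplication via a named lookup vector, for comparison
lookup <- setNames(valDF$val, valDF$range)
check  <- long$NoR * unname(lookup[long$range])
stopifnot(isTRUE(all.equal(long$AGI, check)))
```

The join form avoids materialising a merged copy of the data, which matters more as the tables grow.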
If required for presentation, the dataset can be reshaped back from long to wide format:
dcast(long, No_Adjusted_Gross_Income ~ range, value.var = c("NoR", "AGI"))
   No_Adjusted_Gross_Income NoR_from_1_to_5000 NoR_from_5000_to_10000 AGI_from_1_to_5000 AGI_from_5000_to_10000
1:                   183454            1035373                 319540         2588950187             2396550000
2:                   241199            4272260                4826042        10682786130            36195315000
3:                   249506            1124098                1959866         2810807049            14698995000
which reproduces OP's expected result. Note that the variable names vn are created automagically.
Aggregation and grouping can be performed while reshaping (here every cell holds exactly one value, so sum leaves the numbers unchanged):
dcast(long, No_Adjusted_Gross_Income ~ range, sum, value.var = c("NoR", "AGI"))
   No_Adjusted_Gross_Income NoR_from_1_to_5000 NoR_from_5000_to_10000 AGI_from_1_to_5000 AGI_from_5000_to_10000
1:                   183454            1035373                 319540         2588950187             2396550000
2:                   241199            4272260                4826042        10682786130            36195315000
3:                   249506            1124098                1959866         2810807049            14698995000
or
dcast(long, No_Adjusted_Gross_Income ~ ., sum, value.var = c("NoR", "AGI"))
   No_Adjusted_Gross_Income     NoR         AGI
1:                   183454 1354913  4985500187
2:                   241199 9098302 46878101130
3:                   249506 3083964 17509802049
Alternatively, aggregation & grouping can be performed in long format:
long[, lapply(.SD, sum), .SDcols = c("NoR", "AGI"), by = No_Adjusted_Gross_Income]
   No_Adjusted_Gross_Income     NoR         AGI
1:                   183454 1354913  4985500187
2:                   241199 9098302 46878101130
3:                   249506 3083964 17509802049
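Because range is an ordinary column in long format, it can serve as the grouping variable as well. A minimal sketch, again assuming the data shown in the printed output, with AGI recomputed from the multipliers 2500.5 and 7500:

```r
library(data.table)

# Long-format data rebuilt from the printed output above (an assumption)
long <- data.table(
  No_Adjusted_Gross_Income = rep(c(183454, 241199, 249506), times = 2),
  range = rep(c("from_1_to_5000", "from_5000_to_10000"), each = 3),
  NoR   = c(1035373, 4272260, 1124098, 319540, 4826042, 1959866)
)
long[, AGI := NoR * rep(c(2500.5, 7500), each = 3)]

# Totals per range instead of per No_Adjusted_Gross_Income
by_range <- long[, lapply(.SD, sum), .SDcols = c("NoR", "AGI"), by = range]
by_range
```

Switching the grouping variable only requires changing the `by` argument. In wide format the same question would mean summing over a different set of columns each time.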