I'm having trouble figuring out how to effectively map across multiple parameters and variables within a tbl to generate new variables.
In the "real" version, I basica
I think one of the issues making this task difficult is that the current setup might not be very "tidy". E.g., low.a, low.b, med.a, etc. appear to be examples of what I understand to be 'untidy' columns.
Below is one possible approach (which can probably be improved) that doesn't use a for loop or custom function at all. The key idea is to take the initial pracdf and expand the existing rows so there is one row for each "level" (i.e., low, med, and high). Doing this lets us calculate d for low, med, and high in a single step, with no for loops.
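To make the row expansion concrete, here is a minimal sketch on a toy two-row table. The names toy, levels_tbl, and expanded and all the values are made up for illustration, and it uses tidyr::crossing() to build every combination of rows, which is one of several ways to do this step.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two IDs instead of 26
toy <- tibble(ID = c("x", "y"),
              p = c(100, 200),
              a = c(0.5, 0.25),
              b = c(0.5, 0.5),
              c = c(1, 1))

# One row per level, with the multiplier for that level
levels_tbl <- tibble(level = c("low", "med", "high"),
                     level_val = c(0.8, 1.0, 1.2))

# crossing() pairs every row of toy with every row of levels_tbl,
# so d can be computed for all levels in one mutate()
expanded <- crossing(toy, levels_tbl) %>%
  mutate(d = p * (level_val * a) * (level_val * b) * c) %>%
  select(-level_val)

nrow(expanded)  # 2 IDs x 3 levels = 6 rows
```

Note that crossing() may return the rows in a different order than the input tables, so it is safer to filter on ID and level than to rely on row position.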
(Edited for readability and to include Jens Leerssen's suggestions)
library(dplyr)
library(tidyr)

set.seed(123)
pracdf <- tibble(ID = letters,
                 p = runif(26, 100, 1000),
                 a = runif(26),
                 b = runif(26),
                 c = runif(26))

# One row per level, with the multiplier applied at that level
levdf <- tibble(level = c("low", "med", "high"),
                level_val = c(0.8, 1.0, 1.2))

# pracdf and levdf share no columns, so merge() returns every
# combination of their rows (26 IDs x 3 levels = 78 rows)
tidy_df <- pracdf %>%
  merge(levdf) %>%
  mutate(d = p * (level_val * a) * (level_val * b) * c) %>%
  select(-level_val) %>%
  arrange(ID) %>%
  as_tibble()

tidy_df
#> # A tibble: 78 x 7
#>       ID        p         a         b         c level         d
#>    <chr>    <dbl>     <dbl>     <dbl>     <dbl> <chr>     <dbl>
#>  1     a 358.8198 0.5440660 0.7989248 0.3517979   low 35.116168
#>  2     a 358.8198 0.5440660 0.7989248 0.3517979   med 54.869013
#>  3     a 358.8198 0.5440660 0.7989248 0.3517979  high 79.011379
#>  4     b 809.4746 0.5941420 0.1218993 0.1111354   low  4.169914
#>  5     b 809.4746 0.5941420 0.1218993 0.1111354   med  6.515490
#>  6     b 809.4746 0.5941420 0.1218993 0.1111354  high  9.382306
#>  7     c 468.0792 0.2891597 0.5609480 0.2436195   low 11.837821
#>  8     c 468.0792 0.2891597 0.5609480 0.2436195   med 18.496595
#>  9     c 468.0792 0.2891597 0.5609480 0.2436195  high 26.635096
#> 10     d 894.7157 0.1471136 0.2065314 0.6680556   low 11.622957
#> # ... with 68 more rows
However, the result above might not be in the format you want for the final data. We can take care of this by gathering and spreading tidy_df with tidyr::gather and tidyr::spread.
tidy_df %>%
  gather(variable, value, a, b, d) %>%
  unite(level_variable, level, variable) %>%
  spread(level_variable, value)
#> # A tibble: 26 x 12
#>       ID        p         c     high_a     high_b     high_d      low_a
#>  * <chr>    <dbl>     <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
#>  1     a 358.8198 0.3517979 0.54406602 0.79892485  79.011379 0.54406602
#>  2     b 809.4746 0.1111354 0.59414202 0.12189926   9.382306 0.59414202
#>  3     c 468.0792 0.2436195 0.28915974 0.56094798  26.635096 0.28915974
#>  4     d 894.7157 0.6680556 0.14711365 0.20653139  26.151654 0.14711365
#>  5     e 946.4206 0.4176468 0.96302423 0.12753165  69.905442 0.96302423
#>  6     f 141.0008 0.7881958 0.90229905 0.75330786 108.778072 0.90229905
#>  7     g 575.2949 0.1028646 0.69070528 0.89504536  52.681362 0.69070528
#>  8     h 903.1771 0.4348927 0.79546742 0.37446278 168.480110 0.79546742
#>  9     i 596.2915 0.9849570 0.02461368 0.66511519  13.845603 0.02461368
#> 10     j 510.9533 0.8930511 0.47779597 0.09484066  29.775361 0.47779597
#> # ... with 16 more rows, and 5 more variables: low_b <dbl>, low_d <dbl>,
#> #   med_a <dbl>, med_b <dbl>, med_d <dbl>
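gather() and spread() still work, but tidyr 1.0.0 and later mark them as superseded in favor of pivot_longer() and pivot_wider(). As a rough sketch of the same long-then-wide reshaping with the newer functions, here is a version on a small stand-in table; toy_tidy, wide, and the values in them are made up for illustration and are not the data above.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for tidy_df: one row per ID and level
toy_tidy <- tibble(
  ID    = rep(c("x", "y"), each = 3),
  p     = rep(c(100, 200), each = 3),
  a     = rep(c(0.5, 0.25), each = 3),
  level = rep(c("low", "med", "high"), times = 2),
  d     = c(16, 25, 36, 8, 12.5, 18)
)

wide <- toy_tidy %>%
  # Stack a and d into name/value pairs (the gather() step)
  pivot_longer(c(a, d), names_to = "variable", values_to = "value") %>%
  # Build column names such as low_a and high_d
  unite(level_variable, level, variable) %>%
  # One column per level/variable pair (the spread() step)
  pivot_wider(names_from = level_variable, values_from = value)

wide
```

This assumes tidyr 1.0.0 or later is installed. The resulting columns appear in order of first appearance rather than alphabetically, so the column order can differ from what spread() produces.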