问题
I'm aware of the general ins and outs of the different OO systems in R and I'm all for staying in S3 whenever/as long as I can.
However, the need for multiple dispatch keeps haunting me in professional projects and kept pulling me into S4 in the past.
But I refuse to accept that I can't find a S3 emulation of multi dispatch à la S4 that is at least able to support my primary use case of avoiding unsystematic data drift:
- My analytical programs create data structures that I publish to stakeholders for them to consume as part of some data pipelines
- Usually I publish to a DBMS such as MongoDB, thus my structures somewhat manifest themselves and could/should be thought of as DB schemas
- I follow a "release early, release often" approach to stay agile, so It's natural that "things change" - often and over time
- As data structures eventually evolve, I want to be explicit about describing the structure so I can define internal migration routines and/or make stakeholders aware of which version of data structures they are dealing with in the DB
If I go about describing my data structures via some versioning scheme (which is the best I could think of so far), I basically need to base my methods on three signature arguments regarding method dispatch:
- One for the input (class) of a given data transformation task
- One for the version of the input (class)
- One for the version of the output (class)
Due diligence
Apart from the OO field guide I've read:
- Multiple dispatch for subset methods in R
- Multi dispatch gist by Winston Chang
My approach
Sequential S3-based approach for an input_class
and an output_class
that both can evolve over time structure-wise, which is captures by respective version classes such as v1
, v2
etc.
Step 1: Define necessary base components
# Generic that dispatches on `x`
new_output_class <- function(x, schema_version_in, schema_version_out, ...) {
UseMethod("new_output_class", x)
}
# Generic that dispatches on 'schema_version_in' and
# can thus be used to define distinct methods for
# different versions of 'input_class'
new_output_class__input_class <- function(x, schema_version_in,
schema_version_out, ...) {
UseMethod("new_output_class__input_class", schema_version_in)
}
# Generic that dispatches on 'schema_version_out' and
# can thus be used to define distinct methods for
# different versions of 'output_class' for inputs of class `input_class`
# with version class `v1`
new_output_class__input_class__v1 <- function(x, schema_version_in,
schema_version_out, ...) {
UseMethod("new_output_class__input_class__v1", schema_version_out)
}
# Top-level method for dispatch based on `input_class`
new_output_class.input_class <- function(
x,
schema_version_in,
schema_version_out
) {
message("Level 1: method for `input_class`")
new_output_class__input_class(x, schema_version_in, schema_version_out)
}
Step 2: Define methods for the initial case of input_class = v1
and output_class = v1
library(magrittr)
library(stringr)
# Method for `input_class` of version `v1`
new_output_class__input_class.v1 <- function(
x,
schema_version_in,
schema_version_out
) {
# message("Sub-level 1: `input_class/in:v1`")
message(stringr::str_glue("Level 2: method for `input_class/in:{schema_version_in}`"))
new_output_class__input_class__v1(x, schema_version_in, schema_version_out)
}
# Method for `output_class` of version `v1` for inputs
# of `input_class` of version `v1`
new_output_class__input_class__v1.v1 <- function(
x,
schema_version_in,
schema_version_out
) {
message(stringr::str_glue("Level 3: method for `input_class/in:{schema_version_in}/out:{schema_version_out}`"))
structure(x,
class = c(stringr::str_glue("output_class_{schema_version_out}"),
"output_class")
)
}
x <- structure(letters[1:3], class = "input_class")
schema_version_of_x <- structure("v1", class = "v1")
schema_version_of_new_output <- structure("v1", class = "v1")
new_output_class(x, schema_version_of_x, schema_version_of_new_output)
# Level 1: method for `input_class`
# Level 2: method for `input_class/in:v1`
# Level 3: method for `input_class/in:v1/out:v1`
# [1] "a" "b" "c"
# attr(,"class")
# [1] "output_class_v1" "output_class"
Step 3: Handling evolution of output_class
to v2
# Method for `output_class` of version `v2` for inputs
# of `input_class` of version `v1`
new_output_class__input_class__v1.v2 <- function(
x,
schema_version_in,
schema_version_out
) {
message(stringr::str_glue("Level 3: method for `input_class/in:{schema_version_in}/out:{schema_version_out}`"))
structure(x %>% rep(2),
class = c(stringr::str_glue("output_class_{schema_version_out}"),
"output_class")
)
}
schema_version_of_new_output <- structure("v2", class = "v2")
new_output_class(x, schema_version_of_x, schema_version_of_new_output)
# Level 1: method for `input_class`
# Level 2: method for `input_class/in:v1`
# Level 3: method for `input_class/in:v1/out:v2`
# [1] "a" "b" "c" "a" "b" "c"
# attr(,"class")
# [1] "output_class_v2" "output_class"
Step 4: Handling evolution of input_class
to v2
# Generic for input `input_class` of version `v2`
new_output_class__input_class__v2 <- function(x, schema_version_in,
schema_version_out, ...) {
UseMethod("new_output_class__input_class__v2", schema_version_out)
}
# Method for `input_class` of version `v2`
new_output_class__input_class.v2 <- function(
x,
schema_version_in,
schema_version_out
) {
message(stringr::str_glue("Level 2: method for `input_class/in:{schema_version_in}`"))
new_output_class__input_class__v2(x, schema_version_in, schema_version_out)
}
# Method for `output_class` of version `v1` for inputs
# of `input_class` of version `v2`
new_output_class__input_class__v2.v1 <- function(
x,
schema_version_in,
schema_version_out
) {
message(stringr::str_glue("Level 3: method for `input_class/in:{schema_version_in}/out:{schema_version_out}`"))
structure(x %>% tolower(),
class = c(stringr::str_glue("output_class_{schema_version_out}"),
"output_class")
)
}
x <- structure(LETTERS[1:3], class = "input_class")
schema_version_of_x <- structure("v2", class = "v2")
schema_version_of_new_output <- structure("v1", class = "v1")
new_output_class(x, schema_version_of_x, schema_version_of_new_output)
# Level 1: method for `input_class`
# Level 2: method for `input_class/in:v2`
# Level 3: method for `input_class/in:v2/out:v1`
# [1] "a" "b" "c"
# attr(,"class")
# [1] "output_class_v1" "output_class"
# Method for `output_class` of version `v2` for inputs
# of `input_class` of version `v2`
new_output_class__input_class__v2.v2 <- function(
x,
schema_version_in,
schema_version_out
) {
message(stringr::str_glue("Level 3: method for `input_class/in:{schema_version_in}/out:{schema_version_out}`"))
structure(x %>% tolower() %>% rep(2),
class = c(stringr::str_glue("output_class_{schema_version_out}"),
"output_class")
)
}
schema_version_of_x <- structure("v2", class = "v2")
schema_version_of_new_output <- structure("v2", class = "v2")
new_output_class(x, schema_version_of_x, schema_version_of_new_output)
# Level 1: method for `input_class`
# Level 2: method for `input_class/in:v2`
# Level 3: method for `input_class/in:v2/out:v2`
# [1] "a" "b" "c" "a" "b" "c"
# attr(,"class")
# [1] "output_class_v2" "output_class"
Pros
- It works
- It's S3
- It's systematic
Cons
- It's even more verbose than any S4-based solution
- I don't like it at all
Did anyone come up with a better emulation of multi dispatch in S3 yet?
来源:https://stackoverflow.com/questions/57903831/emulating-multiple-dispatch-in-s3-for-3-signature-arguments