Question
I'd like to train a convolutional network to solve a multi-class, multi-label problem on image data. Due to the nature of the data, and for reasons I'll spare you, it would be best if I could feed a custom R generator function to the fit_generator command, instead of using its built-in image_data_generator and flow_images_from_directory commands (which I was able to get working, just not for this particular problem).
Here (https://www.rdocumentation.org/packages/keras/versions/2.2.0/topics/fit_generator) it says that I can do just that, without giving any examples. So I tried the following. Here is an extremely stripped down example of what I'm trying to do (this code is entirely self contained):
library(keras)
library(reticulate) #for py_iterator function

play.network = keras_model_sequential() %>%
  layer_dense(units = 10, activation = "relu", input_shape = c(10)) %>%
  layer_dense(units = 1, activation = "relu")

play.network %>% compile(
  optimizer = "rmsprop",
  loss = "mse"
)

mikes.custom.generator.function = function() #generates a 2-list of a random 1 x 10 array, and a scalar
{
  new.func = function()
  {
    arr = array(dim = c(1,10))
    arr[,] = sample(1:10, 10, replace = TRUE)/10
    return(list(arr,runif(1)))
  }
}

mikes.custom.iterator = py_iterator(mikes.custom.generator.function()) #creates a python iterator object

generator_next(mikes.custom.iterator) #correctly returns a 2-member list consisting of a 1 x 10 array, and a scalar
generator_next(mikes.custom.iterator)[[1]] #a 1 x 10 array
generator_next(mikes.custom.iterator)[[2]] #a scalar

#try to fit with "fit_generator":
play.network %>% fit_generator( #FREEZES.
  mikes.custom.iterator,
  steps_per_epoch = 1,
  epochs = 1
)
The thing freezes at training time, without giving me an error message or anything. I also tried it with a custom image data generator for my original problem, same result.
Note that this network trains just fine if I just use fit and input the training data manually:
play.network %>% fit(generator_next(mikes.custom.iterator)[[1]],generator_next(mikes.custom.iterator)[[2]], epochs = 1, batch_size = 1)
#trains just fine
I think I know the problem, but I don't know the solution. If you ask it for the class of my custom iterator, it gives
class(mikes.custom.iterator)
[1] "python.builtin.iterator" "rpytools.generator.RGenerator" "python.builtin.object"
whereas if I build an iterator using the built-in image_data_generator and flow_images_from_directory commands, it gives
train_datagen <- image_data_generator(rescale = 1/255)
class(train_datagen)
[1] "keras.preprocessing.image.ImageDataGenerator" "keras_preprocessing.image.ImageDataGenerator" "python.builtin.object"
train_generator <- flow_images_from_directory(
  train_dir,
  train_datagen,
  ....
)
class(train_generator)
[1] "python.builtin.iterator" "keras_preprocessing.image.DirectoryIterator" "keras_preprocessing.image.Iterator" "tensorflow.python.keras.utils.data_utils.Sequence" "python.builtin.object"
So my guess is that train_datagen and/or train_generator have attributes that mikes.custom.iterator does not, and fit_generator is trying to call mikes.custom.iterator using functions other than the basic generator_next (which is in theory all it should really need). But I don't know what they might be, or how to build mikes.custom.iterator correctly, even after searching online for two hours.
Help anyone?
Answer 1:
sampling_generator <- function(X_data, Y_data, batch_size) {
  function() {
    rows <- sample(1:nrow(X_data), batch_size, replace = TRUE)
    list(X_data[rows,], Y_data[rows,])
  }
}

model %>%
  fit_generator(sampling_generator(X_train, Y_train, batch_size = 128),
                steps_per_epoch = nrow(X_train) / 128, epochs = 10)
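I found this answer in the R keras FAQ, and it seems to work. Note that the generator here is a plain R function passed directly to fit_generator, with no py_iterator wrapper:
https://keras.rstudio.com/articles/faq.html#how-can-i-use-keras-with-datasets-that-dont-fit-in-memory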
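The FAQ pattern can be checked without training anything. The sketch below builds the same kind of closure on toy matrices and inspects one batch. The toy data (X_train, Y_train) and the batch size are assumptions made up for illustration, and `drop = FALSE` is an addition to the FAQ code so that a one-column target stays a matrix instead of collapsing to a vector.

```r
# A generator in the FAQ style: a function that returns a function.
sampling_generator <- function(X_data, Y_data, batch_size) {
  function() {
    rows <- sample(seq_len(nrow(X_data)), batch_size, replace = TRUE)
    # drop = FALSE (not in the FAQ code) keeps both results as matrices
    list(X_data[rows, , drop = FALSE], Y_data[rows, , drop = FALSE])
  }
}

set.seed(1)
X_train <- matrix(runif(200), nrow = 20, ncol = 10)  # hypothetical inputs
Y_train <- matrix(runif(20), nrow = 20, ncol = 1)    # hypothetical targets

gen <- sampling_generator(X_train, Y_train, batch_size = 4)
batch <- gen()

dim(batch[[1]])  # batch of inputs: batch_size rows, 10 columns
dim(batch[[2]])  # batch of targets: batch_size rows, 1 column
```

Following the FAQ, it is gen itself (rather than py_iterator(gen)) that would then be handed to fit_generator.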
Answer 2:
In R, you can build an iterator using the <<- operator. This is very helpful for building a custom generator function, and the result is compatible with Keras' fit_generator() function.
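To make the role of <<- concrete, here is a tiny sketch in base R. The names make_counter and make_broken_counter are hypothetical helpers used only for illustration. With <<-, the assignment updates the variable in the enclosing function, so state survives between calls; with a plain <-, each call only changes a local copy.

```r
# State kept in the enclosing environment via <<-
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # updates count in make_counter's environment
    count
  }
}

# Same shape, but with a local assignment: the state never advances
make_broken_counter <- function() {
  count <- 0
  function() {
    count <- count + 1   # creates a new local count on every call
    count
  }
}

counter <- make_counter()
first <- counter()
second <- counter()

broken <- make_broken_counter()
broken_first <- broken()
broken_second <- broken()

c(first, second)                # state advances between calls
c(broken_first, broken_second)  # same value both times
```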
A minimal example:
# example data
data <- data.frame(
  x = runif(80),
  y = runif(80),
  z = runif(80)
)

# example generator
data_generator <- function(data, x, y, batch_size) {
  # start iterator
  i <- 1
  # return an iterator function
  function() {
    # reset iterator if already seen all data
    if ((i + batch_size - 1) > nrow(data)) i <<- 1
    # iterate current batch's rows
    rows <- c(i:min(i + batch_size - 1, nrow(data)))
    # update to next iteration
    i <<- i + batch_size
    # create container arrays
    x_array <- array(0, dim = c(length(rows), length(x)))
    y_array <- array(0, dim = c(length(rows), length(y)))
    # fill the container
    x_array[1:length(rows), ] <- data[rows, x]
    y_array[1:length(rows), ] <- data[rows, y]
    # return the batch
    list(x_array, y_array)
  }
}

# set-up a generator
gen <- data_generator(
  data = data.matrix(data),
  x = 1:2, # it is flexible, you can use the column numbers,
  y = c("y", "z"), # or the column name
  batch_size = 32
)
With the generator set up, you can inspect the resulting arrays simply by calling it:
gen()
Or you could also test the generator using a simple Keras model:
# import keras
library(keras)

# set up a simple keras model
model <- keras_model_sequential() %>%
  layer_dense(32, input_shape = c(2)) %>%
  layer_dense(2)

model %>% compile(
  optimizer = "rmsprop",
  loss = "mse"
)

# fit using generator
model %>% fit_generator(
  generator = gen,
  steps_per_epoch = 100, # the generator resets automatically once it reaches the end of the data
  epochs = 10
)
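How the reset interacts with the batch size can be seen by isolating the row bookkeeping. The sketch below copies only the index logic of data_generator into a hypothetical helper, row_batcher, and uses the same numbers as the example (80 rows, batches of 32). Because the reset fires as soon as a full batch no longer fits, it suggests that the trailing rows are never served when the row count is not a multiple of the batch size.

```r
# Row bookkeeping only, same logic as in data_generator
row_batcher <- function(n_rows, batch_size) {
  i <- 1
  function() {
    # restart from the top when a full batch no longer fits
    if ((i + batch_size - 1) > n_rows) i <<- 1
    rows <- i:min(i + batch_size - 1, n_rows)
    i <<- i + batch_size
    rows
  }
}

next_rows <- row_batcher(n_rows = 80, batch_size = 32)
batches <- lapply(1:4, function(k) next_rows())

# first and last row of each of the four batches, one column per batch
ranges <- sapply(batches, range)
ranges

# rows that never appear in any batch
unseen <- setdiff(1:80, sort(unique(unlist(batches))))
unseen
```

If every row should be used, one option is to pick a batch size that divides the number of rows; another is to let the last batch be shorter before resetting.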
I have to admit that the process is a little complex and requires a fair amount of programming. For more, see the featured blog post by François Chollet himself, or the kerasgenerator package, which I develop myself.
Source: https://stackoverflow.com/questions/53357901/using-a-custom-r-generator-function-with-fit-generator-keras-r