Question
In a Jupyter notebook with TensorFlow 2.0.0, an 80-10-10 train-validation-test split was performed like this:
import tensorflow_datasets as tfds
from os import getcwd
splits = tfds.Split.ALL.subsplit(weighted=(80, 10, 10))
filePath = f"{getcwd()}/../tmp2/"
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True, split=splits, data_dir=filePath)
However, when I try to run the same code locally, I get this error:
AttributeError: type object 'Split' has no attribute 'ALL'
I have seen that I can create two sets this way:
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True, split=['train[:80]','test[80:90]'], data_dir=filePath)
but I do not know how I can add a third set.
Answer 1:
tfds.Split.ALL.subsplit and tfds.Split.TRAIN.subsplit are apparently deprecated and no longer supported.
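Some datasets already ship with separate train and test splits. In that case, slice each predefined split by percentage and join the pieces with +, so that every example lands in exactly one subset (using Fashion-MNIST as an example). Note that the % sign is required (without it the bounds are absolute example counts) and that a slice applies only to the split name directly in front of it:
splits, info = tfds.load('fashion_mnist', with_info=True, as_supervised=True,
                         split=['train[:80%]+test[:80%]',
                                'train[80%:90%]+test[80%:90%]',
                                'train[90%:]+test[90%:]'],
                         data_dir=filePath)
(train_examples, validation_examples, test_examples) = splits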
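If the ratios change often, the percentage-based split strings accepted by tfds.load can be generated instead of typed by hand. The following is only a minimal sketch: build_split_specs is a hypothetical helper name (not part of tensorflow_datasets), and it assumes integer percentages that sum to 100.

```python
from itertools import accumulate


def build_split_specs(base_splits, weights):
    """Return one TFDS-style split string per weight.

    base_splits: names of the dataset's predefined splits, e.g. ["train", "test"].
    weights: integer percentages that must sum to 100, e.g. (80, 10, 10).
    NOTE: hypothetical helper for illustration; not a tensorflow_datasets API.
    """
    if sum(weights) != 100:
        raise ValueError("weights must sum to 100")
    # Cumulative boundaries, e.g. (80, 10, 10) -> [0, 80, 90, 100]
    bounds = [0] + list(accumulate(weights))
    specs = []
    for start, stop in zip(bounds[:-1], bounds[1:]):
        # Leave the bound empty at either end so the slice is open-ended there
        lo = "" if start == 0 else f"{start}%"
        hi = "" if stop == 100 else f"{stop}%"
        # Apply the same percentage window to every predefined split
        specs.append("+".join(f"{name}[{lo}:{hi}]" for name in base_splits))
    return specs


specs = build_split_specs(["train", "test"], (80, 10, 10))
print(specs)
```

The resulting list can be passed as the split argument of tfds.load. For Fashion-MNIST (60,000 train and 10,000 test examples), an 80-10-10 split built this way should yield 56,000, 7,000 and 7,000 examples.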
Source: https://stackoverflow.com/questions/64451516/how-to-split-a-tensorflow-dataset-into-train-test-and-validation-in-a-python-sc