kedro

How to use tf.data.Dataset with kedro?

烈酒焚心 submitted on 2021-02-11 14:58:23
Question: I am using tf.data.Dataset to prepare a streaming dataset which is used to train a tf.keras model. With kedro, is there a way to create a node and return the created tf.data.Dataset to use it in the next training node? The MemoryDataset will probably not work, because a tf.data.Dataset cannot be pickled (deepcopy isn't possible), see also this SO question. According to issue #91, the deep copy in MemoryDataset is done to avoid the data being modified by some other node. Can someone please elaborate a…
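One workaround often suggested for non-copyable objects is to disable the deep copy for that catalog entry via the dataset's copy mode. A minimal sketch, assuming the Kedro 0.17-era `MemoryDataSet` API and a hypothetical catalog entry name `train_dataset`:

```python
from kedro.io import DataCatalog, MemoryDataSet

# Sketch: store the tf.data.Dataset by reference ("assign") instead of
# deep-copying it when it is handed from one node to the next.
# "train_dataset" is a hypothetical dataset name for illustration.
catalog = DataCatalog({
    "train_dataset": MemoryDataSet(copy_mode="assign"),
})
```

With `copy_mode="assign"` the object is shared rather than copied, so downstream nodes must not mutate it in place; that is generally safe for a tf.data.Dataset, since its transformations (`map`, `batch`, etc.) return new dataset objects instead of modifying the original.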

Kedro - how to pass nested parameters directly to node

旧巷老猫 submitted on 2021-02-08 03:41:21
Question: kedro recommends storing parameters in conf/base/parameters.yml. Let's assume it looks like this:

```yaml
step_size: 1
model_params:
    learning_rate: 0.01
    test_data_ratio: 0.2
    num_train_steps: 10000
```

And now imagine I have some data_engineering pipeline whose nodes.py has a function that looks something like this:

```python
def some_pipeline_step(num_train_steps):
    """ Takes the parameter `num_train_steps` as argument. """
    pass
```

How would I go about and pass that nested parameter straight to this function in data…
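One way to do this without changing the function signature is to take the whole `model_params` block as a node input and unpack it in a thin wrapper. A minimal sketch, assuming the standard `params:` prefix Kedro exposes for entries in parameters.yml (the node name here is hypothetical):

```python
from kedro.pipeline import Pipeline, node

def some_pipeline_step(num_train_steps):
    """ Takes the parameter `num_train_steps` as argument. """
    pass

# Sketch: "params:model_params" resolves to the nested dict from
# parameters.yml; the lambda unpacks the one key the function needs.
pipeline = Pipeline([
    node(
        lambda model_params: some_pipeline_step(model_params["num_train_steps"]),
        inputs="params:model_params",
        outputs=None,
        name="train_step",  # hypothetical node name
    ),
])
```

Later Kedro versions also accept dotted access such as `params:model_params.num_train_steps` directly as a node input, though whether that works depends on the Kedro version in use.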

How to run the nodes in sequence as declared in kedro pipeline?

落花浮王杯 submitted on 2019-12-12 17:10:44
Question: In a Kedro pipeline, nodes (something like Python functions) are declared sequentially. In some cases, the input of one node is the output of the previous node. However, sometimes, when the kedro run API is called on the command line, the nodes are not run sequentially. The Kedro documentation says that by default the nodes are run in sequence. My run.py code:

```python
def main(
    tags: Iterable[str] = None,
    env: str = None,
    runner: Type[AbstractRunner] = None,
    node_names: Iterable[str] = None,
    from_nodes:
```
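For context, the default SequentialRunner does run one node at a time, but the order comes from a topological sort of the data dependencies, not from the order in which nodes are declared. To force two otherwise-independent nodes to run in sequence, chain an output of the first into the input of the second. A minimal sketch with hypothetical node functions and dataset names:

```python
from kedro.pipeline import Pipeline, node

def first_step():
    # Return a token purely to create a dependency edge.
    return "done"

def second_step(first_done):
    # Receives first_step's output, so it is scheduled after it.
    return "result"

# The runner topologically sorts nodes on these inputs/outputs, so
# second_step always runs after first_step even though it is declared first.
pipeline = Pipeline([
    node(second_step, inputs="first_done", outputs="result"),
    node(first_step, inputs=None, outputs="first_done"),
])
```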