GCP: run a model prediction every day

Submitted by 荒凉一梦 on 2021-01-06 07:33:28

Question


I have a .py file containing all the instructions to generate predictions for some data. The data are read from BigQuery, and the predictions should be inserted into another BigQuery table. Right now the code runs in an AI Platform Notebook, but I want to schedule its execution every day. Is there any way to do it?

I ran into AI Platform Jobs, but I can't understand what my code should do or how it should be structured. Is there a step-by-step guide to follow?


Answer 1:


You can schedule a Notebook execution using different options:

  • nbconvert-based tools — different variants of the same technology:

    • nbconvert: provides a convenient way to execute the input cells of an .ipynb notebook file and save the results, both input and output cells, as a .ipynb file.
    • papermill: a Python package for parameterizing and executing Jupyter notebooks (uses nbconvert --execute under the hood).
    • notebook executor: a tool that can be used to schedule the execution of Jupyter notebooks from anywhere (local, GCE, GCP Notebooks) on a Cloud AI Deep Learning VM (uses the gcloud SDK and papermill under the hood). You can read more about the usage of this tool here.
  • Kubeflow Fairing: a Python package that makes it easy to train and deploy ML models on Kubeflow. It can also be extended to train or deploy on other platforms; currently, it has been extended to train on Google AI Platform.

  • AI Platform Notebook Scheduler: the Scheduler extension has two core functions. First, it can submit a notebook to run on AI Platform's Machine Learning Engine as a training job with a custom container image; this lets you experiment and write your training code in a cost-effective single-VM environment, but scale out to an AI Platform job to take advantage of superior resources (i.e. GPUs, TPUs, etc.). Second, it can schedule a notebook for recurring runs, which follows the exact same sequence of steps but requires a crontab-formatted schedule option.

  • Nova Plugin: the predecessor of the Notebook Scheduler project. It allows you to execute notebooks directly from your Jupyter UI.

  • Notebook training: a Python package that allows users to run a Jupyter notebook on Google Cloud AI Platform Training Jobs.

  • GCP runner: allows running any Jupyter notebook function on Google Cloud Platform. Unlike the other solutions listed above, it can run training for a whole project, not just a single Python file or notebook. It can run any function with parameters; moving from local execution to the cloud is just a matter of wrapping the function in a gcp_runner.run_cloud(<function_name>, …) call. The project is production-ready without any modifications and supports execution locally (for testing purposes), on AI Platform, and on Kubernetes. A full end-to-end example can be found here: https://www.github.com/vlasenkoalexey/criteo_nbdev

  • tensorflow_cloud (Keras for GCP): provides APIs that make it easy to go from debugging and training your Keras and TensorFlow code in a local environment to distributed training in the cloud.
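For the papermill route, for example, the daily schedule itself can be a plain crontab entry on the notebook VM. This is a hypothetical sketch, not from the original answer: the file paths, the run_date parameter, and the 06:00 schedule are all placeholders.

```
# Run the prediction notebook every day at 06:00 UTC, passing today's date
# as a notebook parameter (% must be escaped as \% inside crontab).
0 6 * * * papermill /home/jupyter/predict.ipynb /home/jupyter/out/predict.ipynb -p run_date "$(date +\%F)"
```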

The recommended option in GCP is the Notebook Scheduler, which is already available in the Early Access Program (EAP).
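Whichever scheduler is chosen, the .py file mentioned in the question typically follows a read–predict–write pattern against BigQuery. A minimal sketch, assuming the google-cloud-bigquery client library; the table names, the created_at column, and the build_query / run / predict_fn names are hypothetical, not from the original answer:

```python
from datetime import date


def build_query(source_table: str, run_date: date) -> str:
    """Build the SQL selecting the rows to score for a given day (schema is assumed)."""
    return (
        f"SELECT * FROM `{source_table}` "
        f"WHERE DATE(created_at) = '{run_date.isoformat()}'"
    )


def run(source_table: str, dest_table: str, predict_fn) -> None:
    """Read today's rows from BigQuery, score them, and write predictions back."""
    # Imported here so the sketch stays importable without GCP libraries installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    df = client.query(build_query(source_table, date.today())).to_dataframe()
    df["prediction"] = predict_fn(df)
    client.load_table_from_dataframe(df, dest_table).result()
```

Scheduled daily by any of the options above, run(...) would then be called once per day with the model's predict function.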



Source: https://stackoverflow.com/questions/62642407/gcp-run-a-prediction-of-a-model-every-day
