Question
I am trying to use Google AutoML Natural Language to train a custom multi-label text classifier. Manually, I can create the dataset, import data from Google Cloud Storage, and train the model. However, I want to automate this entire process.
My current approach is as follows:
- I have created a Google Cloud Storage bucket to store the annotated data.
- I spin up a Cloud Function that creates the dataset, imports data into it, and trains the model.
However, importing data into the dataset takes far longer than 9 minutes, which is the maximum timeout for a Cloud Function, so the model-training stage is never reached.
One possible solution would be to spin up another Cloud Function, triggered when the data import completes, to start training the model. However, I checked the Google AutoML documentation and there seems to be no way to achieve that.
Is there any other way to achieve this?
Your help is much appreciated. Thank you.
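For reference, the three steps above can be sketched against the google-cloud-automl Python client. This is only a sketch: the project ID, CSV URI, and display names are placeholders, and exact call signatures vary between client-library versions.

```python
def dataset_spec(display_name):
    """Request body for a multi-label text classification dataset."""
    return {
        "display_name": display_name,
        "text_classification_dataset_metadata": {
            "classification_type": "MULTILABEL"
        },
    }


def run_pipeline(project_id, gcs_csv_uri):
    # Imported here so the sketch reads without the library installed;
    # requires the google-cloud-automl package and GCP credentials.
    from google.cloud import automl

    client = automl.AutoMlClient()
    parent = f"projects/{project_id}/locations/us-central1"

    # 1. Create the dataset. Depending on the client version this may
    #    itself be a long-running operation; resolve it if so.
    created = client.create_dataset(
        parent=parent, dataset=dataset_spec("my_dataset")
    )
    dataset = created.result() if hasattr(created, "result") else created

    # 2. Import data -- a long-running operation that can easily exceed
    #    the 9-minute Cloud Function timeout described above.
    import_op = client.import_data(
        name=dataset.name,
        input_config={"gcs_source": {"input_uris": [gcs_csv_uri]}},
    )
    import_op.result()  # blocks until the import finishes

    # 3. Train the model (also a long-running operation).
    client.create_model(
        parent=parent,
        model={
            "display_name": "my_model",
            "dataset_id": dataset.name.split("/")[-1],
            "text_classification_model_metadata": {},
        },
    )
```

It is the blocking `import_op.result()` in step 2 that cannot fit inside a Cloud Function's 9-minute limit.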
Answer 1:
Another serverless way to achieve this and run a long process is to use App Engine:
- Deploy an App Engine standard service with your code, exposed on an endpoint you can trigger.
- Set up a Cloud Scheduler job that calls your App Engine endpoint. Set the attemptDeadline to the maximum timeout you want (it can't exceed 24 hours).
- If your pipeline is instead fired by an event, plug a Cloud Function into the event and have it create a Cloud Task targeting the endpoint, with the dispatchDeadline set to the maximum timeout you want (again, it can't exceed 24 hours).
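A minimal sketch of the Cloud Task variant, assuming the google-cloud-tasks Python client; the queue, location, and endpoint path below are placeholders:

```python
def build_task(relative_uri, deadline_seconds):
    """Task body targeting an App Engine endpoint; the dict form is
    converted to a Task proto by the client library."""
    return {
        "app_engine_http_request": {
            "http_method": "POST",
            "relative_uri": relative_uri,
        },
        # How long Cloud Tasks waits for the handler before retrying.
        "dispatch_deadline": {"seconds": deadline_seconds},
    }


def enqueue_pipeline_task(project, location, queue):
    # Requires the google-cloud-tasks package and GCP credentials.
    from google.cloud import tasks_v2

    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)
    # 24 hours -- the upper bound mentioned in the answer.
    return client.create_task(
        parent=parent, task=build_task("/start-pipeline", 24 * 3600)
    )
```

The App Engine handler at /start-pipeline would then run the create/import/train sequence synchronously, free of the Cloud Function timeout.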
Source: https://stackoverflow.com/questions/57910005/automating-the-google-cloud-automl-pipeline