gcp-ai-platform-training

Unexpected error when loading the model: problem in predictor - ModuleNotFoundError: No module named 'torchvision'

一世执手 submitted on 2021-02-16 21:30:45
Question: I've been trying to deploy my model to AI Platform Prediction through the console on my VM instance, but I get this error: "(gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: "Failed to load model: Unexpected error when loading the model: problem in predictor - ModuleNotFoundError: No module named 'torchvision' (Error code: 0)"". I need to include both torch and torchvision. I followed the steps in this question: Cannot deploy trained
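A minimal sketch of one way to narrow down an error like this: list which required packages cannot be found in the current Python environment. The helper name `missing_modules` is hypothetical and is not part of AI Platform or the original thread. It only reports what is importable where it runs, so it would need to run inside the predictor environment to be informative.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    # find_spec returns None when no installed module matches the name.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The two packages the question says the model needs.
    print(missing_modules(["torch", "torchvision"]))
```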

Cloud AI Platform Training Fails to Read from Bucket

自闭症网瘾萝莉.ら submitted on 2021-01-29 20:22:14
Question: I'm trying to use Cloud AI Platform for training (gcloud ai-platform jobs submit training). I created my bucket and am sure the training file is there (gsutil ls gs://sat3_0_bucket/data/train_input.csv). However, my job is failing with this log message:
File "/root/.local/lib/python3.7/site-packages/ktrain/text/data.py", line 175, in texts_from_csv
with open(train_filepath, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'gs://sat3_0_bucket/data/train_input.csv'
Am I missing
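The traceback shows Python's built-in open() being handed a gs:// path, which it treats as an ordinary local filename. A minimal sketch of one possible workaround, assuming the google-cloud-storage client library is installed and the job has read access to the bucket: copy the object to local disk first, then pass the local path to the library. The helper names `split_gcs_uri` and `download_to_local` are hypothetical.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/file' into ('bucket', 'path/to/file')."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def download_to_local(uri, local_path):
    """Copy a Cloud Storage object to local disk so open() can read it."""
    # Imported here so split_gcs_uri works without the client library installed.
    from google.cloud import storage

    bucket_name, blob_name = split_gcs_uri(uri)
    client = storage.Client()
    client.bucket(bucket_name).blob(blob_name).download_to_filename(local_path)
    return local_path
```

For example, calling download_to_local("gs://sat3_0_bucket/data/train_input.csv", "/tmp/train_input.csv") and passing "/tmp/train_input.csv" on to the training code would avoid opening the gs:// path directly.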

How to start AI-Platform jobs automatically?

和自甴很熟 submitted on 2021-01-01 09:38:26
Question: I created a training job where I fetch my data from BigQuery, perform training, and deploy the model. I would like to start training automatically in these two cases: (1) more than 1000 new rows are added to the dataset; (2) on a schedule (e.g., once a week). I checked GCP Cloud Scheduler, but it seems it's not suitable for my case.
Answer 1: Cloud Scheduler is the right tool to trigger your training on a schedule. I don't know what your blocker is! For your first point, you can't. You can't put a trigger (on
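For the row-count case, one possible workaround (an assumption, not something stated in the answer) is to let a scheduled job poll the table's row count and compare it with the count recorded at the last training run. A minimal sketch of that decision logic follows. The function name `should_retrain` is hypothetical, and obtaining and storing the two counts is left to the caller.

```python
def should_retrain(current_rows, rows_at_last_training, threshold=1000):
    """Return True when more than `threshold` rows were added since the last run."""
    new_rows = current_rows - rows_at_last_training
    # Strictly greater than, matching "more than 1000 new rows" in the question.
    return new_rows > threshold
```

A periodic job could call this with the latest count and submit the training job only when it returns True.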

Cannot deploy trained model to Google Cloud Ai-Platform with custom prediction routine: Model requires more memory than allowed

匆匆过客 submitted on 2020-07-05 04:44:06
Question: I am trying to deploy a pretrained PyTorch model to AI Platform with a custom prediction routine. After following the instructions described here, the deployment fails with the following error: ERROR: (gcloud.beta.ai-platform.versions.create) Create Version failed. Bad model detected with error: Model requires more memory than allowed. Please try to decrease the model size and re-deploy. If you continue to have error, please contact Cloud ML. The contents of the model folder are 83.89 MB large