google-cloud-ml

How to log from a custom AI Platform model

Submitted by 喜你入骨 on 2020-04-06 14:02:24

Question: I recently deployed a custom model to Google Cloud's AI Platform, and I am trying to debug some parts of my preprocessing logic. However, my print statements are not being logged to the Stackdriver output. I have also tried using the logging client imported from google.cloud, to no avail. Here is my custom prediction file: import os import pickle import numpy as np from sklearn.datasets import load_iris import tensorflow as tf from google.cloud import logging class MyPredictor(object): def _
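A common workaround is to attach a standard-library logging handler that writes to stdout, which the serving container forwards to Stackdriver (google.cloud.logging's `Client().setup_logging()` is an alternative when that library is available). A minimal runnable sketch — the doubling "model" is a stand-in for the real scikit-learn model, which is not shown in the truncated question:

```python
import logging
import sys

# Send log records to stdout so the serving container can forward them
# to Stackdriver; bare print() calls are not always surfaced.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
logger = logging.getLogger("my_predictor")

class MyPredictor(object):
    def __init__(self, model):
        self._model = model

    def predict(self, instances, **kwargs):
        # This line shows up in the job's Stackdriver output.
        logger.info("preprocessing %d instances", len(instances))
        return [self._model(x) for x in instances]

# Stand-in model so the sketch runs without scikit-learn:
predictor = MyPredictor(lambda x: x * 2)
print(predictor.predict([1, 2, 3]))  # → [2, 4, 6]
```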

How to set an environment variable when using gcloud ML engine?

Submitted by 亡梦爱人 on 2020-03-04 21:21:19

Question: All, (environment: Windows 7, Python 3.6, Keras and TensorFlow libraries, gcloud ML Engine) I am running certain Keras ML model examples with gcloud ML Engine, as introduced here. Everything was fine, but I got varying results across multiple runs even though I was using the same training and validation data. My goal is to make the training results reproducible across multiple runs. I googled for a while and found some solutions in this Keras Q&A on making results reproducible. Basically they
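Since the gcloud job-submission command of that era had no direct flag for environment variables, the usual workaround is to set them (and all random seeds) at the very top of the trainer's entry module, before anything else is imported. A sketch under that assumption, with the NumPy/TensorFlow calls left as comments because versions differ:

```python
import os
import random

# Setting PYTHONHASHSEED inside the entry module is the closest
# gcloud-friendly substitute for a real environment variable
# (assumption: no custom container is available).
os.environ["PYTHONHASHSEED"] = "0"

random.seed(42)
# import numpy as np; np.random.seed(42)            # once NumPy is imported
# import tensorflow as tf; tf.set_random_seed(42)   # TF 1.x graph-level seed

# Same seed, same sequence:
first = random.random()
random.seed(42)
assert first == random.random()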

Unable to specify master_type in MLEngineTrainingOperator

Submitted by 徘徊边缘 on 2020-03-02 09:37:36

Question: I am using Airflow to schedule a pipeline that trains a scikit-learn model on AI Platform. I use this DAG to train it: with models.DAG(JOB_NAME, schedule_interval=None, default_args=default_args) as dag: # Tasks definition training_op = MLEngineTrainingOperator( task_id='submit_job_for_training', project_id=PROJECT, job_id=job_id, package_uris=[os.path.join(TRAINER_BIN)], training_python_module=TRAINER_MODULE, runtime_version=RUNTIME_VERSION, region='europe-west1', training
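If the installed Airflow version's MLEngineTrainingOperator does not expose a master_type argument, one workaround is to build the training request body yourself and submit it through a hook or PythonOperator. The field names below follow the ML Engine jobs API ("masterType" only takes effect with scaleTier CUSTOM); the bucket path and machine type are placeholders:

```python
def build_training_input(package_uri, module, master_type="n1-highmem-8"):
    # trainingInput body as the ML Engine jobs API expects it.
    # "masterType" is honored only when scaleTier is CUSTOM.
    return {
        "scaleTier": "CUSTOM",
        "masterType": master_type,
        "packageUris": [package_uri],
        "pythonModule": module,
        "region": "europe-west1",
        "runtimeVersion": "1.15",
    }

spec = build_training_input("gs://my-bucket/trainer-0.1.tar.gz", "trainer.task")
```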

How to package vocabulary file for Cloud ML Engine

Submitted by 限于喜欢 on 2020-02-25 04:47:50

Question: I have a .txt file that contains a different label on each line. I use this file to create a label-index lookup table, for example: label_index = tf.contrib.lookup.index_table_from_file(vocabulary_file = 'labels.txt'. I am wondering how I should package the vocabulary file with my Cloud ML Engine job. The packaging suggestions are explicit about how to set up the .py files, but I am not entirely sure where I should put relevant .txt files. Should they just be hosted in a storage bucket (i.e. gs://)
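Both approaches work in practice: upload labels.txt to GCS and pass the gs:// URI directly (TensorFlow's file-system layer reads GCS paths natively), or bundle the file into the trainer package via setup.py's package_data. A sketch of both, with the bucket and package names as assumptions:

```python
# Option 1: reference the file in GCS directly — index_table_from_file
# accepts gs:// paths because TF's file system layer handles them.
VOCAB_URI = "gs://my-bucket/labels.txt"  # hypothetical bucket

# Option 2: ship labels.txt inside the trainer package. The setup.py for
# the package would contain something like this (shown as a string here):
SETUP_PY = '''
from setuptools import find_packages, setup

setup(
    name="trainer",
    version="0.1",
    packages=find_packages(),
    include_package_data=True,
    package_data={"trainer": ["labels.txt"]},
)
'''
```

With option 2 the file is resolved relative to the installed trainer module at runtime; with option 1 no packaging change is needed at all.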

How to convert .ckpt to .pb?

Submitted by 送分小仙女□ on 2020-02-24 05:49:10

Question: I am new to deep learning and I want to use a pretrained (EAST) model to serve from AI Platform Serving. The developer made these files available: model.ckpt-49491.data-00000-of-00001, checkpoint, model.ckpt-49491.index, model.ckpt-49491.meta. I want to convert it into the TensorFlow .pb format. Is there a way to do it? I have taken the model from here. The full code is available here. I have looked it up here, and it shows the following code to convert it: From tensorflow/models
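A common route is to import the .meta graph, restore the variables from the checkpoint, and write a SavedModel (whose saved_model.pb is what AI Platform Serving expects). This requires TensorFlow 1.x; the EAST input/output tensor names below are assumptions taken from the repository's evaluation code, so verify them against your graph:

```python
def ckpt_prefix(meta_path):
    # "model.ckpt-49491.meta" -> "model.ckpt-49491" (what Saver.restore wants)
    return meta_path.rsplit(".meta", 1)[0]

def export_saved_model(meta_path, export_dir):
    # Requires TensorFlow 1.x. Restores the checkpoint and writes a
    # SavedModel; the tensor names are assumptions for the EAST graph.
    import tensorflow as tf
    graph = tf.Graph()
    with graph.as_default():
        saver = tf.compat.v1.train.import_meta_graph(meta_path)
        with tf.compat.v1.Session() as sess:
            saver.restore(sess, ckpt_prefix(meta_path))
            tf.compat.v1.saved_model.simple_save(
                sess, export_dir,
                inputs={"images": graph.get_tensor_by_name("input_images:0")},
                outputs={
                    "scores": graph.get_tensor_by_name(
                        "feature_fusion/Conv_7/Sigmoid:0"),
                    "geometry": graph.get_tensor_by_name(
                        "feature_fusion/concat_3:0"),
                },
            )
```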

TensorFlow on ML Engine: The replica master 0 exited with a non-zero status of 1

Submitted by 馋奶兔 on 2020-02-23 04:07:13

Question: I launch a TensorFlow task on ML Engine, and after about two minutes I keep getting the error message "The replica master 0 exited with a non-zero status of 1." (The task incidentally runs fine with ml-engine local.) Is there any place or log file where I can see further information on what happened? The logs viewer gives only the following: { insertId: "ibal72g1rxhr63" logName: "projects/**-***-ml/logs/ml.googleapis.com%2Fcnn180322_170649" receiveTimestamp: "2018-03-22T17:08:38
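Beyond the Stackdriver logs viewer, the job's full output — including the Python traceback that usually precedes the non-zero exit — can be pulled from the command line; the job id below is the one visible in the logName above:

```shell
# Stream the job's logs as they arrive (same data Stackdriver shows):
gcloud ml-engine jobs stream-logs cnn180322_170649

# Or query everything the job logged after the fact:
gcloud logging read \
  'resource.type="ml_job" AND resource.labels.job_id="cnn180322_170649"' \
  --limit 100
```

The traceback for the exiting replica typically sits a few entries above the "non-zero status" message.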

GCP ML Engine Prediction failed: Error processing input: Expected float32 got base64

Submitted by 不打扰是莪最后的温柔 on 2020-02-06 04:25:26

Question: I am trying to call a prediction on a custom-trained TensorFlow model deployed to GCP ML Engine. When I call a prediction on the model, it returns the error message "Expected float32 got base64". I used transfer learning and TensorFlow's retrain.py script to train my model on my images, following the official documentation: python retrain.py --image_dir ~/training_images saved_model_dir /saved_model_directory. I have tested the prediction locally using TensorFlow
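The usual cause is that the exported serving signature takes a float32 tensor while the online-prediction JSON sends a {"b64": ...} payload; the service only base64-decodes inputs whose alias ends in "_bytes". One fix is to re-export the model with a string placeholder and decode inside the graph. A sketch (TF 1.x; the alias names are assumptions), plus a pure-Python helper for the matching request body:

```python
def make_request(b64_jpeg):
    # JSON body for online prediction; the {"b64": ...} wrapper plus the
    # "_bytes" suffix tell the service to base64-decode before feeding.
    return {"instances": [{"image_bytes": {"b64": b64_jpeg}}]}

def serving_input_fn():
    # Requires TensorFlow 1.x: accept base64-encoded JPEGs and decode to
    # float32 inside the graph, so the client never sends raw floats.
    import tensorflow as tf
    image_bytes = tf.compat.v1.placeholder(tf.string, shape=[None])

    def _decode(b):
        img = tf.image.decode_jpeg(b, channels=3)
        return tf.image.convert_image_dtype(img, tf.float32)

    images = tf.map_fn(_decode, image_bytes, dtype=tf.float32)
    return tf.estimator.export.ServingInputReceiver(
        {"image": images}, {"image_bytes": image_bytes})
```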

How to get 'keys' in batch predictions with ml-engine using a custom model?

Submitted by 霸气de小男生 on 2020-01-16 17:04:08

Question: I have been working on deploying a custom estimator (TensorFlow model). Training on ml-engine works fine, but when I use ml-engine predictions in batch mode I cannot get the key (or any id of the original input). As you know, batch prediction runs in distributed mode, and "keys" help you tell which predictions correspond to which inputs. I found this post, which solves the problem but uses a pre-made (canned) TensorFlow model (the census use case). How can I adapt my custom model (tf.contrib
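With a custom estimator the key can be carried through by popping it from the features and re-attaching it to the predictions dict; TF 1.x also offered tf.contrib.estimator.forward_features(estimator, "key") as a one-liner, provided "key" appears in the serving input (both assume your graph names the id column "key"). The dictionary plumbing itself is plain Python:

```python
def attach_key(features, predictions, key_name="key"):
    # Copy the id column from the input features into the predictions dict
    # so each row of the batch-prediction output can be matched back.
    out = dict(predictions)
    out[key_name] = features[key_name]
    return out

# Inside a custom model_fn (sketch; my_model_fn is your existing model_fn):
#   key = features.pop("key")
#   spec = my_model_fn(features, labels, mode, params)
#   if mode == tf.estimator.ModeKeys.PREDICT:
#       spec = spec._replace(
#           predictions=attach_key({"key": key}, spec.predictions))
```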