Question
I am trying to submit a Google Cloud ML Engine training job for a TensorFlow Object Detection task, following the official guide.
This is the job I am submitting:
export CONFIG=trainer/cloud.yaml
export TRAIN_DIR=kt-1000/training
export PIPELINE_CONFIG=kt-1000/training/ssd_mobilenet_v1_pets.config
gcloud ml-engine jobs submit training object_detection_`date +%s` \
--job-dir=gs://${TRAIN_DIR} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
--module-name object_detection.train \
--region asia-east1-a \
--config ${CONFIG} \
-- \
--train_dir=gs://${TRAIN_DIR} \
--pipeline_config_path=gs://${PIPELINE_CONFIG}
I am getting the following error message:
ERROR: (gcloud.ml-engine.jobs.submit.training) unrecognized arguments:
However, the error message does not say which argument(s) are unrecognized.
Any help on this would be truly appreciated.
Thanks,
Devjothi
Answer 1:
You just have to remove any space before --, like this:
gcloud ml-engine jobs submit training $JOB_NAME \
--job-dir $OUTPUT_PATH \
--runtime-version 1.10 \
--python-version 3.5 \
--module-name trainer.task \
--package-path trainer/ \
--region $REGION \
-- \
--train-files $TRAIN_DATA \
--eval-files $EVAL_DATA \
--train-steps 1000 \
--eval-steps 100 \
--verbosity DEBUG
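One way a stray space can cause this, offered as a sketch and not as a confirmed diagnosis of the question's command: in a POSIX shell, a backslash continues the line only when it is the very last character on that line. If a space follows it, the shell passes that lone space as an argument and ends the command there, which would be consistent with an "unrecognized arguments:" message that appears to list nothing. count_args below is a hypothetical helper, not part of gcloud.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

# Backslash is the last character on the line: both lines form ONE command.
joined=$(eval "$(printf 'count_args a \\\ncount_args b c d')")

# Backslash followed by a space: the space becomes an argument of its own,
# the first command ends at the newline, and the second line runs separately.
split=$(eval "$(printf 'count_args a \\ \ncount_args b c d')")

echo "joined: $joined"
echo "split:  $(echo $split)"
```

Here joined is 5 (a single command with five arguments), while split holds 2 and 3 (two separate commands, the first of which received an invisible space as its second argument).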
Answer 2:
I ran into the same problem with the official guide on my Windows machine.
1. Check which arguments are reported as unrecognized.
2. Note that between --config and --train_dir there is a bare -- (that was where my error came from).
PS: Windows has no date +%s, so I replaced it with my own JOB_NAME.
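To check what the shell actually hands to the command, one option is to swap gcloud for a small function that prints each argument in brackets, so empty or whitespace-only arguments become visible. This is a minimal sketch for a POSIX shell; show_args is a hypothetical helper, and the paths are taken from the question.

```shell
# show_args is a hypothetical stand-in for gcloud: it prints every argument
# it receives, numbered and wrapped in brackets so stray whitespace shows up.
show_args() {
  i=1
  for arg in "$@"; do
    printf '%d: [%s]\n' "$i" "$arg"
    i=$((i + 1))
  done
}

# Same flag layout as the tail of the question's command.
out=$(show_args --config trainer/cloud.yaml \
  -- \
  --train_dir=gs://kt-1000/training)

printf '%s\n' "$out"
```

The bare -- should appear as its own bracketed entry; anything printed as [ ] or [] points to a stray space or an unset variable.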
Answer 3:
Try without the job-dir. You don't need to specify job-dir; ML Engine will pass in a job-dir when it invokes your job.
Answer 4:
In my case, I added the following:
import gcsfs
After that, the $ variables were recognized.
Source: https://stackoverflow.com/questions/47000556/error-message-while-submitting-google-cloud-ml-training-job-for-tensorflow-objec