Question
I'm following along with Google's post on object detection on a TPU and have hit a wall when it comes to training.
Looking at the job logs, I can see that ml-engine runs a ton of pip installs for various packages, provisions a TPU, and then submits the following:
Running command: python -m object_detection.model_tpu_main
--model_dir=gs://{MY_BUCKET}/train --tpu_zone us-central1
--pipeline_config_path=gs://{MY_BUCKET}/data/pipeline.config
--job-dir gs://{MY_BUCKET}/train
It then errors with:
message: "Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/root/.local/lib/python2.7/site-packages/object_detection/model_tpu_main.py", line 30, in <module>
from object_detection import model_lib
File "/root/.local/lib/python2.7/site-packages/object_detection/model_lib.py", line 26, in <module>
from object_detection import eval_util
File "/root/.local/lib/python2.7/site-packages/object_detection/eval_util.py", line 28, in <module>
from object_detection.metrics import coco_evaluation
File "/root/.local/lib/python2.7/site-packages/object_detection/metrics/coco_evaluation.py", line 20, in <module>
from object_detection.metrics import coco_tools
File "/root/.local/lib/python2.7/site-packages/object_detection/metrics/coco_tools.py", line 47, in <module>
from pycocotools import coco
File "/root/.local/lib/python2.7/site-packages/pycocotools/coco.py", line 49
import matplotlibnmatplotlib.use('Agg')nimport matplotlib.pyplot as plt
^
SyntaxError: invalid syntax
"
This is my first time using ml-engine and I'm stuck. I find it odd that the error references python2.7, as I submitted the job from my laptop in a python3.6 environment.
Any ideas on where to go from here or what to do?
Answer 1:
Based on the stack trace, three separate lines of code ended up on a single line (line 49 of coco.py). I believe I ran into the same problem recently while trying out the new TensorFlow Object Detection API. The cause was in models/research/object_detection/dataset_tools/create_pycocotools_package.sh, specifically the following line:
sed "s/import matplotlib\.pyplot as plt/import matplotlib\nmatplotlib\.use\(\'Agg\'\)\nimport matplotlib\.pyplot as plt/g" pycocotools/coco.py > coco.py.updated
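In my case sed did not treat \n in the replacement text as a newline. It wrote a literal n instead, which matches the matplotlibnmatplotlib text in your traceback. I solved it by using literal newlines, each preceded by a backslash, like the following: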
sed "s/import matplotlib\.pyplot as plt/import matplotlib\\
matplotlib\.use\(\'Agg\'\)\\
import matplotlib\.pyplot as plt/g" pycocotools/coco.py > coco.py.updated
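If sed is the sticking point, the same substitution can be done without it. The answer does not say which sed was in use, but a likely explanation is that BSD sed (the default on macOS) does not expand \n in replacement text, whereas GNU sed does. The following is a minimal sketch in Python, not part of the original script; the function names patch_matplotlib_import and patch_file are made up for illustration, and the skip-if-already-patched check is an addition the sed command does not have.

```python
# Hypothetical helper, not part of create_pycocotools_package.sh.
# It mirrors the sed substitution but does not depend on the sed dialect.

# The single import line that the original sed command targets.
ORIGINAL = "import matplotlib.pyplot as plt"

# Replacement: select the non-interactive Agg backend before pyplot is imported.
PATCHED = (
    "import matplotlib\n"
    "matplotlib.use('Agg')\n"
    "import matplotlib.pyplot as plt"
)


def patch_matplotlib_import(source):
    """Return source with matplotlib.use('Agg') inserted before the pyplot import."""
    if "matplotlib.use('Agg')" in source:
        # Already patched; returning early keeps the function safe to re-run.
        return source
    return source.replace(ORIGINAL, PATCHED)


def patch_file(src_path, dst_path):
    """Read src_path, apply the patch, and write the result to dst_path."""
    with open(src_path, encoding="utf-8") as f:
        source = f.read()
    with open(dst_path, "w", encoding="utf-8") as f:
        f.write(patch_matplotlib_import(source))
```

Calling patch_file("pycocotools/coco.py", "coco.py.updated") from the directory where the script runs sed would stand in for that one sed line; the rest of the script would stay as it is.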
Hope this helps.
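Before rebuilding the package and resubmitting the job, it may be worth checking locally that the patched file still parses, since the failure only shows up once the job starts. This is a small sketch of such a check; find_syntax_error is a hypothetical name, and the check uses the grammar of whichever Python interpreter runs it, which may differ from the Python 2.7 runtime on ml-engine.

```python
# Hypothetical pre-submission check; the function name is illustrative only.

def find_syntax_error(source, filename="coco.py"):
    """Return None if source parses, otherwise a short description of the error."""
    try:
        # compile() only parses the text; it does not execute it or import
        # matplotlib, so the check works without the dependencies installed.
        compile(source, filename, "exec")
    except SyntaxError as err:
        return "{}:{}: {}".format(filename, err.lineno, err.msg)
    return None


# The line from the traceback, where "\n" was written as a literal "n".
broken = "import matplotlibnmatplotlib.use('Agg')nimport matplotlib.pyplot as plt\n"

# What the substitution is meant to produce.
fixed = "import matplotlib\nmatplotlib.use('Agg')\nimport matplotlib.pyplot as plt\n"

print(find_syntax_error(broken))  # a description of the syntax error
print(find_syntax_error(fixed))   # None
```

To check a real file, read coco.py.updated into a string and pass it to find_syntax_error before the script continues with packaging.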
Source: https://stackoverflow.com/questions/51430391/tensorflow-object-detection-training-error-with-tpu