Question
I'm trying to use Cloud AI Platform for training (gcloud ai-platform jobs submit training). I created my bucket and am sure the training file is there (gsutil ls gs://sat3_0_bucket/data/train_input.csv).
However, my job is failing with this log message:
File "/root/.local/lib/python3.7/site-packages/ktrain/text/data.py", line 175, in texts_from_csv
with open(train_filepath, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'gs://sat3_0_bucket/data/train_input.csv'
Am I missing something?
Answer 1:
The error is probably happening because ktrain tries to auto-detect the character encoding using open(train_filepath, 'rb'), and Python's builtin open cannot read gs:// paths. One solution is to explicitly pass the encoding argument to texts_from_csv so this step is skipped (the default is None, which means auto-detect).
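The failure can be reproduced without ktrain: the builtin open() treats a gs:// URL as an ordinary local path. Below is a minimal sketch of the encoding workaround; the column names and the utf-8 encoding are assumptions for illustration, not details from the original question.

```python
# Reproduce the root cause: open() does not understand gs:// URLs and
# raises FileNotFoundError even though the object exists in the bucket.
try:
    open("gs://sat3_0_bucket/data/train_input.csv", "rb")
    raised = False
except FileNotFoundError:
    raised = True

# Hypothetical sketch: pass encoding explicitly so ktrain skips the
# open()-based auto-detection step and goes straight to loading the CSV.
def load_with_explicit_encoding():
    from ktrain import text
    return text.texts_from_csv(
        "gs://sat3_0_bucket/data/train_input.csv",
        text_column="text",        # assumed column name
        label_columns=["label"],   # assumed column name
        encoding="utf-8",          # explicit encoding skips auto-detect
    )
```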
Alternatively, you can read the data in yourself as a pandas DataFrame. Pandas supports GCS paths directly (via the gcsfs package), so you can simply do: df = pd.read_csv('gs://bucket/your_path.csv')
Then, using ktrain, you can call ktrain.text.texts_from_df (or ktrain.text.texts_from_array) to load and preprocess your data.
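A sketch of the DataFrame route under the same assumptions (placeholder column names, gcsfs installed for the gs:// read):

```python
# Hypothetical sketch: let pandas handle the GCS read, then hand the
# in-memory DataFrame to ktrain, bypassing file-path handling entirely.
def load_via_dataframe():
    import pandas as pd
    from ktrain import text

    # pandas delegates gs:// URLs to gcsfs, so no explicit download step
    df = pd.read_csv("gs://sat3_0_bucket/data/train_input.csv")
    return text.texts_from_df(
        df,
        text_column="text",        # assumed column name
        label_columns=["label"],   # assumed column name
    )
```

This also sidesteps the encoding auto-detection, since ktrain never touches the gs:// path itself.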
Source: https://stackoverflow.com/questions/62460368/cloud-ai-platform-training-fails-to-read-from-bucket