问题
I have a simple RegisterModel.py script that uses the Azure ML Service SDK to register a fastText .bin model. This completes successfully and I can see the model in the Azure Portal UI (I cannot see what model files are in it). I then want to download the model (DownloadModel.py) and use it (for testing purposes), however it throws an error on the model.download method (tarfile.ReadError: file could not be opened successfully) and makes a 0 byte rjtestmodel8.tar.gz file.
I then use the Azure Portal and Add Model and select the same bin model file and it uploads fine. Downloading it with the download.py script below works fine, so I am assuming something is not correct with the Register script.
Here are the 2 scripts and the stacktrace - let me know if you can see anything wrong:
RegisterModel.py
import azureml.core
from azureml.core import Workspace, Model
ws = Workspace.from_config()
model = Model.register(workspace=ws,
model_name='rjSDKmodel10',
model_path='riskModel.bin')
DownloadModel.py
# Works when downloading the UI Uploaded .bin file, but not the SDK registered .bin file
import os
import azureml.core
from azureml.core import Workspace, Model
ws = Workspace.from_config()
model = Model(workspace=ws, name='rjSDKmodel10')
model.download(target_dir=os.getcwd(), exist_ok=True)
Stacktrace
Traceback (most recent call last):
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\ptvsd_launcher.py", line 43, in <module>
main(ptvsdArgs)
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\lib\python\ptvsd\__main__.py", line 432, in main
run()
File "...\.vscode\extensions\ms-python.python-2019.9.34474\pythonFiles\lib\python\ptvsd\__main__.py", line 316, in run_file
runpy.run_path(target, run_name='__main__')
File "...\.conda\envs\DoC\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "...\.conda\envs\DoC\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "...\.conda\envs\DoC\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "...\\DownloadModel.py", line 21, in <module>
model.download(target_dir=os.getcwd(), exist_ok=True)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 712, in download
file_paths = self._download_model_files(sas_to_relative_download_path, target_dir, exist_ok)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 658, in _download_model_files
file_paths = self._handle_packed_model_file(tar_path, target_dir, exist_ok)
File "...\.conda\envs\DoC\lib\site-packages\azureml\core\model.py", line 670, in _handle_packed_model_file
with tarfile.open(tar_path) as tar:
File "...\.conda\envs\DoC\lib\tarfile.py", line 1578, in open
raise ReadError("file could not be opened successfully")
tarfile.ReadError: file could not be opened successfully
Environment
- riskModel.bin is 6 megs
- AMLS 1.0.60
- Python 3.7
- Working locally with Visual Code
回答1:
The Azure Machine Learning service SDK has a bug with how it interacts with Azure Storage, which causes it to upload corrupted files if it has to retry uploading.
A couple workarounds:
- The bug was introduced in 1.0.60 release. If you downgrade to AzureML-SDK 1.0.55, the code should fail when there are issue uploading instead of silently corrupting data.
- It's possible that the retry is being triggered by the low timeout values that the AzureML-SDK defaults to. You could investigate changing the timeout in
site-packages/azureml/_restclient/artifacts_client.py
This bug should be fixed in the next release of the AzureML-SDK.
来源:https://stackoverflow.com/questions/57854136/registering-and-downloading-a-fasttext-bin-model-fails-with-azure-machine-learn