ValueError: non-string names in Numpy dtype unpickling only on AWS Lambda

Posted by 半世苍凉 on 2020-12-07 04:48:29

Question


I am using pickle to save my trained ML model. For training, I use the scikit-learn library to build a RandomForestClassifier:

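import pickle

from sklearn.ensemble import RandomForestClassifier

# X and y are the training features and labels, prepared earlier
rf = RandomForestClassifier(n_estimators=100, max_depth=20,
                            min_samples_split=2, max_features='auto',
                            oob_score=True, random_state=123456)
rf.fit(X, y)

# Protocol 2 is the highest pickle protocol available in Python 2.7
with open('model.pckl', 'wb') as fp:
    pickle.dump(rf, fp, protocol=2)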

I uploaded this model to S3 and fetch it in AWS Lambda using the boto3 library:

import pickle
import uuid

import boto3

s3_client = boto3.client('s3')

bucket = 'mlbucket'
key = 'model.pckl'
# Lambda only allows writes under /tmp; the UUID prefix avoids name collisions
download_path = '/tmp/{}{}'.format(uuid.uuid4(), key)
s3_client.download_file(bucket, key, download_path)

with open(download_path, 'rb') as f:
    model = pickle.load(f)

However, the line model = pickle.load(f) fails with ValueError: non-string names in Numpy dtype unpickling.

Here's the log:

START RequestId: 3d8a1263-1e3c-11e8-8bdb-03c0ef524c0e Version: $LATEST
non-string names in Numpy dtype unpickling: ValueError
Traceback (most recent call last):
File "/var/task/function.py", line 31, in handler
  model = pickle.load(f)
File "/usr/lib64/python2.7/pickle.py", line 1384, in load
  return Unpickler(file).load()
File "/usr/lib64/python2.7/pickle.py", line 864, in load
  dispatch[key](self)
File "/usr/lib64/python2.7/pickle.py", line 1223, in load_build
  setstate(state)
ValueError: non-string names in Numpy dtype unpickling

I am using Python 2.7 both on my local machine and on AWS Lambda. Oddly, pickle.load() works fine on my local machine. This is the code I used to test it locally:

with open('/home/Documents/model.pckl', 'rb') as f:
    rf = pickle.load(f)

Answer 1:


The problem turned out to be a library version mismatch.

The libraries I zipped and uploaded to AWS Lambda (numpy, scipy, etc.) were the latest versions, whereas the libraries on my local machine were older. As soon as I updated the libraries on my local machine, rebuilt the pickle files, and re-uploaded them to S3, the Lambda function started working.

So when pickling objects, it is not only the Python version that matters: the library versions must also match between the environment that writes the pickle and the one that reads it.
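
One way to catch this kind of mismatch earlier is to store the library versions next to the pickled model and compare them before unpickling. The following is only a minimal sketch of that idea, not something from the original answer: the helper names collect_versions, dump_with_versions and load_with_version_check are hypothetical, and it assumes each library exposes a __version__ attribute (numpy, scipy and scikit-learn do). The model is stored as nested bytes so that the outer envelope, which contains only built-in types, can still be read when the libraries differ.

```python
import importlib
import pickle
import sys


def collect_versions(module_names):
    """Return {name: version} for the Python runtime and each listed module."""
    versions = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in module_names:
        module = importlib.import_module(name)
        # Assumption: the library exposes __version__; fall back otherwise.
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


def dump_with_versions(obj, path, module_names):
    """Pickle obj inside an envelope that records the current versions."""
    payload = {
        "versions": collect_versions(module_names),
        # Nested bytes: the envelope itself holds only built-in types.
        "object": pickle.dumps(obj, protocol=2),
    }
    with open(path, "wb") as fp:
        pickle.dump(payload, fp, protocol=2)


def load_with_version_check(path, module_names):
    """Unpickle the object only if the recorded versions match the current ones."""
    with open(path, "rb") as fp:
        payload = pickle.load(fp)
    saved = payload["versions"]
    current = collect_versions(module_names)
    mismatches = {
        name: (saved.get(name), current[name])
        for name in current
        if saved.get(name) != current[name]
    }
    if mismatches:
        # Each value is a (saved, current) pair.
        raise RuntimeError("version mismatch (saved, current): {}".format(mismatches))
    return pickle.loads(payload["object"])
```

With the model from the question, this would be called as dump_with_versions(rf, 'model.pckl', ['numpy', 'scipy', 'sklearn']) on the training machine and load_with_version_check(download_path, ['numpy', 'scipy', 'sklearn']) inside the Lambda handler. A mismatch then surfaces as an error naming the differing versions rather than as a ValueError from deep inside the unpickler.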



Source: https://stackoverflow.com/questions/49075045/valueerror-non-string-names-in-numpy-dtype-unpickling-only-on-aws-lambda
