ValueError: non-string names in Numpy dtype unpickling only on AWS Lambda

丶灬走出姿态 submitted on 2020-12-07 04:46:13


I am using pickle to save my trained ML model. For the learning part, I am using the scikit-learn library to build a RandomForestClassifier:

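from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(n_estimators=100, max_depth=20,
                            min_samples_split=2, max_features='auto',
                            oob_score=True, random_state=123456)
rf.fit(X, y)  # X is assumed to hold the training features, y the labels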

import pickle
# The with block closes the file so the pickle is fully flushed to disk
with open('model.pckl', 'wb') as fp:
    pickle.dump(rf, fp, protocol=2)

I uploaded this model to S3 and fetch it in AWS Lambda using the boto3 library:

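import pickle
import uuid

import boto3

s3_client = boto3.client('s3')

bucket = 'mlbucket'
key = 'model.pckl'
# /tmp is the writable directory in Lambda; the UUID keeps file names unique
download_path = '/tmp/{}{}'.format(uuid.uuid4(), key)
s3_client.download_file(bucket, key, download_path)
with open(download_path, 'rb') as f:
    model = pickle.load(f)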

However, the line model = pickle.load(f) raises ValueError: non-string names in Numpy dtype unpickling.

Here's the log:

START RequestId: 3d8a1263-1e3c-11e8-8bdb-03c0ef524c0e Version: $LATEST
non-string names in Numpy dtype unpickling: ValueError
Traceback (most recent call last):
File "/var/task/", line 31, in handler
  model = pickle.load(f)
File "/usr/lib64/python2.7/", line 1384, in load
  return Unpickler(file).load()
File "/usr/lib64/python2.7/", line 864, in load
File "/usr/lib64/python2.7/", line 1223, in load_build
ValueError: non-string names in Numpy dtype unpickling

I am using Python 2.7 both on my local machine and on AWS Lambda. The odd part is that pickle.load() works fine on my local machine. I tested it locally with this code:

with open('/home/Documents/model.pckl', 'rb') as f:
    rf = pickle.load(f)


The problem turned out to be a library version mismatch.

The libraries that I zipped and uploaded to AWS Lambda (numpy, scipy, etc.) were the latest versions, whereas the libraries on my local machine were older. As soon as I updated the libraries on my local machine, rebuilt the pickle files and re-uploaded them to S3, the Lambda function started working fine.

So it turns out that not only the Python version but also the library versions matter when pickling objects: the environment that loads a pickle needs library versions compatible with the one that created it.
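
One way to catch this kind of mismatch early is to store the library versions next to the model and compare them before unpickling. The following is only a sketch under some assumptions: the helper names (collect_versions, dump_with_versions, load_with_versions) are hypothetical and not part of pickle, numpy or scikit-learn, and an exact version match is stricter than may be necessary in practice. The model is stored as nested bytes, so the outer pickle contains only built-in types and can be read even when the numpy versions differ.

```python
import importlib
import pickle
import sys


def collect_versions(module_names):
    """Return {name: version} for the Python runtime and each listed module."""
    versions = {"python": "{}.{}".format(*sys.version_info[:2])}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            # Not every module exposes __version__, so fall back to a marker.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


def dump_with_versions(obj, path, module_names, protocol=2):
    """Pickle obj together with the versions it was built under (hypothetical helper)."""
    payload = {
        "versions": collect_versions(module_names),
        # Nested bytes: the outer pickle holds only built-in types.
        "object": pickle.dumps(obj, protocol=protocol),
    }
    with open(path, "wb") as fp:
        pickle.dump(payload, fp, protocol=protocol)


def load_with_versions(path, module_names):
    """Unpickle obj only if the recorded versions match the current environment."""
    with open(path, "rb") as fp:
        payload = pickle.load(fp)
    saved = payload["versions"]
    current = collect_versions(module_names)
    mismatches = {
        name: (saved.get(name), current[name])
        for name in current
        if saved.get(name) != current[name]
    }
    if mismatches:
        details = ", ".join(
            "{}: saved {} vs current {}".format(name, old, new)
            for name, (old, new) in sorted(mismatches.items())
        )
        raise RuntimeError("Version mismatch before unpickling: " + details)
    return pickle.loads(payload["object"])
```

For the case above, this would mean calling dump_with_versions(rf, 'model.pckl', ['numpy', 'scipy', 'sklearn']) on the local machine and load_with_versions(download_path, ['numpy', 'scipy', 'sklearn']) in the Lambda handler, so that a mismatch shows up as a readable error naming the libraries instead of the ValueError from the traceback.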

