Does anyone have a fully compiled version of pandas that is compatible with AWS Lambda?
After searching around for a few hours, I cannot seem to find what I'm looki
# All the steps are done on an AWS EC2 Linux free-tier instance so that all the libraries are compatible with the Lambda environment
# Install the required packages into a dedicated folder
mkdir packages
cd packages
pip3 install -t . pandas
pip3 install -t . numpy --upgrade
pip3 install -t . wikipedia --upgrade
pip3 install -t . scikit-learn --upgrade
pip3 install -t . pickle-mixin --upgrade
pip3 install -t . fuzzywuzzy --upgrade
# Now remove all unnecessary files
sudo rm -rf *.whl *.dist-info __pycache__
# Go back up and create the directory layout that Lambda layers recognize
cd ..
sudo mkdir -p build/python/lib/python3.6/site-packages
# Now move all the files from the packages folder to the site-packages folder
sudo mv /home/ec2-user/packages/* build/python/lib/python3.6/site-packages/
# Now move into the build folder
cd build
# Now zip all the files, starting from the python folder down to site-packages
sudo zip -r python.zip .
# Upload the zip file to Lambda Layers
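For anyone who prefers to script the packaging step, here is a minimal Python sketch of the same idea: it zips an already-populated packages folder under the python/lib/pythonX.Y/site-packages/ prefix used in the steps above. The function name build_layer_zip and its parameters are my own placeholders, not part of any AWS tooling, and it only skips __pycache__ folders (it does not remove *.dist-info).

```python
import zipfile
from pathlib import Path


def build_layer_zip(packages_dir, zip_path, python_version="3.6"):
    """Zip the contents of packages_dir under python/lib/pythonX.Y/site-packages/.

    Hypothetical helper: zip_path should live outside packages_dir,
    otherwise the archive would try to include itself.
    """
    packages_dir = Path(packages_dir)
    prefix = Path("python") / "lib" / f"python{python_version}" / "site-packages"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(packages_dir.rglob("*")):
            rel = path.relative_to(packages_dir)
            # Skip directories and bytecode caches, which only add size
            if not path.is_file() or "__pycache__" in rel.parts:
                continue
            zf.write(path, (prefix / rel).as_posix())
    return zip_path
```

Calling build_layer_zip("/home/ec2-user/packages", "/home/ec2-user/python.zip") should give a zip with the same layout as the shell version, which can then be uploaded as a layer.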
@ashtonium's answer actually works and is most likely the easiest; however, a few additional steps are required. Also, pandas requires pytz (mentioned in the link provided by @b3rt0), so that package is needed as well.
unzip filename.whl
(Linux/macOS) python/lib/python3.7/site-packages/
(swap 3.7 for the version of your choice) python
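Since a .whl file is an ordinary zip archive, the unzip step can also be done from Python. This is only a rough sketch: unpack_wheel and layer_root are names I made up for illustration, and the target path simply mirrors the python/lib/python3.7/site-packages/ layout mentioned above.

```python
import zipfile
from pathlib import Path


def unpack_wheel(wheel_path, layer_root, python_version="3.7"):
    """Extract a wheel into <layer_root>/python/lib/pythonX.Y/site-packages/.

    Hypothetical helper; returns the site-packages directory it extracted into.
    """
    target = (
        Path(layer_root) / "python" / "lib"
        / f"python{python_version}" / "site-packages"
    )
    target.mkdir(parents=True, exist_ok=True)
    # A wheel is a zip file, so the standard zipfile module can read it
    with zipfile.ZipFile(wheel_path) as whl:
        whl.extractall(target)
    return target
```

You would call it once per wheel (for example once for pandas and once for pytz) with the same layer_root, then zip the resulting python folder.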
This is a very common question, I hope my solution helps.
Update on Aug 19, 2020:
Wheel files aren't available for all packages. In these cases you can skip to step 3, go into the site-packages folder, and install the package there with pip3 install PACKAGE_NAME -t . (no venv required). Some packages are easier than others. Psycopg2, for example, requires you to move only one of the two (as of this writing) package folders.
/Cheers
I believe you should be able to use the recent pandas version (or likely, the one on your machine). You can create a Lambda package with pandas yourself like this.
First, find where the pandas package is installed on your machine: open a Python terminal and type
import pandas
pandas.__file__
That should print something like '/usr/local/lib/python3.4/site-packages/pandas/__init__.py'.
Copy the pandas folder (in this example, '/usr/local/lib/python3.4/site-packages/pandas') and place it in your repository. Package your Lambda code with pandas like this:
zip -r9 my_lambda.zip pandas/
zip -9 my_lambda.zip my_lambda_function.py
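If you would rather not copy the folder by hand, the locate-and-zip steps can be sketched in Python roughly as follows. The helper names package_dir and zip_with_package are hypothetical, and this assumes the package is a regular directory-based package (as pandas is) that is importable from the interpreter you run the script with.

```python
import importlib.util
import zipfile
from pathlib import Path


def package_dir(name):
    """Return the directory of an installed package without importing it."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        raise ImportError(f"{name!r} is not an importable package")
    return Path(list(spec.submodule_search_locations)[0])


def zip_with_package(name, handler_file, zip_path):
    """Write the package folder plus the handler file into one zip archive."""
    src = package_dir(name)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*")):
            if path.is_file():
                # Keep the package folder name as the top-level entry
                arcname = Path(src.name) / path.relative_to(src)
                zf.write(path, arcname.as_posix())
        # The handler sits at the root of the archive, next to the package
        zf.write(handler_file, Path(handler_file).name)
    return zip_path
```

For example, zip_with_package("pandas", "my_lambda_function.py", "my_lambda.zip") should produce an archive with pandas/ and my_lambda_function.py at the top level, like the two zip commands above.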
You can also deploy your code to S3 and make your Lambda use the code from S3.
aws s3 cp my_lambda.zip s3://dev-code/projectx/lambda_packages/
Here's the repo that will get you started
I know the question was asked a couple of years ago, and Lambda was at a different stage back then.
I faced similar issues lately and thought it would be a good idea to add the newest solution here for future users facing the same problem.
It turns out that Amazon introduced the concept of layers at re:Invent 2018. It is a great feature. This post on Medium describes it much better than I could here: Creating New AWS Lambda Layer For Python Pandas Library
The easiest way to get pandas working in a Lambda function is to utilize Lambda Layers and AWS Data Wrangler. A Lambda Layer is a zip archive that contains libraries or dependencies. According to the AWS documentation, using layers keeps your deployment package small, making development easier.
The AWS Data Wrangler is an open source package that extends the power of pandas to AWS services.
Follow the instructions (under AWS Lambda Layer) here.
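Once the layer is attached, a quick way to check that pandas is actually visible to your function is a small smoke-test handler like the one below. This is just a sketch: the lambda_handler(event, context) signature is the usual Python handler convention, while the shape of the returned dictionary is my own choice for illustration.

```python
def lambda_handler(event, context):
    """Report whether pandas can be imported inside the Lambda runtime."""
    try:
        # The import succeeds only if the layer (or the package) provides pandas
        import pandas as pd
    except ImportError as exc:
        return {"statusCode": 500, "body": f"pandas is not importable: {exc}"}
    return {"statusCode": 200, "body": f"pandas {pd.__version__} loaded"}
```

If the test invocation returns the 500 response, the layer is probably not attached to the function or was built for a different Python version than the function's runtime.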