AWS Lambda not importing LXML

傲寒 2021-01-02 04:57

I am trying to use the LXML module within AWS Lambda and having no luck. I downloaded LXML using the following command:

pip install lxml -t folder

8 answers
  • 2021-01-02 04:59

    Extending on these answers, I found the following to work well.

    The punchline here is having python compile lxml with static libs, and installing in the current directory rather than site-packages.

    It also means you can write your Python code as usual, without needing a separate worker.py or fiddling with LD_LIBRARY_PATH.

    sudo yum groupinstall 'Development Tools'
    sudo yum -y install python36-devel python36-pip
    sudo ln -s /usr/bin/pip-3.6 /usr/bin/pip3
    mkdir lambda && cd lambda
    STATIC_DEPS=true pip3 install -t . lxml
    zip -r ~/deps.zip *
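
Before uploading, it can be worth confirming that the archive really contains lxml's compiled extension at its root. A minimal sketch, assuming an archive laid out like the deps.zip built above; `has_lxml_extension` is a hypothetical helper name, and it only inspects file names inside the zip:

```python
import zipfile


def has_lxml_extension(zip_path):
    """Return True if the archive holds a compiled lxml.etree module at its root.

    Hypothetical helper for illustration; it does not load the module.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Compiled modules look like lxml/etree.cpython-36m-x86_64-linux-gnu.so
            if name.startswith("lxml/etree") and name.endswith(".so"):
                return True
    return False
```

Calling `has_lxml_extension` with the path to the archive returns False if lxml ended up in a subfolder or was never built, which is a cheaper check than deploying and waiting for an import error.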
    

    To take it to the next level, use Serverless and Docker to handle everything. Here is a blog post demonstrating this: https://serverless.com/blog/serverless-python-packaging/

  • 2021-01-02 05:00

    I have solved this using the serverless framework and its built-in Docker feature.

    Requirement: You have an AWS profile in your .aws folder that can be accessed.

    First, install the serverless framework as described here. You can then create a configuration file using the command serverless create --template aws-python3 --name my-lambda. It will create a serverless.yml file and a handler.py with a simple "hello" function. You can check if that works with a sls deploy. If that works, serverless is ready to be worked with.

    Next, we'll need an additional plugin named "serverless-python-requirements" for bundling Python requirements. You can install it via sls plugin install --name serverless-python-requirements.

    This plugin is where all the magic happens that we need to solve the missing lxml package. In the custom->pythonRequirements section you simply have to add the dockerizePip: non-linux property. Your serverless.yml file could look something like this:

    service: producthunt-crawler
    
    provider:
      name: aws
      runtime: python3.8
    
    functions:
      hello:
        # some handler that imports lxml
        handler: handler.hello
    
    plugins:
      - serverless-python-requirements
    
    custom:
      pythonRequirements:
        fileName: requirements.txt
        dockerizePip: non-linux
    
        # Omits tests, __pycache__, *.pyc etc from dependencies
        slim: true
    

    This will run the bundling of the Python requirements inside a pre-configured Docker container. After this, you can run sls deploy to see the magic happen and then sls invoke -f hello (the function name from the serverless.yml above) to check that it works.

    If you have already deployed with serverless and add the dockerizePip: non-linux option afterwards, make sure to clean up your already built requirements with sls requirements clean. Otherwise, the previously built packages are simply reused.
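
The handler referenced by `handler: handler.hello` is not shown above. A minimal sketch of what it might contain, assuming an API-Gateway-style response; the sample XML and the response fields are illustrative only. The import sits inside the function so that a missing or incompatible build comes back as a readable error:

```python
import json


def hello(event, context):
    """Illustrative handler: parse a small document with lxml and report the result."""
    try:
        # Imported lazily so a missing or incompatible build is reported clearly.
        from lxml import etree
    except ImportError as exc:
        return {"statusCode": 500, "body": json.dumps({"error": str(exc)})}

    root = etree.fromstring(b"<items><item>1</item><item>2</item></items>")
    body = {
        "lxml_version": etree.__version__,
        "item_count": len(root.findall("item")),
    }
    return {"statusCode": 200, "body": json.dumps(body)}
```

A 500 response from this function usually means the requirements were bundled on the wrong platform.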

  • 2021-01-02 05:03

    lxml is very sensitive to its runtime environment.

    I fixed this issue by building the Lambda zip package in a python:3.x-slim container:

    pip install --target=. lxml
    zip -r lambda.zip lambda.py lxml
    

    The Python version of the container image must match the Python runtime version used in Lambda.

    Tested successfully with Python 3.6, 3.7 and 3.8.
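
One way to catch a version mismatch early is to run a small check inside the build container before packaging. A minimal sketch; `matches_lambda_runtime` is a hypothetical name, and the runtime string is assumed to follow the `python3.8` form used in the Lambda configuration:

```python
import sys


def matches_lambda_runtime(runtime, version_info=None):
    """Return True if the interpreter's major.minor matches a name like 'python3.8'."""
    if version_info is None:
        # Default to the interpreter running this check (the build container's).
        version_info = sys.version_info
    expected = "python{}.{}".format(version_info[0], version_info[1])
    return runtime == expected
```

Running `matches_lambda_runtime("python3.8")` inside the container before `pip install` tells you whether the wheels you are about to build will match the function's runtime.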

  • 2021-01-02 05:08

    AWS Lambda uses a special version of Linux (as far as I can see).

    Using "pip install a_package -t folder" is usually the right approach, as it packages your dependencies inside the archive that is sent to Lambda. However, the libraries, and especially the binary ones, have to be compatible with the OS and Python version running on Lambda.

    You could use the xml module included in Python: https://docs.python.org/2/library/xml.etree.elementtree.html
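
As a rough illustration of the standard-library route, here is a minimal handler sketch using `xml.etree.ElementTree`. The `"xml"` event key and the sample document are assumptions made for the example, not something Lambda provides:

```python
import xml.etree.ElementTree as ET


def handler(event, context):
    """Illustrative handler: extract item texts from an XML string in the event."""
    # "xml" is an assumed event key; a small default document is used otherwise.
    document = event.get("xml", "<items><item>a</item><item>b</item></items>")
    root = ET.fromstring(document)
    return [item.text for item in root.iter("item")]
```

ElementTree supports only a subset of XPath and has no XSLT support, so this only helps if the code does not depend on those lxml features.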

    If you really need lxml, this link gives some tricks on how to compile shared libraries for Lambda: http://www.perrygeo.com/running-python-with-compiled-code-on-aws-lambda.html

  • 2021-01-02 05:14

    I faced the same issue.

    The link posted by Raphaël Braud was helpful and so was this one: https://nervous.io/python/aws/lambda/2016/02/17/scipy-pandas-lambda/

    Using the two links I was able to successfully import lxml and other required packages. Here are the steps I followed:

    • Launch an EC2 instance with the Amazon Linux AMI
    • Run the following script to accumulate dependencies:

      set -e -o pipefail
      sudo yum -y upgrade
      sudo yum -y install gcc python-devel libxml2-devel libxslt-devel
      
      virtualenv ~/env && cd ~/env && source bin/activate
      pip install lxml
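      for dir in lib64/python2.7/site-packages \
           lib/python2.7/site-packages
      do
        if [ -d "$dir" ]; then
          pushd "$dir"; zip -r ~/deps.zip .; popd
        fi
      done
      mkdir -p local/lib
      # Copy the required .so files from /usr/lib64/ into local/lib/.
      # The list of files is not given here; substitute it for the placeholder.
      cp /usr/lib64/<required .so files> local/lib/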
      zip -r ~/deps.zip local/lib
      
    • Create handler and worker files as specified in the link. Sample file contents:

    handler.py

    import os
    import subprocess
    
    
    libdir = os.path.join(os.getcwd(), 'local', 'lib')
    
    def handler(event, context):
        command = 'LD_LIBRARY_PATH={} python worker.py '.format(libdir)
        output = subprocess.check_output(command, shell=True)
    
        print(output)
    
        return
    

    worker.py:

    import lxml
    
    def sample_function(input_string=None):
        return "lxml import successful!"
    
    if __name__ == "__main__":
        result = sample_function()
        print(result)
    
    • Add handler and worker to zip file.

    Here is how the structure of the zip file looks after the above steps:

    deps 
    ├── handler.py
    ├── worker.py 
    ├── local
    │   └── lib
    │       ├── libanl.so
    │       ├── libBrokenLocale.so
    │       ....
    ├── lxml
    │   ├── builder.py
    │   ├── builder.pyc
    │       ....
    ├── <other python packages>
    
    • Make sure you specify the correct handler name when creating the Lambda function. In the above example, it would be "handler.handler".
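
The steps above target Python 2.7. A sketch of the same handler idea for Python 3, passing LD_LIBRARY_PATH through the `env` argument of `subprocess` instead of building a shell string; `build_env` and `run_worker` are hypothetical helper names, and the `local/lib` layout is the one shown in the zip structure above:

```python
import os
import subprocess
import sys


def build_env(libdir, base_env=None):
    """Copy the environment and put libdir first on LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_LIBRARY_PATH")
    env["LD_LIBRARY_PATH"] = libdir + os.pathsep + existing if existing else libdir
    return env


def run_worker(script="worker.py", libdir=None):
    """Run the worker in a child interpreter that can see the bundled .so files."""
    if libdir is None:
        libdir = os.path.join(os.getcwd(), "local", "lib")
    return subprocess.check_output([sys.executable, script], env=build_env(libdir))


def handler(event, context):
    # check_output returns bytes on Python 3, so decode before printing.
    output = run_worker()
    print(output.decode())
```

Passing an argument list and an explicit environment avoids `shell=True` and keeps any library path already set by the runtime.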

    Hope this helps!

  • 2021-01-02 05:18

    I was able to get this working by following the readme on this page:

    1. With docker installed, run this command (replacing python3.8 with the version of python you are using for your lambda function, and lxml with the version of lxml you want to use)
      $ docker run -v $(pwd):/outputs -it lambci/lambda:build-python3.8 \
            pip install lxml -t /outputs/
      
    2. This will create a folder called lxml in your working directory, and possibly some other folders which you can ignore. Move the lxml folder to the same directory as the .py file you are using as your lambda handler.
    3. Zip up the .py file with the lxml folder, as well as any packages if you are using a virtualenv. I had a virtualenv and lxml already existed in my site-packages folder, so I had to delete it first. Here are the commands I ran (note that my virtualenv v-env folder was in the same directory as my .py file):
      FUNCTION_NAME="name_of_your_python_file"
      cd v-env/lib/python3.8/site-packages &&
      rm -rf lxml &&
      rm -rf lxml-4.5.1.dist-info &&
      zip -r9 ${OLDPWD}/${FUNCTION_NAME}.zip . &&
      cd ${OLDPWD} &&
      zip -g ${FUNCTION_NAME}.zip ${FUNCTION_NAME}.py && 
      zip -r9 ${FUNCTION_NAME}.zip lxml
      
    4. If you don't have a virtualenv or any other dependencies, you can just run
      FUNCTION_NAME="name_of_your_python_file"
      zip -g ${FUNCTION_NAME}.zip ${FUNCTION_NAME}.py && 
      zip -r9 ${FUNCTION_NAME}.zip lxml
      
    5. Upload ${FUNCTION_NAME}.zip to your lambda function and use as normal.

    More on creating a .zip file for lambda with a virtualenv here
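
The zip steps above can also be scripted with Python's standard `zipfile` module. A minimal sketch, assuming the lxml folder sits next to the handler file as in step 2; `build_package` is a hypothetical helper and does not reproduce the virtualenv handling from step 3:

```python
import os
import zipfile


def build_package(zip_path, handler_file, package_dirs):
    """Write the handler and each package directory into a deployment zip."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # The handler goes at the archive root, where Lambda looks for it.
        archive.write(handler_file, os.path.basename(handler_file))
        for package_dir in package_dirs:
            parent = os.path.dirname(os.path.abspath(package_dir))
            for folder, _subdirs, files in os.walk(package_dir):
                for filename in files:
                    full_path = os.path.join(folder, filename)
                    # Store paths relative to the package's parent, e.g. lxml/__init__.py
                    arcname = os.path.relpath(os.path.abspath(full_path), parent)
                    archive.write(full_path, arcname)
    return zip_path
```

For the example above, the call would take the zip name, the handler .py file and `["lxml"]` as the list of package folders.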
