How can I create a Docker image to run both Python and R?

后悔当初 2021-02-14 11:11

I want to containerise a pipeline of code that was predominantly developed in Python but has a dependency on a model that was trained in R. There are some additional dependencie

3 Answers
  • 2021-02-14 11:55

    Being specific about both the Python and R versions will save you future headaches. This approach, for instance, pins R to 4.0.3 through the base image tag and asks apt for Python 3.8:

    # The image tag pins the R version
    FROM r-base:4.0.3
    ENV DEBIAN_FRONTEND=noninteractive
    RUN apt-get update && apt-get install -y --no-install-recommends build-essential libpq-dev python3.8 python3-pip python3-setuptools python3-dev
    RUN pip3 install --upgrade pip
    
    ENV PYTHONPATH="${PYTHONPATH}:/app"
    WORKDIR /app
    
    # COPY is enough for local files; ADD is only needed for URLs and archive extraction
    COPY requirements.txt .
    COPY requirements.r .
    
    # installing python libraries
    RUN pip3 install -r requirements.txt
    
    # installing r libraries
    RUN Rscript requirements.r
    

    And your requirements.r file should look like this:

    install.packages('data.table')
    install.packages('jsonlite')
    ...
    
  • 2021-02-14 11:57

    The Dockerfile I built to run Python and R together, along with their dependencies, is:

    # Pinned instead of ubuntu:latest so the build stays reproducible.
    # Assumption: 18.04 was the release in use when this was written; its apt repositories ship python3.6.
    FROM ubuntu:18.04
    
    ENV DEBIAN_FRONTEND=noninteractive
    
    RUN apt-get update && apt-get install -y --no-install-recommends build-essential r-base r-cran-randomforest python3.6 python3-pip python3-setuptools python3-dev
    
    WORKDIR /app
    
    COPY requirements.txt /app/requirements.txt
    
    RUN pip3 install -r requirements.txt
    
    RUN Rscript -e "install.packages('data.table')"
    
    COPY . /app
    

    The commands to build the image, run the container (naming it SnakeR here), and execute the code are:

    docker build -t my_image .
    # -d keeps the container running in the background so that docker exec can reach it
    docker run -dit --name SnakeR my_image
    docker exec SnakeR /bin/sh -c "python3 test_call_r.py"
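
    The contents of test_call_r.py are not shown in this answer. As a minimal sketch only, assuming the script reaches R by launching Rscript as a child process, it could look like the following. The function names and the R expression are hypothetical, and the subprocess arguments are chosen so that it also runs on Python 3.6:

```python
import shutil
import subprocess


def build_rscript_command(expression):
    """Return the argv list that evaluates one R expression with Rscript."""
    return ["Rscript", "-e", expression]


def run_r(expression):
    """Evaluate an R expression in a child process and return its stdout."""
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH; is r-base installed?")
    result = subprocess.run(
        build_rscript_command(expression),
        stdout=subprocess.PIPE,   # collect output instead of printing it
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode bytes to str (works on Python 3.6)
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical smoke test: loading the packages shows they are in the image.
    print(run_r("library(randomForest); library(data.table); cat('R is reachable')"))
```

    Running the model in a separate R process keeps the Python side free of R bindings; a library such as rpy2 would be an alternative, but it is not part of the requirements described here.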
    

    I treated it like an Ubuntu OS and built the image as follows:

    • suppress the prompts for choosing your location during the R install (DEBIAN_FRONTEND=noninteractive);
    • refresh the apt package index (apt-get update);
    • set the installation options:
      • -y = answer yes to the prompts asking whether to proceed (e.g. after the disk space summary);
      • --no-install-recommends = install only the required dependencies, skipping recommended and suggested packages;
    • build-essential for the compilers and tools needed to build packages from source;
    • r-base for the R software;
    • r-cran-randomforest to force the package to be available (a separate install.packages call, like the one used for data.table, didn’t work for randomForest for some reason);
    • python3.6 for the Python version;
    • python3-pip to allow pip to be used to install the requirements;
    • python3-setuptools, which pip typically needs in order to build packages that ship as source distributions;
    • python3-dev for the Python 3 headers needed to compile C extensions during the JayDeBeApi installation in the requirements (without it, the install behaved as if it were targeting Python 2 rather than 3);
    • specify the active “working directory” to be the /app location;
    • copy the requirements file that holds the Python dependencies (built from the virtual environment of the Python codebase, e.g., with pip freeze);
    • install the Python packages from the requirements file (pip3 for Python 3);
    • install the R packages (e.g. just data.table here);
    • copy the directory contents to the specified working directory /app.
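
    One caveat about the R package step: install.packages generally reports a failed installation as a warning rather than an error, so docker build can finish successfully with a package missing. A minimal sketch of a check that could be run inside the container is shown below; the package list and function names are hypothetical, and it assumes Rscript is on the PATH:

```python
import subprocess
import sys

# Hypothetical package list; mirror whatever the Dockerfile installs.
R_PACKAGES = ["data.table", "randomForest"]


def namespace_check(package):
    """Build an R expression that exits with status 1 if the package is absent."""
    return "if (!requireNamespace('{}', quietly = TRUE)) quit(status = 1)".format(package)


def missing_packages(packages, runner=subprocess.call):
    """Return the packages whose check exits with a non-zero status.

    `runner` takes an argv list and returns an exit code; it is a parameter
    so that the logic can be exercised without R installed.
    """
    return [p for p in packages if runner(["Rscript", "-e", namespace_check(p)]) != 0]


if __name__ == "__main__":
    missing = missing_packages(R_PACKAGES)
    if missing:
        sys.exit("Missing R packages: " + ", ".join(missing))
    print("All R packages are installed.")
```

    Saved under a name of your choosing and invoked with python3 in a RUN step after the R installs, a script like this would make the build fail at the point where a package is missing instead of at run time.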

    This is replicated from my blog post at https://datascienceunicorn.tumblr.com/post/182297983466/building-a-docker-to-run-python-r

  • 2021-02-14 12:07

    I made an image for my personal projects; you can use it if you want: https://github.com/dipayan90/docker-python-r
