I want to containerise a pipeline of code that was predominantly developed in Python but has a dependency on a model that was trained in R. There are some additional dependencies.
Being specific about both the Python and R versions will save you future headaches. The Dockerfile below, for instance, will always install R v4.0.3 and Python v3.8:
FROM r-base:4.0.3
# Avoid interactive prompts from apt during the build
ENV DEBIAN_FRONTEND=noninteractive
# Install system build tools plus Python 3.8 alongside the R that ships with the base image
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential libpq-dev \
    python3.8 python3-pip python3-setuptools python3-dev
RUN pip3 install --upgrade pip
# Make /app importable by Python from anywhere in the container
ENV PYTHONPATH="${PYTHONPATH}:/app"
WORKDIR /app
COPY requirements.txt .
COPY requirements.r .
# Install the Python libraries
RUN pip3 install -r requirements.txt
# Install the R libraries
RUN Rscript requirements.r
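The requirements.txt next to it is just an ordinary pip requirements file; the package names below are only placeholders, so pin whatever your Python pipeline actually imports, in the same spirit:
pandas==1.1.5
numpy==1.19.5
...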
And your requirements.r file should look something like this:
install.packages('data.table', repos='https://cloud.r-project.org')
install.packages('jsonlite', repos='https://cloud.r-project.org')
...
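Once the image is built, the Python side of the pipeline still needs a way to call the R model. Here is a minimal sketch of one way to do that by shelling out to Rscript; the names predict.R, input.json and output.json are assumptions for illustration, not part of your project, so adapt them to however the R model is actually serialised and invoked.
# call_r_model.py -- minimal sketch, assuming the R model is exposed
# through a hypothetical predict.R script that reads input.json and
# writes output.json
import json
import subprocess

def predict_with_r_model(records):
    # Hand the inputs to the R script via a JSON file
    with open("input.json", "w") as f:
        json.dump(records, f)
    # Rscript is already on the PATH because the image is built FROM r-base
    subprocess.run(["Rscript", "predict.R", "input.json", "output.json"], check=True)
    # Read back whatever predict.R produced
    with open("output.json") as f:
        return json.load(f)
Shelling out like this keeps the two runtimes decoupled; if you would rather load the model in-process, rpy2 is the usual alternative, at the cost of one more dependency to pin.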