I'm trying to set up a simple Docker image (I'm quite new to Docker, so please correct my possible misconceptions) based on the public continuumio/anaconda3 image. Using the Docker ENV instruction it is possible to add the virtual environment path persistently to PATH. However, this does not change the selected environment listed under conda env list.
See the MWE:
FROM continuumio/anaconda3:latest
# update conda and setup environment
RUN conda update conda -y \
&& conda create -y -n testenv pip
ENV PATH /opt/conda/envs/testenv/bin:$PATH
RUN echo $PATH
RUN conda env list
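For illustration, the last RUN step prints something roughly like the following (exact names vary by conda version; older images call the root environment root rather than base). The asterisk still marks the root environment, not testenv:
# conda environments:
#
base                  *  /opt/conda
testenv                  /opt/conda/envs/testenv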
EDIT: I have developed a new, improved approach which works better than the "conda", "run" syntax.
A sample Dockerfile is available at this gist. It works by leveraging a custom entrypoint script to set up the environment before exec-ing the arguments of the RUN stanza.
A shell is (put very simply) a process which can act as an entrypoint for arbitrary programs. exec "$@" replaces the shell with the requested command, which inherits all of the environment of the parent process. In this case, this means we activate conda (which basically mangles a bunch of environment variables), then run /bin/bash -c CONTENTS_OF_DOCKER_RUN.
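The gist itself isn't reproduced here, but a minimal sketch of the idea looks like this (the script path /usr/local/bin/entrypoint.sh, the environment name testenv, and the conda install location /opt/conda are assumptions):
#!/bin/bash
# entrypoint.sh: activate the conda environment, then replace this shell
# with whatever command Docker passed in, so it inherits the activated env.
set -e
source /opt/conda/etc/profile.d/conda.sh
conda activate testenv
exec "$@"
and in the Dockerfile:
COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chmod +x /usr/local/bin/entrypoint.sh
# Route both RUN stanzas and the container's runtime command through the script
SHELL ["/usr/local/bin/entrypoint.sh", "/bin/bash", "-c"]
ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]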
Here is my previous approach, courtesy of Itamar Turner-Trauring; many thanks to them!
# Create the environment:
COPY environment.yml .
RUN conda env create -f environment.yml
# Set the default docker build shell to run as the conda wrapped process
SHELL ["conda", "run", "-n", "vigilant_detect", "/bin/bash", "-c"]
# Set your entrypoint to use the conda environment as well
ENTRYPOINT ["conda", "run", "-n", "myenv", "python", "run.py"]
Modifying ENV may not be the best approach since conda likes to take control of environment variables itself. Additionally, your custom conda env may activate other scripts to further modulate the environment.
This leverages conda run to "add entries to PATH for the environment and run any activation scripts that the environment may contain" before starting the new bash shell.
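For example, with that SHELL in place, any later RUN stanza executes inside the environment (a hedged illustration; myenv is the environment name assumed above):
RUN python -c "import sys; print(sys.executable)"
# should print a path under /opt/conda/envs/myenv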
Using conda inside Docker can be a frustrating experience, since both tools effectively want to monopolize the environment, and theoretically you shouldn't ever need conda inside a container. But deadlines and technical debt being a thing, sometimes you just gotta get it done, and sometimes conda is the easiest way to provision dependencies (looking at you, GDAL).
Piggybacking on ccauet's answer (which I couldn't get to work), and Charles Duffey's comment about there being more to it than just PATH, here's what will take care of the issue.
When activating an environment, conda sets the following variables, as well as a few that back up default values so they can be restored when deactivating the environment. Those backup variables have been omitted from the Dockerfile, as the root conda environment need never be used again. For reference, they are CONDA_PATH_BACKUP, CONDA_PS1_BACKUP, and _CONDA_SET_PROJ_LIB. Conda also sets PS1 in order to show (testenv) at the left of the terminal prompt line, which was likewise omitted. The following statements will do what you want.
ENV PATH /opt/conda/envs/testenv/bin:$PATH
ENV CONDA_DEFAULT_ENV testenv
ENV CONDA_PREFIX /opt/conda/envs/testenv
In order to shrink the number of layers created, you can also combine these commands into a single ENV instruction setting all the variables at once.
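A minimal sketch of that combined form, using the same paths and environment name as above:
ENV PATH=/opt/conda/envs/testenv/bin:$PATH \
    CONDA_DEFAULT_ENV=testenv \
    CONDA_PREFIX=/opt/conda/envs/testenv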
There may be some other variables that need to be set, based on the package. For example,
ENV GDAL_DATA /opt/conda/envs/testenv/share/gdal
ENV CPL_ZIP_ENCODING UTF-8
ENV PROJ_LIB /opt/conda/envs/testenv/share/proj
The easy way to get this information is to call printenv > root_env.txt in the root environment, activate testenv, then call printenv > test_env.txt, and examine diff root_env.txt test_env.txt.
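In concrete terms, the comparison looks something like this inside a container started from the image (source activate is the older activation syntax shipped with these images; newer conda uses conda activate, and the | sort is only there to keep the diff readable):
printenv | sort > root_env.txt
source activate testenv
printenv | sort > test_env.txt
diff root_env.txt test_env.txt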