Welcome back! In one of our previous tutorials, we learned how to use Microsoft Visual Studio Code to build our first containers. In that tutorial, we constructed a container based on an Alpine image, version 3.14. Because that image is so small, however, it did not ship with tools such as Python or Miniconda. Without Conda, we are unable to create a Conda environment! So let’s revisit that blog post and see how we can build a customized environment within a Docker container.
The main idea behind the setup we’ll be building is this:
- Build a single Docker container; the container’s image should have Conda installed.
- Our project will have a smee directory, which will house an installation directory that we copy into the container.
- Within the installation directory, we have two new files: environment.yml and requirements.txt.
- The container will be built using Docker.
The steps are pretty straightforward, but some of our files could use some additional explanation. In the interest of transparency, our folder hierarchy will look something like this. We have a folder (“docker_intro”) housing a Docker file (dockerfile), a Compose file (docker-compose.yml), and a subdirectory titled smee. The smee folder contains a folder titled installation; in turn, /smee/installation houses two new files: one to define our environment (environment.yml) and one to define other packages we’d like installed (requirements.txt). Now we need to describe the contents of each file!
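If you’d like to scaffold this layout from a terminal, here is a minimal sketch. It assumes a POSIX-style shell with mktemp available; the scratch directory and the empty placeholder files are my own choices for illustration, and the real contents of each file are described in the rest of this post.

```shell
#!/bin/sh
# Build the folder hierarchy inside a throwaway directory (assumption: we
# do not want to touch an existing project while experimenting).
scratch="$(mktemp -d)"
project="$scratch/docker_intro"

# smee/installation will hold the files that define our environment.
mkdir -p "$project/smee/installation"

# Empty placeholders; fill them in as each file is introduced.
touch "$project/dockerfile" \
      "$project/docker-compose.yml" \
      "$project/smee/installation/environment.yml" \
      "$project/smee/installation/requirements.txt"

# Print the resulting tree, one path per line.
(cd "$project" && find . | sort)
```

Swapping the scratch directory for a real project path is all it takes to use this for an actual build folder.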
Since we now want a Conda environment, we’ll use the latest miniconda image from the continuumio organization on Docker Hub. We won’t be using a multi-stage build here, so we’ll just import the image as is.
We also need to copy files from our host directory into the container. So, we define and enter the working directory /installation in our container and copy everything from our host’s smee/installation folder into it. Next, we ask Conda to create the environment described in environment.yml. Finally, in a single RUN step, we initialize Conda for bash, update Conda, activate the new environment, and use pip to install the packages listed in requirements.txt.
    # Use the latest miniconda image
    FROM continuumio/miniconda:latest

    # Define & enter the installation directory within our container
    WORKDIR /installation

    # Copy contents of the local installation directory into the container directory
    COPY ./smee/installation/ .

    # Create a Conda environment based on our YAML file
    RUN conda env create -f environment.yml

    # Update conda, activate the new environment, and install our required packages
    RUN conda init bash \
        && . ~/.bashrc \
        && conda update conda \
        && conda activate myenv \
        && pip install -r /installation/requirements.txt
This is the first YAML file we’ve encountered so far (YAML originally stood for “Yet Another Markup Language” and now stands for “YAML Ain’t Markup Language”), but it’s still pretty readable. Coincidentally, this is also one of the benefits of YAML files! In the first section, we define our environment name (myenv), and we define channels. In this context, channels are the locations where Conda looks for the packages and dependencies we define later in our YAML file.
The dependencies list follows a basic format: each line is one package, written as - <package name>=<version number>=<build string>. Using this format, we can list any number of packages, each pinned to a very particular release and build. Finally, the file ends with a prefix line. This records where the environment lived on the machine it was exported from; it does not decide where the environment is installed here. As the conda list output later in this post shows, the environment ends up under /opt/conda/envs/myenv inside the container. Here, we only show a handful of packages to illustrate the point; these packages include things like HDF5, NumPy, OpenCV, and the MATLAB-style plotting library, Matplotlib.
    name: myenv
    channels:
      - defaults
    dependencies:
      - hdf5=1.10.2=hba1933b_1
      - imageio=2.9.0=pyhd3eb1b0_0
      - matplotlib=3.3.4=py37h06a4308_0
      - matplotlib-base=3.3.4=py37h62a2d02_0
      - numpy=1.20.1=py37h93e21f0_0
      - numpy-base=1.20.1=py37h7d8b39e_0
      - opencv=3.4.2=py37h6fd60c2_1
    prefix: /proj/myenv/users/yla0111/anaconda3/envs/myenv
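To make the three-part pin format concrete, here is a small sketch that splits one dependency entry from the listing above into its name, its version, and the trailing build identifier, using plain POSIX parameter expansion. The variable names are hypothetical, and nothing here is needed by Conda itself; it is only meant to show which part is which.

```shell
#!/bin/sh
# One dependency entry, copied from the environment file in this post.
spec="opencv=3.4.2=py37h6fd60c2_1"

name="${spec%%=*}"       # everything before the first "="
rest="${spec#*=}"        # everything after the first "="
version="${rest%%=*}"    # the middle field
build="${rest#*=}"       # the last field

printf 'package: %s\nversion: %s\nbuild:   %s\n' "$name" "$version" "$build"
```

Running this prints the package name (opencv), the version (3.4.2), and the build (py37h6fd60c2_1) on separate lines.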
Required packages file:
Similar to the YAML file, the requirements.txt file simply lists additional packages we would like to be installed within our Conda environment. It really is just as simple as listing more packages we need to have, but this time we do not need to specify package versions! Because our Docker file installs these packages with pip, pip will grab the latest compatible version of any listed package that is not already installed. This is a nice way to make sure that every time the container is built, we’re using the most up-to-date packages we can get.
    numpy
    scipy
    xarray
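Since a bare name in requirements.txt carries no version constraint, it can be handy to see at a glance which lines are pinned and which are not. Here is a minimal sketch of such a check; it writes a sample copy of our file to a temporary location (a hypothetical path chosen by mktemp) and only recognizes a few common specifiers, so it is not a full parser for the requirements format.

```shell
#!/bin/sh
# Sample requirements file, matching the one used in this post.
reqs="$(mktemp)"
cat > "$reqs" <<'EOF'
numpy
scipy
xarray
EOF

unpinned=0
while IFS= read -r line; do
  case "$line" in
    ''|'#'*) continue ;;                                   # skip blanks and comments
    *'=='*|*'>='*|*'<='*|*'~='*) echo "pinned:   $line" ;; # has a version specifier
    *) echo "unpinned: $line"; unpinned=$((unpinned + 1)) ;;
  esac
done < "$reqs"

echo "$unpinned unpinned package(s)"
```

For our file, all three lines are reported as unpinned, which is exactly why each rebuild may pick up newer releases.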
Putting it all together: Docker
So now we’ve defined the files that will build our environment and install additional packages, along with the Docker file that will put all this together for us. Similar to our previous tutorial on building Docker containers in VS Code, we’ll tell Docker to build our image.
    vscode ➜ /com.docker $ docker build -t "environ:v0" .
    [+] Building 737.2s (10/10) FINISHED
     => [internal] load build definition from Dockerfile  0.0s
     => => transferring dockerfile: 726B  0.0s
     => [internal] load .dockerignore  0.0s
     => => transferring context: 2B  0.0s
     => [internal] load metadata for docker.io/continuumio/miniconda:latest  1.9s
     => [1/5] FROM docker.io/continuumio/miniconda:latest@sha256:fee1354ae2435522b9a8a79c5f1c406facc07ec5c44d730d8053600b37c924f0  154.5s
     < other jargon >
     => [2/5] WORKDIR /installation  0.6s
     => [3/5] COPY ./smee/installation/ .  0.0s
     => [4/5] RUN conda env create -f environment.yml  480.0s
     => [5/5] RUN conda init bash && . ~/.bashrc  87.7s
     => exporting to image  12.4s
     => => exporting layers  12.4s
     => => writing image sha256:e0d9f82b4c418ced4ae3dc34d971fd618286d532ebefaeae5eb75abd615f1dcc  0.0s
     => => naming to docker.io/library/environ:v0  0.0s
Fantastic! Everything seems to have run smoothly. Let’s go ahead and enter the container and have a look around using docker run -it environ:v0. When we do this, we get a prompt with an interesting prefix: our prompt (root@5d8ba7fdfb87) is prefaced with (base). This tells us that we have entered the container, but we are in the base environment, and not our Conda environment, myenv. Let’s have a look around anyway, shall we? Let’s go ahead and list all the packages in the base environment.
As we begin to look for the packages we specified in our requirements.txt file (e.g., xarray), we notice that they are not in this list! This makes sense, though, since we asked for those packages to be installed within the Conda environment myenv, and not the base environment. So packages like xarray do not appear among the packages of the base environment.
    # Run container
    vscode ➜ /com.docker $ docker run -it environ:v0
    (base) root@5d8ba7fdfb87:/installation# ls
    environment.yml  requirements.txt

    # Show all files in Base environment
    (base) root@279718979ef2:/installation# conda list
    # packages in environment at /opt/conda:
    #
    # Name                  Version       Build             Channel
    _libgcc_mutex           0.1           main
    _openmp_mutex           5.1           1_gnu
    asn1crypto              1.4.0         py_0
    ca-certificates         2022.4.26     h06a4308_0
    certifi                 2020.6.20     pyhd3eb1b0_3
    cffi                    1.12.3        py27h2e261b9_0
    chardet                 3.0.4         py27_1003
    colorama                0.4.4         pyhd3eb1b0_0
    conda                   4.7.12        py27_0
    conda-package-handling  1.6.0         py27h7b6447c_0
    cryptography            2.7           py27h1ba5d50_0
    enum34                  1.1.6         py27_1
    futures                 3.3.0         py27_0
    idna                    2.8           py27_0
    ipaddress               1.0.23        py_0
    libedit                 3.1.20210910  h7f8727e_0
    libffi                  3.4.2         h295c915_4
    libgcc-ng               11.2.0        h1234567_1
    libgomp                 11.2.0        h1234567_1
    libstdcxx-ng            11.2.0        h1234567_1
    ncurses                 6.3           h7f8727e_2
    openssl                 1.1.1o        h7f8727e_0
    pycosat                 0.6.3         py27h7b6447c_0
    pycparser               2.19          py27_0
    pyopenssl               19.0.0        py27_0
    pysocks                 1.7.1         py27_0
    python                  2.7.16        h9bab390_7
    readline                7.0           h7b6447c_5
    requests                2.22.0        py27_0
    ruamel_yaml             0.15.46       py27h14c3975_0
    setuptools              41.4.0        py27_0
    six                     1.16.0        pyhd3eb1b0_1
    sqlite                  3.30.0        h7b6447c_0
    tk                      8.6.12        h1ccaba5_0
    tqdm                    4.63.0        pyhd3eb1b0_0
    urllib3                 1.24.2        py27_0
    yaml                    0.1.7         had09818_2
    zlib                    1.2.12        h7f8727e_2
Now, let’s go ahead and activate our myenv environment and check out the same package list. First, we notice that when we use conda activate myenv, the (base) prefix is replaced with (myenv), so the environment was created correctly and we’ve been placed inside it!
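The prompt prefix is handy for humans, but a script cannot read the prompt. As a minimal sketch of a non-interactive check: Conda’s activation exports the CONDA_DEFAULT_ENV variable, so we can inspect that instead. The function name active_env and the fallback text are hypothetical.

```shell
#!/bin/sh
# Print the name of the active Conda environment, or a fallback message
# when no environment has been activated in this shell.
active_env() {
  printf '%s\n' "${CONDA_DEFAULT_ENV:-no environment active}"
}

active_env
```

Inside the container, we would expect this to print base right after docker run, and myenv once conda activate myenv has been issued.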
Furthermore, if we look at the packages installed in this environment (conda list), we have a much larger number of packages; among these packages are numpy, scipy, and xarray, the three modules we specifically requested to be installed within our Conda environment! Lastly, we confirm that the very specific version of, say, OpenCV that we specified within our environment.yml file (version 3.4.2) has also successfully been installed within our Conda environment within our container.
    # Change to myenv Conda environment
    (base) root@5d8ba7fdfb87:/# conda activate myenv
    (myenv) root@5d8ba7fdfb87:/# ls
    (myenv) root@279718979ef2:/installation# conda list
    # packages in environment at /opt/conda/envs/myenv:
    #
    # Name                  Version       Build             Channel
    _libgcc_mutex           0.1           main
    _openmp_mutex           5.1           1_gnu
    blas                    1.0           mkl
    bzip2                   1.0.8         h7b6447c_0
    ca-certificates         2022.4.26     h06a4308_0
    cairo                   1.16.0        h19f5f5c_2
    certifi                 2022.5.18.1   py37h06a4308_0
    cycler                  0.11.0        pyhd3eb1b0_0
    dbus                    1.13.18       hb2f20db_0
    expat                   2.4.4         h295c915_0
    ffmpeg                  4.0           hcdf2ecd_0
    fontconfig              2.13.1        h6c09931_0
    freeglut                3.0.0         hf484d3e_5
    freetype                2.11.0        h70c0345_0
    giflib                  5.2.1         h7b6447c_0
    glib                    2.69.1        h4ff587b_1
    graphite2               1.3.14        h295c915_1
    gst-plugins-base        1.14.0        h8213a91_2
    gstreamer               1.14.0        h28cd5cc_2
    harfbuzz                1.8.8         hffaf4a1_0
    hdf5                    1.10.2        hba1933b_1
    icu                     58.2          he6710b0_3
    imageio                 2.9.0         pyhd3eb1b0_0
    importlib-metadata      4.11.4        pypi_0            pypi
    intel-openmp            2021.4.0      h06a4308_3561
    jasper                  2.0.14        hd8c5072_2
    jpeg                    9e            h7f8727e_0
    kiwisolver              1.4.2         py37h295c915_0
    lcms2                   2.12          h3be6417_0
    ld_impl_linux-64        2.38          h1181459_1
    libffi                  3.3           he6710b0_2
    libgcc-ng               11.2.0        h1234567_1
    libgfortran-ng          7.3.0         hdf63c60_0
    libglu                  9.0.0         hf484d3e_1
    libgomp                 11.2.0        h1234567_1
    libopencv               3.4.2         hb342d67_1
    libopus                 1.3.1         h7b6447c_0
    libpng                  1.6.37        hbc83047_0
    libstdcxx-ng            11.2.0        h1234567_1
    libtiff                 4.2.0         h2818925_1
    libuuid                 1.0.3         h7f8727e_2
    libvpx                  1.7.0         h439df22_0
    libwebp                 1.2.2         h55f646e_0
    libwebp-base            1.2.2         h7f8727e_0
    libxcb                  1.15          h7f8727e_0
    libxml2                 2.9.14        h74e7548_0
    lz4-c                   1.9.3         h295c915_1
    matplotlib              3.3.4         py37h06a4308_0
    matplotlib-base         3.3.4         py37h62a2d02_0
    mkl                     2021.4.0      h06a4308_640
    mkl-service             2.4.0         py37h7f8727e_0
    mkl_fft                 1.3.1         py37hd3c417c_0
    mkl_random              1.2.2         py37h51133e4_0
    ncurses                 6.3           h7f8727e_2
    numpy                   1.20.1        py37h93e21f0_0
    numpy-base              1.20.1        py37h7d8b39e_0
    opencv                  3.4.2         py37h6fd60c2_1
    openssl                 1.1.1o        h7f8727e_0
    pandas                  1.3.5         pypi_0            pypi
    pcre                    8.45          h295c915_0
    pillow                  9.0.1         py37h22f2fdc_0
    pip                     21.2.2        py37h06a4308_0
    pixman                  0.40.0        h7f8727e_1
    py-opencv               3.4.2         py37hb342d67_1
    pyparsing               3.0.4         pyhd3eb1b0_0
    pyqt                    5.9.2         py37h05f1152_2
    python                  3.7.13        h12debd9_0
    python-dateutil         2.8.2         pyhd3eb1b0_0
    pytz                    2022.1        pypi_0            pypi
    qt                      5.9.7         h5867ecd_1
    readline                8.1.2         h7f8727e_1
    scipy                   1.7.3         pypi_0            pypi
    setuptools              61.2.0        py37h06a4308_0
    sip                     4.19.8        py37hf484d3e_0
    six                     1.16.0        pyhd3eb1b0_1
    sqlite                  3.38.3        hc218d9a_0
    tk                      8.6.12        h1ccaba5_0
    tornado                 6.1           py37h27cfd23_0
    typing_extensions       4.1.1         pyh06a4308_0
    wheel                   0.37.1        pyhd3eb1b0_0
    xarray                  0.20.2        pypi_0            pypi
    xz                      5.2.5         h7f8727e_1
    zipp                    3.8.0         pypi_0            pypi
    zlib                    1.2.12        h7f8727e_2
    zstd                    1.5.2         ha4553b6_0
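Scanning two long listings by eye is error-prone, so here is a minimal sketch of the same comparison done in the shell. The has_package and check helpers are hypothetical; has_package reads conda list style text on standard input and looks for an exact name match in the first column. The two sample listings are abbreviated copies of the output shown in this post, not fresh output.

```shell
#!/bin/sh
# Succeeds (exit 0) when package "$1" appears in the first column of the
# conda-list-style text supplied on standard input.
has_package() {
  awk -v pkg="$1" '$1 == pkg { found = 1 } END { exit(found ? 0 : 1) }'
}

# Abbreviated samples taken from the two listings in this post.
base_list='# Name Version Build Channel
python 2.7.16 h9bab390_7
six 1.16.0 pyhd3eb1b0_1'

myenv_list='# Name Version Build Channel
python 3.7.13 h12debd9_0
xarray 0.20.2 pypi_0 pypi'

# Report whether xarray is present in a labelled listing.
check() {
  if printf '%s\n' "$2" | has_package xarray; then
    echo "xarray found in $1"
  else
    echo "xarray missing from $1"
  fi
}

check base "$base_list"
check myenv "$myenv_list"
```

Against the real container, the listing could come from something like docker run --rm environ:v0 conda list -n myenv piped into has_package xarray. I have not run that exact command here, so treat it as an assumption.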
In this tutorial, we’ve changed things up a bit. We’ve swapped out our Alpine image for a Miniconda image from Docker Hub, and we’ve installed a custom-named Conda environment within a Docker container. To validate the environment’s existence, we entered the Docker container and compared the packages installed within the base environment and our custom-built environment, myenv. In doing so, we validated that packages from both our requirements.txt file and our environment.yml file existed only in our specified environment!
In our next tutorial, we’re going to take a step back to discuss how we can save, load, and move these images around. Given the functionality we’re starting to gain, it would be a shame to not share the images with others! Until then, thanks again for learning with me — we’re all in this together! If you’re enjoying the content, please feel free to Like, comment, and subscribe — see you next time!
(Header image: 3d studio by benzoix)