How to install a Python package with all its dependencies into a Docker image?

To answer the first question, "Where is the conda environment?": it lives in the container's own file system, not on the host. To see it, we just need to execute in the console $ docker exec my_containers_name ls /opt/conda.
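The same check can be made from inside a notebook cell. The following is a minimal sketch, not part of the original answer: `describe_environment` is a hypothetical helper name, and it only reports which interpreter the kernel runs and whether a package is visible to it. If the kernel uses the image's conda Python, the reported prefix would be expected to sit under /opt/conda.

```python
import importlib.util
import sys


def describe_environment(package_name):
    """Report the running interpreter and whether package_name can be imported."""
    # find_spec returns None when a top-level package is not installed
    spec = importlib.util.find_spec(package_name)
    return {
        "interpreter": sys.executable,  # path of the Python binary in use
        "prefix": sys.prefix,           # root of the active environment
        "package_found": spec is not None,
        "package_location": spec.origin if spec is not None else None,
    }


# Run this in the notebook that raises the ImportError
for key, value in describe_environment("folium").items():
    print(f"{key}: {value}")
```

If `package_found` is False, the package was installed somewhere the container's interpreter cannot see, for example on the host machine.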

The second question has two options:

  • We can open the container's console by executing the command

    $ docker exec -it my_containers_name /bin/bash

    and install the package like a normal conda package

    conda install --channel https://conda.anaconda.org/conda-forge folium

  • We can modify the Dockerfile of the Docker image or create a new one extending the previous one. To extend it, create a new Dockerfile and add the lines

    FROM jupyter/minimal-notebook
    USER jovyan
    RUN conda install --quiet --yes --channel https://conda.anaconda.org/conda-forge folium && conda clean -tipsy
    

    Then build the new image. If we modify the original Dockerfile instead, we must skip the first line.
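When more than one package is needed, the Dockerfile from the second option can be generated from a list. This is only a sketch under stated assumptions: `render_dockerfile` and the image tag `my-notebook` are hypothetical names, the base image is the one from the question, and `conda clean --all --yes` is used here in place of the `-tipsy` flags shown above. The script writes the file and prints a build command; it does not run Docker itself.

```python
import tempfile
from pathlib import Path


def render_dockerfile(base_image, packages, channel="conda-forge"):
    """Build Dockerfile text that extends base_image and installs packages with conda."""
    if not packages:
        raise ValueError("at least one package is required")
    install = "conda install --quiet --yes --channel {} {}".format(
        channel, " ".join(packages)
    )
    lines = [
        f"FROM {base_image}",
        "USER jovyan",  # same user as in the Dockerfile above
        f"RUN {install} && conda clean --all --yes",
    ]
    return "\n".join(lines) + "\n"


# Write the Dockerfile into a fresh build directory
dockerfile_text = render_dockerfile("jupyter/pyspark-notebook", ["folium"])
build_dir = Path(tempfile.mkdtemp())
(build_dir / "Dockerfile").write_text(dockerfile_text)

print(dockerfile_text)
# Command to run by hand; "my-notebook" is a hypothetical tag
print(f"docker build -t my-notebook {build_dir}")
```

The resulting image can then be started with the same docker run command as before, replacing the image name with the new tag.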

I created my own Dockerfile by forking the original project.

Thanks warmoverflow and ShanShan for your comments

Author: pax

Updated on July 09, 2022

Comments

  • pax
    pax almost 2 years

    I'm working in Ubuntu 15.10 with the Docker container for Pyspark jupyter/pyspark-notebook. I need to install folium with all its dependencies and run a Pyspark script in the container. I successfully installed Docker, pulled the image and ran it with the command

    docker run -d -p 8888:8888 -p 4040:4040 -v /home/$MYUSER/$MYPROJECT:/home/jovyan/work jupyter/pyspark-notebook
    

    Then I ran the code example without any issues

    import pyspark
    sc = pyspark.SparkContext('local[*]')
    
    # do something to prove it works
    rdd = sc.parallelize(range(1000))
    rdd.takeSample(False, 5)
    

    I looked for the conda environment in /opt/conda (as it says in the documentation) but there is no conda in my /opt folder. Then, I installed miniconda3 and folium with all the dependencies as a normal Python package (no Docker involved).

    It doesn't work. When I run the image and try to import the package with import folium it doesn't find the folium package:

    ImportError                               Traceback (most recent call last)
    <ipython-input-1-af6e4f19ef00> in <module>()
    ----> 1 import folium
    
    ImportError: No module named 'folium'
    

    So the problem can be reduced to two questions:

    1. Where is the container's conda?
    2. How can I install the Python package I need into the container?
    • Xiongbing Jin
      Xiongbing Jin almost 8 years
      To install Python packages into a Docker container, you can either create a new Dockerfile FROM jupyter/pyspark-notebook and add conda install --quiet --yes 'folium', or just log in to the container with sudo docker exec -it container_id /bin/bash and install directly inside the container (first method preferred)
    • Shanoor
      Shanoor almost 8 years
      A Docker container is isolated: it doesn't see anything installed on your machine. You need a Dockerfile where you state the command to install folium, just like warmoverflow commented. Avoid the second method: changes made directly inside a running container live only in that container, so you lose them as soon as the container is removed and a new one is started from the image.
    • pax
      pax almost 8 years
      Thanks warmoverflow and ShanShan for your comments! I didn't understand that the container has its own file system. I did $ docker exec my_containers_name ls /opt/conda and found the conda environment