How to install a different Python version in a Docker container


Looking at the Dockerfile here, it installs python3, which by default is Python 3.5 on debian:stretch. You can instead install Python 3.7 by editing the Dockerfile and building the image yourself. In your Dockerfile, remove lines 19-25, replace line 1 with the following, and then build the image locally.

FROM python:3.7-stretch

If you are not familiar with building your own image: download the Dockerfile and keep it in its own standalone directory, then cd into that directory and run the command below. You may want to remove the previously downloaded image first. After this you should be able to run other docker commands the same way as if you had pulled the image from Docker Hub.

docker build -t gettyimages/spark .
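
If you want to sanity-check the result, a rough sketch of the full workflow (assuming the tag used above, and that the image's default command can be overridden) is to rebuild and then print the interpreter version inside the new image:

docker rmi gettyimages/spark                            # remove the previously pulled image, if you had one
docker build -t gettyimages/spark .                     # rebuild from the edited Dockerfile
docker run --rm gettyimages/spark python3 --version     # should now report Python 3.7.x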

Comments

  • Question-er XDD, almost 2 years ago

    I installed the gettyimages/spark Docker image and jupyter/pyspark-notebook on my machine.

    However, because the gettyimages/spark Python version is 3.5.3 while the jupyter/pyspark-notebook Python version is 3.7, the following error comes up:

    Exception: Python in worker has different version 3.5 than that in driver 3.7, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
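
    (For reference, those two variables are usually exported in the environment that launches Spark; the paths below are only placeholders, not what my images actually use, and they only help when both point at the same minor Python version:)

    export PYSPARK_PYTHON=/usr/bin/python3           # interpreter the Spark workers run
    export PYSPARK_DRIVER_PYTHON=/usr/bin/python3    # interpreter the driver / notebook runs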

    So, I have tried either to upgrade the Python version of the gettyimages/spark image or to downgrade the Python version of the jupyter/pyspark-notebook image to fix this.

    • Let's talk about method 1 first: downgrading the jupyter/pyspark-notebook Python version.

    I used conda install python=3.5 to downgrade the Python version of the jupyter/pyspark-notebook image. However, after I did so, my Jupyter notebook could not connect to any .ipynb and the kernel seemed dead. Also, when I typed conda again, it showed conda: command not found, although the Python terminal still worked well. (I also wonder whether I should have created a separate environment instead; see the sketch at the end of this question.)

    I have compared sys.path before and after the downgrade:

    ['', '/usr/local/spark/python', '/usr/local/spark/python/lib/py4j-0.10.7-src.zip', '/opt/conda/lib/python35.zip', '/opt/conda/lib/python3.5', '/opt/conda/lib/python3.5/plat-linux', '/opt/conda/lib/python3.5/lib-dynload', '/opt/conda/lib/python3.5/site-packages']

    ['', '/usr/local/spark/python', '/usr/local/spark/python/lib/py4j-0.10.7-src.zip', '/opt/conda/lib/python37.zip', '/opt/conda/lib/python3.7', '/opt/conda/lib/python3.7/lib-dynload', '/opt/conda/lib/python3.7/site-packages']

    I think it is more or less correct. So why can I not use my Jupyter notebook to connect to the kernel?

    • So I used another method: I tried to upgrade the gettyimages/spark image.

    sudo docker run -it gettyimages/spark:2.4.1-hadoop-3.0 apt-get install python3.7.3 ; python3 -v

    However, I find that even after doing so, I cannot run Spark properly.

    I am not quite sure what to do. Could you share how to modify a Docker image's internal package versions?
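
    For what it is worth, regarding method 1 I am also wondering whether I should have created a separate conda environment and registered it as a Jupyter kernel instead of downgrading the base environment, roughly like this (the environment name py35 is just my own placeholder, and the /opt/conda prefix is what the sys.path output above suggests):

    conda create -y -n py35 python=3.5 ipykernel                    # separate env, leaves the base env untouched
    conda run -n py35 python -m ipykernel install --user --name py35 --display-name "Python 3.5 (py35)"
    export PYSPARK_PYTHON=/opt/conda/envs/py35/bin/python           # point Spark workers at that interpreter

    Would that have avoided breaking conda and the notebook kernels?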