Environment variable setup in Windows for PySpark


The Spark documentation covers this. Don't be afraid to read it:

http://spark.apache.org/docs/1.6.0/configuration.html#environment-variables

Certain Spark settings can be configured through environment variables, which are read from ... conf\spark-env.cmd on Windows
...
PYSPARK_PYTHON   Python binary executable to use for PySpark in both driver and workers (default is python2.7 if available, otherwise python).
PYSPARK_DRIVER_PYTHON   Python binary executable to use for PySpark in driver only (default is PYSPARK_PYTHON).
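
Since the docs point at conf\spark-env.cmd on Windows, one option is to set these variables there so they persist across sessions. A minimal sketch, assuming Python 2.7 is installed at C:\Python27 (an assumption; point it at wherever your python.exe actually lives):

rem conf\spark-env.cmd -- read by Spark's launch scripts on Windows
rem C:\Python27 below is an assumed install location; adjust to your real python.exe
set PYSPARK_PYTHON=C:\Python27\python.exe
set PYSPARK_DRIVER_PYTHON=C:\Python27\python.exe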

Or, for just the current cmd session, try something like this:

set PYSPARK_PYTHON=C:\Python27\bin\python.exe
pyspark
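
If pyspark still complains, it is worth checking that the variable really points at an existing python.exe. A quick sanity check from the same cmd window (the paths are whatever you set above):

echo %PYSPARK_PYTHON%
rem see which python.exe, if any, Windows finds on the PATH
where python
rem run the configured interpreter directly to confirm the file exists
"%PYSPARK_PYTHON%" --version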
Author: Sri

Updated on June 24, 2022

Comments

  • Sri, almost 2 years ago

    I have Spark installed on my laptop, and I am able to execute the spark-shell command and open the Scala shell as shown below:

    C:\Spark1_6\spark-1.6.0-bin-hadoop2.6\bin>spark-shell
    scala>
    

    But when I try to execute the pyspark command:

    C:\Spark1_6\spark-1.6.0-bin-hadoop2.6\bin>pyspark
    

    I get the following error message:

    'python' is not recognized as an internal or external command

    I set the user 'Path' environment variable manually by appending:

    ";C:\Python27"

    I rebooted the laptop and still get the same error. Can anyone please help me fix this? Am I not updating the environment variable correctly?

    Versions: Spark 1.6.2, Windows 8.1

  • Sri, almost 7 years ago
    Hi Samson, I ran the command set PYSPARK_PYTHON=C:\Python\python-3.6.1-amd64.exe because the .exe file is in the 'C:\Python' folder. After that I entered pyspark, but I got a new window asking me to modify/repair/uninstall Python.
  • Sri, almost 7 years ago
    It looks like I had downloaded python-3.6.1-amd64.exe, which is not the correct one; I should have downloaded the Windows x86-64 MSI installer, so I installed that instead. Now I can see the Python27 folder with all the Python files and folders in it, and I tried to set the path as set PYSPARK_PYTHON=C:\Python27\bin\python.exe
  • Sri, almost 7 years ago
    I entered the pyspark command but am getting the error message 'The system cannot find the path specified.' Can anyone please let me know what the issue could be?
  • Sri, almost 7 years ago
    I set PYSPARK_PYTHON=C:\Python27\python.exe because my python.exe file is not in a bin folder, so I removed \bin from the path. After that the pyspark command worked and I could open the PySpark shell. Thanks
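
For reference, the sequence that finally worked in the comments above comes down to something like this (assuming a standard Windows Python 2.7 install, where python.exe sits directly in C:\Python27 rather than in a bin subfolder):

rem python.exe is directly under C:\Python27 on a standard Windows install
set PYSPARK_PYTHON=C:\Python27\python.exe
pyspark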