Java gateway process exited before sending its port number Spark
After a week of looking for different ways to solve this exception, I finally found another tutorial that solved my problem. The answer is that Anaconda was the problem, even though the environment variables and paths were the same. I then installed Python and Jupyter Notebook directly on Windows (without Anaconda), and the issue was solved.
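Since the answer hinges on the variables and paths being "the same" inside and outside Anaconda, a quick way to check what a given Python process actually sees is to print the relevant variables from it. This is a minimal sketch of my own, not part of the original answer:

```python
import os

# Print the variables this Python process actually sees, so a mismatch
# between the Anaconda environment and the system environment is easy to spot.
for var in ("JAVA_HOME", "SPARK_HOME", "PATH"):
    print(f"{var} = {os.environ.get(var, '<not set>')}")
```

Running this in both the working shell and the failing notebook kernel makes any difference between the two environments visible immediately.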
mikesneider
I'm a master's student in systems engineering. I work with satellite positioning using the GPS and GLONASS constellations; I like math, stats, and computer science.
Updated on August 03, 2022

Comments
- mikesneider, over 1 year ago
I am trying to install Spark on my Windows 10 machine with Anaconda, but I get an error when I try to run pyspark in my Jupyter Notebook. I am following the steps in this tutorial. I have already downloaded and installed Java 8, Spark 3.0.0, and Hadoop 2.7.
I have already set the paths for SPARK_HOME and JAVA_HOME, and included the '/bin' paths in the PATH environment variable.
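For reference, the same variables can also be set per session from inside the notebook before anything imports Spark. The paths below are hypothetical examples based on the versions mentioned, not the poster's confirmed install locations:

```python
import os

# Hypothetical install locations -- adjust to your machine.
os.environ["JAVA_HOME"] = r"C:\Program Files\Java\jdk1.8.0_251"
os.environ["SPARK_HOME"] = r"c:\spark\spark-3.0.0-preview2-bin-hadoop2.7"

# Prepend both bin directories so the launcher scripts and java.exe are found.
os.environ["PATH"] = (
    os.path.join(os.environ["JAVA_HOME"], "bin") + os.pathsep
    + os.path.join(os.environ["SPARK_HOME"], "bin") + os.pathsep
    + os.environ["PATH"]
)
```

Setting them this way only affects the current kernel, which can help isolate whether the system-wide variables are the problem.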
```
C:\Users\mikes>java -version
java version "1.8.0_251"
Java(TM) SE Runtime Environment (build 1.8.0_251-b08)
Java HotSpot(TM) 64-Bit Server VM (build 25.251-b08, mixed mode)
```
In the Anaconda PowerShell prompt, pyspark works:
```
(base) PS C:\Users\mikes> pyspark
Python 3.6.5 |Anaconda, Inc.| (default, Mar 29 2018, 13:32:41) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
20/06/05 07:14:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 3.0.0-preview2
      /_/

Using Python version 3.6.5 (default, Mar 29 2018 13:32:41)
SparkSession available as 'spark'.
>>> nums = sc.parallelize([1,2,3,4])
>>> nums.map(lambda x: x*x).collect()
[1, 4, 9, 16]
>>>
```
The next step is to run pyspark in my Jupyter Notebook. I have already installed findspark; then my code to get started is:

```python
import findspark
findspark.init('c:\spark\spark-3.0.0-preview2-bin-hadoop2.7')
# findspark.init() without an argument doesn't work; it is necessary to write the path
findspark.find()

import pyspark
from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession

conf = pyspark.SparkConf().setAppName('appName').setMaster('local')
sc = pyspark.SparkContext(conf=conf)  # Here is the error
spark = SparkSession(sc)
```
The error it shows:

```
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-6-c561ad39905c> in <module>()
      4 conf = pyspark.SparkConf().setAppName('appName').setMaster('local')
      5 sc = pyspark.SparkConf()
----> 6 sc = pyspark.SparkContext(conf=conf)
      7 spark = SparkSession(sc)

c:\spark\spark-3.0.0-preview2-bin-hadoop2.7\python\pyspark\context.py in __init__(self, master, appName, sparkHome, pyFiles, environment, batchSize, serializer, conf, gateway, jsc, profiler_cls)
    125                     " is not allowed as it is a security risk.")
    126
--> 127         SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
    128         try:
    129             self._do_init(master, appName, sparkHome, pyFiles, environment, batchSize, serializer,

c:\spark\spark-3.0.0-preview2-bin-hadoop2.7\python\pyspark\context.py in _ensure_initialized(cls, instance, gateway, conf)
    317         with SparkContext._lock:
    318             if not SparkContext._gateway:
--> 319                 SparkContext._gateway = gateway or launch_gateway(conf)
    320                 SparkContext._jvm = SparkContext._gateway.jvm
    321

c:\spark\spark-3.0.0-preview2-bin-hadoop2.7\python\pyspark\java_gateway.py in launch_gateway(conf, popen_kwargs)
    103
    104     if not os.path.isfile(conn_info_file):
--> 105         raise Exception("Java gateway process exited before sending its port number")
    106
    107     with open(conn_info_file, "rb") as info:

Exception: Java gateway process exited before sending its port number
```
I saw another question similar to this one, but maybe my situation is different, because I have already tried the solutions given there:

- Set another value for PYSPARK_SUBMIT_ARGS, but I do not know if I am doing it wrong:

```python
os.environ['PYSPARK_SUBMIT_ARGS'] = "--master spark://localhost:8888"
```

- Set the paths for JAVA_HOME and SPARK_HOME (already did it)
- Install Java 8 (not 10)

I have already spent some hours trying, and even reinstalled Anaconda because I deleted an environment.