How to debug a Scala-based Spark program in IntelliJ IDEA
Solution 1
First, define an environment variable as below:
export SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=7777
Then create the Debug configuration in IntelliJ IDEA as follows:
Run -> Edit Configurations... -> click "+" in the top-left corner -> Remote -> set the port and a name
After the above configuration, run the Spark application with spark-submit or sbt run; because of suspend=y, the JVM waits for the debugger to attach. Then launch the Remote configuration you created and set breakpoints for debugging.
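Putting the steps above together, a terminal session might look like the sketch below (the jar path and class name are taken from the question further down; adjust them for your own project):

```shell
# Tell the JVM that spark-submit launches to wait for a debugger on port 7777.
export SPARK_SUBMIT_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=7777"
echo "$SPARK_SUBMIT_OPTS"

# Submit as usual; the JVM prints "Listening for transport dt_socket at address: 7777"
# and blocks until the IntelliJ Remote configuration (same port, 7777) attaches.
# spark-submit --class "MySpark" --master "local[4]" target/scala-2.11/myspark_2.11-1.0.jar
```

Because suspend=y, nothing runs until you click Debug in IntelliJ, so breakpoints set before attaching are always hit.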
Solution 2
If you're using the scala plugin and have your project configured as an sbt project, it should basically work out of the box.
Go to Run -> Edit Configurations... and add your run configuration normally. Since you have a main class, you probably want to add a new Application configuration. You can also just click on the blue square icon to the left of your main code.
Once your run configuration is set up, you can use the Debug feature.
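For an Application configuration to work, the program must pick a master itself, since there is no spark-submit supplying one. A sketch based on the questioner's program, with setMaster("local[2]") added (as one commenter below also suggests; the value "local[2]" is an assumption, any local[n] works):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MySpark {
  def main(args: Array[String]): Unit = {
    // setMaster("local[2]") lets the app run directly inside the IDE.
    // Remove it (or make it configurable) before deploying to a real cluster,
    // where spark-submit should supply the master instead.
    val conf = new SparkConf()
      .setAppName("Simple Application")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    val logData = sc.textFile("/IdeaProjects/hello/testfile.txt", 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    sc.stop()
  }
}
```

With the master set in code, clicking Debug on the Application configuration compiles, runs, and stops at breakpoints in one step, with no jar packaging needed.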
Solution 3
I've run into this when switching between 2.10 and 2.11. SBT expects the primary object to be in src/main/scala-2.10 or src/main/scala-2.11, depending on your Scala version.
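If you build against both versions, a minimal cross-building sketch for build.sbt (crossScalaVersions is a standard sbt setting; the version numbers here are assumptions, pick the ones you actually use):

```scala
// build.sbt (sketch): cross-build so sbt automatically adds the matching
// version-specific source directory (src/main/scala-2.10 or src/main/scala-2.11)
// to the compile path for each Scala version.
scalaVersion := "2.11.7"
crossScalaVersions := Seq("2.10.6", "2.11.7")
```

Prefixing an sbt command with "+" (for example "+ compile") runs it once per version listed in crossScalaVersions.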
lserlohn
Updated on June 23, 2022
Comments
-
lserlohn almost 2 years
I am currently setting up my development IDE with IntelliJ IDEA. I followed exactly the steps at http://spark.apache.org/docs/latest/quick-start.html
build.sbt file
name := "Simple Project"

version := "1.0"

scalaVersion := "2.11.7"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0"
Sample Program File
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object MySpark {
  def main(args: Array[String]) {
    val logFile = "/IdeaProjects/hello/testfile.txt"
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
If I use command line:
sbt package
and then
spark-submit --class "MySpark" --master local[4] target/scala-2.11/myspark_2.11-1.0.jar
I am able to generate jar package and spark runs well.
However, I want to use IntelliJ IDEA to debug the program in the IDE. How can I set up the configuration so that when I click "debug", it automatically generates the jar package and launches the task by executing the "spark-submit" command line?
I just want everything to be as simple as "one click" on the debug button in IntelliJ IDEA.
Thanks.
-
lserlohn over 7 years Thanks. I created an sbt project and put the script file there. If I run it directly, it says "Exception in thread "main" java.lang.ClassNotFoundException: MySpark". Could you let me know exactly how to set the parameters?
-
lserlohn over 7 years Thanks for answering. How can I find the port and name of my local Spark?
-
lserlohn over 7 years Is the port and name "localhost:7077"? I got "Error running Spark: Unable to open debugger port (localhost:7077): java.net.ConnectException "Connection refused""
-
Sandeep Purohit over 7 years You should not use the Spark master port. Simply use the same port as the address in SPARK_SUBMIT_OPTS above; there you can see address=7777, so set port 7777 in the Remote configuration and add a breakpoint in your code. Now run spark-submit: it shows a message that it is listening on port 7777. Then go to IntelliJ and click Debug on the configuration you created.
-
Krishna Pandey over 6 years I used the following to get it working: export SPARK_DAEMON_JAVA_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=<port_no>
-
ecoe about 6 yearsThis solution has a full step-by-step tutorial (with pictures) here: bigendiandata.com/…
-
ammills01 almost 6 years @lserlohn when you set up your SparkConf(), you have to set the master to "local[2]" before you pass your SparkConf into the SparkContext. If your SparkConf were named sparkConf, it would look like this: sparkConf.setMaster("local[2]"). This only applies to debugging through the IDE; if you leave it in by accident when you deploy your code to the server, it will not behave correctly.
-
Dims about 3 years How to ensure that the source code of the Spark installed is the same as that used inside the jar?