ERROR yarn.ApplicationMaster: User class threw exception: java.lang.reflect.InvocationTargetException


It was a problem related to I/O with Snappy. When I submitted the job with the compression codec switched to LZ4, like this:

    ./bin/spark-submit --class com.demo.WordCountSimple --master yarn-cluster --num-executors 8 --executor-memory 4g --executor-cores 10 --conf spark.io.compression.codec=lz4 /users/hastimal/WordCountSimple.jar /test/sample.txt /test/output

it ran successfully! Thanks @zsxwing
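The same workaround can also be baked into the application instead of passed on the command line. A minimal sketch (spark.io.compression.codec is the standard Spark property; the object name WordCountLz4 is just illustrative, the rest mirrors the WordCountSimple code below):

    import org.apache.spark.{SparkConf, SparkContext}

    object WordCountLz4 {
      def main(args: Array[String]): Unit = {
        // Force LZ4 instead of the default Snappy codec for Spark's internal
        // compression (broadcast blocks, shuffle spill, etc.); equivalent to
        // passing --conf spark.io.compression.codec=lz4 to spark-submit.
        val conf = new SparkConf()
          .setAppName("WordCountSimple")
          .set("spark.io.compression.codec", "lz4")
        val sc = new SparkContext(conf)
        // ... rest of the job as in WordCountSimple below ...
        sc.stop()
      }
    }

Properties set directly on SparkConf take precedence over flags passed to spark-submit, so this pins the codec regardless of how the job is launched.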


    I'm getting an error when I run a Spark job on a YARN cluster. I have built the jar and run it successfully several times before, but this time I can't run even a simple WordCount program. Here is the error I'm getting:

    16/04/06 20:38:13 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
    16/04/06 20:38:13 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
    16/04/06 20:38:13 ERROR yarn.ApplicationMaster: User class threw exception: null
    java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:68)
        at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
        at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
        at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:79)
        at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
        at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
        at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
        at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1051)
        at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:761)
        at org.apache.spark.SparkContext.textFile(SparkContext.scala:589)
        at com.demo.WordCountSimple$.main(WordCountSimple.scala:24)
        at com.demo.WordCountSimple.main(WordCountSimple.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:480)
    Caused by: java.lang.IllegalArgumentException
        at org.apache.spark.io.SnappyCompressionCodec.<init>(CompressionCodec.scala:152)
        ... 21 more
    16/04/06 20:38:13 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: null)
    16/04/06 20:38:13 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook
    16/04/06 20:38:13 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
    16/04/06 20:38:13 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
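
    The Caused by: java.lang.IllegalArgumentException raised in SnappyCompressionCodec.<init> is typically how Spark 1.6 reports that snappy-java could not load its native library on the node. A minimal standalone check, assuming snappy-java is on the classpath (Spark 1.6 ships it), would be:

    // Sketch: confirm whether the Snappy native library loads on this JVM.
    // Spark's SnappyCompressionCodec performs essentially this call and wraps
    // any failure in the IllegalArgumentException seen in the trace above.
    import org.xerial.snappy.Snappy

    object SnappyCheck {
      def main(args: Array[String]): Unit = {
        println("snappy native library version: " + Snappy.getNativeLibraryVersion)
      }
    }

    If this throws on the cluster nodes, the Snappy native library itself is the culprit, which is consistent with the LZ4 workaround in the accepted fix above.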
    

    I am using Spark 1.6.0 with Scala 2.11.7, and my build.sbt is as follows:

    import sbt.Keys._
    
    lazy val root = (project in file(".")).
      settings(
        name := "SparkTutorials",
        version := "1.0",
        scalaVersion := "2.11.7",
        mainClass in Compile := Some("WordCountSimple")
      )
    
    exportJars := true
    fork := true
    
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.6.0" % "provided",
      "org.apache.spark" %% "spark-streaming" % "1.6.0",
      "org.apache.spark" %% "spark-mllib" % "1.6.0",
    "org.apache.spark" %% "spark-sql" % "1.6.0")
    
    assemblyJarName := "WordCountSimple.jar"
    val meta = """META.INF(.)*""".r
    
    assemblyMergeStrategy in assembly := {
      case PathList("javax", "servlet", xs@_*) => MergeStrategy.first
      case PathList(ps@_*) if ps.last endsWith ".html" => MergeStrategy.first
      case n if n.startsWith("reference.conf") => MergeStrategy.concat
      case n if n.endsWith(".conf") => MergeStrategy.concat
      case meta(_) => MergeStrategy.discard
      case x => MergeStrategy.first
    }
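
    Note that only spark-core is marked "provided" above, so spark-streaming, spark-mllib, and spark-sql all get bundled into the assembly jar. On a yarn-cluster deployment where the cluster already ships Spark 1.6 (including its own snappy-java), a common alternative, sketched here rather than taken from the original post, is to mark every Spark module "provided" so the fat jar stays small and cannot shadow the cluster's classpath:

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"      % "1.6.0" % "provided",
      "org.apache.spark" %% "spark-streaming" % "1.6.0" % "provided",
      "org.apache.spark" %% "spark-mllib"     % "1.6.0" % "provided",
      "org.apache.spark" %% "spark-sql"       % "1.6.0" % "provided"
    )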
    

    I submit the jar like this:

    ./bin/spark-submit --class com.demo.WordCountSimple --master yarn-cluster --num-executors 8 --executor-memory 4g --executor-cores 10 /users/hastimal/WordCountSimple.jar /test/sample.txt /test/output
    

    I'm also doing some other work with Spark GraphX, and since that showed the same error, I decided to test a simple WordCount first. Still the same error. I followed a related link and a Stack Overflow answer, but no luck. Is there a problem in the jar, in the cluster, or in the dependencies? Please help!

    For reference, here is the code:

    package com.demo
    
    import java.util.Calendar
    
    import org.apache.spark.{SparkContext, SparkConf}
    
    /**
     * Created by hastimal on 3/14/2016.
     */
    object WordCountSimple {
      def main(args: Array[String]) {
        //System.setProperty("hadoop.home.dir","F:\\winutils")
        if (args.length < 2) {
          System.err.println("Usage: WordCountSimple <inputPath> <outputPath>")
          System.exit(1)
        }
        val inputPath = args(0)   //input path as variable argument
        val outputPath = args(1)  //output path as variable argument
        // Create a Scala Spark Context.
        val conf = new SparkConf().setAppName("WordCountSimple")
        val sc = new SparkContext(conf)
        val startTime = Calendar.getInstance().getTime()
        println("startTime "+startTime)
    //    val input = sc.textFile(inputPath,8)
        val input = sc.textFile(inputPath,4)
          // Split it up into words.
        val words = input.flatMap(line => line.split(" "))
        val counts = words.map(word => (word, 1)).reduceByKey { case (x, y) => x + y } // count occurrences of each word
        counts.saveAsTextFile(outputPath)
        //counts.foreach(println(_))
        val endTime = Calendar.getInstance().getTime()
        println("endTime "+endTime)
        val totalTime = endTime.getTime-startTime.getTime
        println("totalTime "+totalTime)
      }
    }