Only one SparkContext may be running in this JVM - [SPARK]

A Spark shell already prepares a SparkSession or SparkContext for you to use, so you don't have to (and can't) initialize a new one. Usually a line near the end of the spark-shell launch output tells you under what variable it is available. allowMultipleContexts exists only for testing some Spark functionality and shouldn't be used in most cases.
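
As a minimal sketch of what that means in practice - assuming you are inside spark-shell, where the shell's predefined sc variable holds the running SparkContext, and the spark-streaming jars are on the classpath - wrap the existing context instead of constructing a second one:

    // Inside spark-shell: `sc` is the SparkContext the shell already created.
    // Do NOT call `new SparkContext(...)` - reuse the shell's context instead.
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(2))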


Comments

  • trick15f almost 2 years

    I'm trying to run the following code to get Twitter information live:

    import org.apache.spark._
    import org.apache.spark.streaming._
    import org.apache.spark.streaming.twitter._
    import org.apache.spark.streaming.StreamingContext._
    import twitter4j.auth.Authorization
    import twitter4j.Status
    import twitter4j.auth.AuthorizationFactory
    import twitter4j.conf.ConfigurationBuilder
    import org.apache.spark.streaming.api.java.JavaStreamingContext
    
    import org.apache.spark.rdd.RDD
    import org.apache.spark.SparkContext
    import org.apache.spark.mllib.feature.HashingTF
    import org.apache.spark.mllib.linalg.Vector
    import org.apache.spark.SparkConf
    import org.apache.spark.api.java.JavaSparkContext
    import org.apache.spark.api.java.function.Function
    import org.apache.spark.streaming.Duration
    import org.apache.spark.streaming.api.java.JavaDStream
    import org.apache.spark.streaming.api.java.JavaReceiverInputDStream
    
    val consumerKey = "xxx"
    val consumerSecret = "xxx"
    val accessToken = "xxx"
    val accessTokenSecret = "xxx"
    val url = "https://stream.twitter.com/1.1/statuses/filter.json"
    
    val sparkConf = new SparkConf().setAppName("Twitter Streaming")
    val sc = new SparkContext(sparkConf)
    
    val documents: RDD[Seq[String]] = sc.textFile("").map(_.split(" ").toSeq)
    
    
    // Twitter Streaming
    val ssc = new JavaStreamingContext(sc,Seconds(2))
    
    val conf = new ConfigurationBuilder()
    conf.setOAuthAccessToken(accessToken)
    conf.setOAuthAccessTokenSecret(accessTokenSecret)
    conf.setOAuthConsumerKey(consumerKey)
    conf.setOAuthConsumerSecret(consumerSecret)
    conf.setStreamBaseURL(url)
    conf.setSiteStreamBaseURL(url)
    
    val filter = Array("Twitter", "Hadoop", "Big Data")
    
    val auth = AuthorizationFactory.getInstance(conf.build())
    val tweets : JavaReceiverInputDStream[twitter4j.Status] = TwitterUtils.createStream(ssc, auth, filter)
    
    val statuses = tweets.dstream.map(status => status.getText)
    statuses.print()
    ssc.start()
    

    But when it reaches the command val sc = new SparkContext(sparkConf), the following error appears:

    17/05/09 09:08:35 WARN SparkContext: Multiple running SparkContexts detected in the same JVM! org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true.

    I have tried to add the following parameters to the sparkConf value, but the error still appears:

    val sparkConf = new SparkConf().setAppName("Twitter Streaming").setMaster("local[4]").set("spark.driver.allowMultipleContexts", "true")
    

    If I ignore the error and continue running commands I get this other error:

    17/05/09 09:15:44 WARN ReceiverSupervisorImpl: Restarting receiver with delay 2000 ms: Error receiving tweets 401:Authentication credentials (https://dev.twitter.com/pages/auth) were missing or incorrect. Ensure that you have set valid consumer key/secret, access token/secret, and the system clock is in sync. \n\n\nError 401 Unauthorized HTTP ERROR: 401

    Problem accessing '/1.1/statuses/filter.json'. Reason:Unauthorized

    Any help is appreciated. Regards, and have a good day.

  • trick15f about 7 years
    So the solution would be to omit the following commands: val sparkConf = new SparkConf().setAppName("Twitter Streaming") and val sc = new SparkContext(sparkConf)? Thanks for the clarification.
  • Rick Moritz about 7 years
    Yes - depending on your Spark version, you may also have to substitute sc with spark.sparkContext (if Spark >= 2.0); see the sketch below.
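
For reference, a minimal sketch of that substitution in a Spark >= 2.0 shell, assuming the predefined spark variable (a SparkSession) and the spark-streaming artifacts on the classpath:

    // Spark >= 2.0 spark-shell: `spark` (a SparkSession) is predefined;
    // its underlying SparkContext is available as spark.sparkContext.
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(spark.sparkContext, Seconds(2))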