Which Spark version supports SparkSession?


Solution 1

You need both the core and SQL artifacts:

<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.0-cloudera1-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.0.0-cloudera1-SNAPSHOT</version>
    </dependency>
</dependencies> 
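
With both artifacts on the classpath, a session can be created like this. A minimal sketch (requires a Spark 2.x runtime; the local master and app name are placeholder values), mainly to show that SparkSession lives in the `org.apache.spark.sql` package, which is why the spark-sql dependency is required:

```scala
import org.apache.spark.sql.SparkSession

// Build (or reuse) a session; SparkSession is provided by spark-sql, not spark-core
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("spark-session-demo")
  .getOrCreate()

// The old SparkContext is still reachable through the session
val sc = spark.sparkContext

spark.stop()
```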

Solution 2

You need Spark 2.0 to use SparkSession. As of now, it is available in the Maven snapshot repository:

groupId = org.apache.spark
artifactId = spark-core_2.11
version = 2.0.0-SNAPSHOT

The same version has to be specified for the other Spark artifacts. Note that 2.0 is still in beta and, AFAIK, is expected to be stable in about a month.
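
The coordinates above in pom.xml form, with both artifacts on the same version (spark-sql is included because SparkSession lives in that module):

```xml
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.0.0-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.11</artifactId>
        <version>2.0.0-SNAPSHOT</version>
    </dependency>
</dependencies>
```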

Update. Alternatively, you can use the Cloudera fork of Spark 2.0:

groupId = org.apache.spark
artifactId = spark-core_2.11
version = 2.0.0-cloudera1-SNAPSHOT

The Cloudera repository has to be added to your Maven repositories list:

<repository>
   <id>cloudera</id>
   <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
Author: RJK
Updated on September 02, 2020

Comments

  • RJK
    RJK over 3 years

    Spark code with SparkSession:

       import org.apache.spark.sql.SparkSession

       // SparkSession is in the sql package, not the core package
       val spark = SparkSession.builder
         .master("local")
         .appName("testing")
         .enableHiveSupport()  // <- enable Hive support
         .getOrCreate()
    

    pom.xml:

    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <groupId>com.cms.spark</groupId>
        <artifactId>cms-spark</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <name>cms-spark</name>
    
        <pluginRepositories>
            <pluginRepository>
                <id>scala-tools.org</id>
                <name>Scala-tools Maven2 Repository</name>
                <url>http://scala-tools.org/repo-releases</url>
            </pluginRepository>
        </pluginRepositories>
    
        <dependencies>
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-core_2.10</artifactId>
                <version>1.6.0</version>
            </dependency>
    
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-sql_2.10</artifactId>
                <version>1.6.0</version>
            </dependency>
    
            <dependency>
                <groupId>com.databricks</groupId>
                <artifactId>spark-csv_2.10</artifactId>
                <version>1.4.0</version>
            </dependency>
    
            <dependency>
                <groupId>org.apache.spark</groupId>
                <artifactId>spark-hive_2.10</artifactId>
                <version>1.5.2</version>
            </dependency>
    
            <dependency>
                <groupId>org.jsoup</groupId>
                <artifactId>jsoup</artifactId>
                <version>1.8.3</version>
            </dependency>
    
        </dependencies>
    
        <build>
            <plugins>
                <plugin>
                    <artifactId>maven-assembly-plugin</artifactId>
                    <version>2.5.3</version>
                    <configuration>
                        <descriptorRefs>
                            <descriptorRef>jar-with-dependencies</descriptorRef>
                        </descriptorRefs>
                    </configuration>
                    <executions>
                        <execution>
                            <id>make-assembly</id> <!-- this is used for inheritance merges -->
                            <phase>install</phase> <!-- bind to the packaging phase -->
                            <goals>
                                <goal>single</goal>
                            </goals>
                        </execution>
                    </executions>
                </plugin>
            </plugins>
    
        </build>
    </project>
    

    I have a problem. I wrote Spark code using SparkSession, but SparkSession cannot be found in the Spark SQL library, so I can't run the code. My question is: which Spark version do I need to find SparkSession in the library? I have included my pom.xml above.

    Thanks.

  • RJK
    RJK almost 8 years
    Hi @Vitaliy Kotlyarenko, I can't find spark-core_2.11 version 2.0.0 in Maven. I added: <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_2.11</artifactId> <version>2.0.0</version> </dependency> and I get an error, because the latest version I can see in Maven is <artifactId>spark-core_2.11</artifactId> <version>1.6.1</version>. Any solution?
  • Vitalii Kotliarenko
    Vitalii Kotliarenko almost 8 years
    As mentioned, you have to specify version 2.0.0-SNAPSHOT, not 2.0.0.
  • RJK
    RJK almost 8 years
    Hi @Vitaliy Kotlyarenko, OK, I successfully downloaded the spark-core_2.11 version 2.0.0-SNAPSHOT jar, but I can't find SparkSession to import. I tried import org.apache.spark.SparkSession but I get an error. Can you help me?
  • Vitalii Kotliarenko
    Vitalii Kotliarenko almost 8 years
    Did you download it manually, or did Maven resolve it from the repository? It's not clear from your comment.
  • RJK
    RJK almost 8 years
    I downloaded it from the Maven repository, but I can't find the SparkSession class to import.
  • RJK
    RJK almost 8 years
    My dependency is <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_2.11</artifactId> <version>2.0.0-SNAPSHOT</version> </dependency>; it resolves successfully, no error.
  • RJK
    RJK almost 8 years
    Hi @Vitaliy Kotlyarenko, is there anything from Hortonworks? I'm using Hadoop on Hortonworks. Thanks.
  • Vitalii Kotliarenko
    Vitalii Kotliarenko almost 8 years
    Did you download the jar file manually from the Maven repo? If so, you have to install it into your local repo.
  • zero323
    zero323 almost 8 years
    It is not org.apache.spark.SparkSession. It is org.apache.spark.sql.SparkSession.
  • RJK
    RJK almost 8 years
    @zero323 I can't find org.apache.spark.sql.SparkSession, so I get an error. Any solution?
  • horatio1701d
    horatio1701d almost 8 years
    Wondering if you found a solution to this. I'm running into the same issue: when I use the latest Spark from the Maven repo, it does not include SparkSession.scala within sql. The only thing I am able to find is: <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_2.10</artifactId> <version>2.0.0-preview</version> </dependency>