Which version of the Spark library supports SparkSession?
Solution 1
You need both the core and SQL artifacts:
<repositories>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.0-cloudera1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.0.0-cloudera1-SNAPSHOT</version>
</dependency>
</dependencies>
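With those two artifacts on the classpath, a minimal usage sketch looks like this (note that SparkSession lives in the org.apache.spark.sql package, not org.apache.spark; the app name is illustrative):

```scala
// SparkSession is in org.apache.spark.sql, not org.apache.spark
import org.apache.spark.sql.SparkSession

object SparkSessionCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")              // local mode, for testing only
      .appName("spark-session-check")
      .getOrCreate()

    // Confirm which Spark version is actually on the classpath;
    // SparkSession only exists from 2.0 onwards.
    println(spark.version)

    spark.stop()
  }
}
```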
Solution 2
You need Spark 2.0 to use SparkSession. As of now, it is available in the Maven central snapshot repository:
groupId = org.apache.spark
artifactId = spark-core_2.11
version = 2.0.0-SNAPSHOT
The same version has to be specified for the other Spark artifacts. Note that 2.0 is still in beta and, AFAIK, is expected to become stable in about a month.
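Spelled out as pom.xml entries, the coordinates above look like this (a sketch that adds spark-sql alongside spark-core, since SparkSession is part of the SQL module; snapshot versions also require a snapshot repository to be enabled in your build):

```xml
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.0.0-SNAPSHOT</version>
</dependency>
```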
Update. Alternatively, you can use the Cloudera fork of Spark 2.0:
groupId = org.apache.spark
artifactId = spark-core_2.11
version = 2.0.0-cloudera1-SNAPSHOT
The Cloudera repository has to be added to your Maven repositories list:
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>
Author by
RJK
Updated on September 02, 2020
Comments
-
RJK (over 3 years): Spark code with SparkSession:
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .master("local")
  .appName("testing")
  .enableHiveSupport() // <- enable Hive support
  .getOrCreate()
pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.cms.spark</groupId>
  <artifactId>cms-spark</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <name>cms-spark</name>

  <pluginRepositories>
    <pluginRepository>
      <id>scala-tools.org</id>
      <name>Scala-tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </pluginRepository>
  </pluginRepositories>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.10</artifactId>
      <version>1.6.0</version>
    </dependency>
    <dependency>
      <groupId>com.databricks</groupId>
      <artifactId>spark-csv_2.10</artifactId>
      <version>1.4.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.10</artifactId>
      <version>1.5.2</version>
    </dependency>
    <dependency>
      <groupId>org.jsoup</groupId>
      <artifactId>jsoup</artifactId>
      <version>1.8.3</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.5.3</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id> <!-- this is used for inheritance merges -->
            <phase>install</phase> <!-- bind to the packaging phase -->
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
I have a problem: I wrote Spark code that uses SparkSession, but SparkSession can't be found in the Spark SQL library, so the code doesn't compile. Which version of the Spark libraries provides SparkSession? My pom.xml is above.
Thanks.
-
RJK (almost 8 years): Hi @Vitaliy Kotlyarenko, I can't find spark-core_2.11 version 2.0.0 in Maven. I added <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_2.11</artifactId> <version>2.0.0</version> </dependency> and got an error, because the latest version Maven shows for spark-core_2.11 is 1.6.1. Any solution?
-
Vitalii Kotliarenko (almost 8 years): As mentioned in the answer, you have to specify version 2.0.0-SNAPSHOT, not 2.0.0.
-
RJK (almost 8 years): Hi @Vitaliy Kotlyarenko, OK, I successfully downloaded the spark-core_2.11 version 2.0.0-SNAPSHOT jar, but I still can't import SparkSession. I tried import org.apache.spark.SparkSession and got an error. Can you help me?
-
Vitalii Kotliarenko (almost 8 years): Did you download it manually, or did Maven resolve it from the repository? It's not clear from your comment.
-
RJK (almost 8 years): I downloaded it from the Maven repository, but I still can't find the SparkSession class to import.
-
RJK (almost 8 years): My dependency is <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_2.11</artifactId> <version>2.0.0-SNAPSHOT</version> </dependency>; it resolves successfully, no error.
-
RJK (almost 8 years): Hi @Vitaliy Kotlyarenko, is there an equivalent artifact from Hortonworks? I'm running Hadoop on Hortonworks. Thanks.
-
Vitalii Kotliarenko (almost 8 years): Did you download the jar file manually from the Maven repo? If so, you have to install it into your local repo.
-
zero323 (almost 8 years): It is not org.apache.spark.SparkSession; it is org.apache.spark.sql.SparkSession.
-
RJK (almost 8 years): @zero323 I can't find org.apache.spark.sql.SparkSession either, so I get an error. Any solution?
-
horatio1701d (almost 8 years): Wondering if you found a solution to this. I'm running into the same issue: when I use the latest Spark from the Maven repo, it does not include SparkSession.scala under sql. The only thing I am able to find is:
<dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_2.10</artifactId> <version>2.0.0-preview</version> </dependency>