How to spark-submit with main class in jar?
Solution 1
Afraid none of these were the issue. I had previously tried deleting everything in the project and starting over, but that didn't work either. Once it occurred to me to start an entirely different project, it worked just fine. Apparently IntelliJ (of which I am a fan) decided to create a hidden problem somewhere.
Solution 2
Why don't you use the path to the jar file, so that spark-submit (like any other command-line tool) can find and use it?

Given the path out/artifacts/TimeSeriesFilter_jar/scala-ts.jar, I'd use the following:

```shell
spark-submit --class com.stronghold.HelloWorld out/artifacts/TimeSeriesFilter_jar/scala-ts.jar
```

Please note that you should be in the project's main directory, which seems to be /home/[USER]/projects/scala_ts.

Please also note that I removed `--master local[*]`, since that's the default master URL spark-submit uses.
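For reference, spark-submit resolves `--class` against the fully qualified name of the object, so the package declaration and the `--class` string must match exactly. A minimal sketch of what `com.stronghold.HelloWorld` might look like — only the name comes from the question; the body here is an assumption:

```scala
package com.stronghold

// Sketch of a main object whose fully qualified name
// (com.stronghold.HelloWorld) matches the --class argument.
// An explicit main method is the most direct entry point
// for spark-submit to invoke.
object HelloWorld {
  def greeting: String = "Hello, world!"

  def main(args: Array[String]): Unit = {
    println(greeting)
  }
}
```

A mismatch between the `package` line and the `--class` string (or a jar built from stale output) is a common cause of the `ClassNotFoundException` described below.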
Marvin Ward Jr
I am a researcher with the JPMorgan Chase Institute. We use de-identified data that is administratively collected by the bank to gain insight into the economic decisions of households, firms, and market actors. All of our work is available for free on our website (including some summary data!). Personally, I lead the local commerce work, which relies on billions of credit and debit transactions to explore consumption activity within 14 metro areas in the US. Before joining JPMC in 2016, I had the great fortune to work with the folks in the DC Office of Revenue Analysis and the Tax Analysis Division of the Congressional Budget Office.
Updated on June 30, 2022

Comments
-
Marvin Ward Jr almost 2 years
There are a ton of questions about `ClassNotFoundException`, but I haven't seen any (yet) that fit this specific case. I am attempting to run the following command:

```shell
spark-submit --master local[*] --class com.stronghold.HelloWorld scala-ts.jar
```
It throws the following exception:
```
\u@\h:\w$ spark_submit --class com.stronghold.HelloWorld scala-ts.jar ⬡ 9.8.0 [±master ●●●]
2018-05-06 19:52:33 WARN Utils:66 - Your hostname, asusTax resolves to a loopback address: 127.0.1.1; using 192.168.1.184 instead (on interface p1p1)
2018-05-06 19:52:33 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
2018-05-06 19:52:33 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
java.lang.ClassNotFoundException: com.stronghold.HelloWorld
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:235)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:836)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
2018-05-06 19:52:34 INFO ShutdownHookManager:54 - Shutdown hook called
2018-05-06 19:52:34 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-e8a77988-d30c-4e96-81fe-bcaf5d565c75
```
However, the jar clearly contains this class:
```
" zip.vim version v28
" Browsing zipfile /home/[USER]/projects/scala_ts/out/artifacts/TimeSeriesFilter_jar/scala-ts.jar
" Select a file with cursor and press ENTER

META-INF/MANIFEST.MF
com/
com/stronghold/
com/stronghold/HelloWorld$.class
com/stronghold/TimeSeriesFilter$.class
com/stronghold/DataSource.class
com/stronghold/TimeSeriesFilter.class
com/stronghold/HelloWorld.class
com/stronghold/scratch.sc
com/stronghold/HelloWorld$delayedInit$body.class
```
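Independent of an editor, a jar's contents can also be double-checked programmatically, since a jar is just a zip archive. A minimal sketch using `java.util.zip` — the `JarCheck` name and the stand-in jar built in `main` are illustrative, not from the question:

```scala
import java.io.FileOutputStream
import java.util.zip.{ZipEntry, ZipFile, ZipOutputStream}

// Sketch: list the entries of a jar to confirm a class file is present.
object JarCheck {
  // Return all entry names in the archive at `path`.
  def entries(path: String): List[String] = {
    val zf = new ZipFile(path)
    try {
      var names = List.empty[String]
      val en = zf.entries()
      while (en.hasMoreElements) names ::= en.nextElement().getName
      names.reverse
    } finally zf.close()
  }

  def main(args: Array[String]): Unit = {
    // Build a tiny stand-in jar containing the entry we expect to find
    // (a placeholder for out/artifacts/TimeSeriesFilter_jar/scala-ts.jar).
    val path = java.io.File.createTempFile("scala-ts", ".jar").getAbsolutePath
    val out = new ZipOutputStream(new FileOutputStream(path))
    out.putNextEntry(new ZipEntry("com/stronghold/HelloWorld.class"))
    out.closeEntry()
    out.close()

    val found = entries(path).contains("com/stronghold/HelloWorld.class")
    println(s"HelloWorld present: $found")
  }
}
```

The same check can of course be done from the shell with `jar tf scala-ts.jar`.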
Typically, the hang up here is on file structure, but I am pretty sure that's correct here:
```
../
scala_ts/
| .git/
| .idea/
| out/
| | artifacts/
| | | TimeSeriesFilter_jar/
| | | | scala-ts.jar
| src/
| | main/
| | | scala/
| | | | com/
| | | | | stronghold/
| | | | | | DataSource.scala
| | | | | | HelloWorld.scala
| | | | | | TimeSeriesFilter.scala
| | | | | | scratch.sc
| | test/
| | | scala/
| | | | com/
| | | | | stronghold/
| | | | | | AppTest.scala
| | | | | | MySpec.scala
| target/
| README.md
| pom.xml
```
I have run other jobs with the same structure at work (so, a different environment). I am now trying to gain some more facility with a home project, but this seems to be an early hang up.
In a nutshell, am I just missing something glaringly obvious?
APPENDIX
For those that are interested, here is my pom:
```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.stronghold</groupId>
  <artifactId>scala-ts</artifactId>
  <version>1.0-SNAPSHOT</version>
  <inceptionYear>2008</inceptionYear>
  <properties>
    <scala.version>2.11.8</scala.version>
  </properties>
  <repositories>
    <repository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </repository>
  </repositories>
  <pluginRepositories>
    <pluginRepository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </pluginRepository>
  </pluginRepositories>
  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.11.8</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.9</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scala-tools.testing</groupId>
      <artifactId>specs_2.10</artifactId>
      <version>1.6.9</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-catalyst_2.11</artifactId>
      <version>2.2.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.7.3</version>
    </dependency>
  </dependencies>
  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <scalaVersion>${scala.version}</scalaVersion>
          <args>
            <arg>-target:jvm-1.5</arg>
          </args>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-eclipse-plugin</artifactId>
        <configuration>
          <downloadSources>true</downloadSources>
          <buildcommands>
            <buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
          </buildcommands>
          <additionalProjectnatures>
            <projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
          </additionalProjectnatures>
          <classpathContainers>
            <classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
            <classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
          </classpathContainers>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <reporting>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <configuration>
          <scalaVersion>${scala.version}</scalaVersion>
        </configuration>
      </plugin>
    </plugins>
  </reporting>
</project>
```
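The comment thread below turns on whether the jar comes from an IntelliJ artifact or from `mvn package`. If the Maven route is used, one way to make the packaged jar directly runnable is to set the `Main-Class` manifest entry via `maven-jar-plugin`. A hedged sketch — this plugin is not in the original pom, and its placement under `<build><plugins>` is an assumption about how the project would be wired up:

```xml
<!-- Sketch: add under <build><plugins> so that `mvn package`
     writes Main-Class into the jar's MANIFEST.MF. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <configuration>
    <archive>
      <manifest>
        <mainClass>com.stronghold.HelloWorld</mainClass>
      </manifest>
    </archive>
  </configuration>
</plugin>
```

Note that spark-submit itself takes the entry point from `--class`, so this manifest entry mainly helps when running the jar with `java -jar` or when the class name is omitted.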
UPDATE
Apologies for the lack of clarity. I ran the command from within the same directory as the .jar (/home/[USER]/projects/scala_ts/out/artifacts/TimeSeriesFilter_jar/). That said, just to be clear, specifying the full path does not change the outcome. It should also be noted that I can run HelloWorld from within IntelliJ, and it uses the same class reference (com.stronghold.HelloWorld).
-
Marvin Ward Jr almost 6 years
Would you mind elaborating on why this is useful? I didn't have to use uber-jars in other contexts.
-
Marvin Ward Jr almost 6 years
I cleaned and packaged the jar again, but I am afraid it didn't make a difference. As for referencing an old jar, I only created one for this project. Just to be safe, I deleted the jar and started from scratch by building a new one. Unfortunately, no dice.
-
Ramesh Maharjan almost 6 years
Did you check the jar file name in the target folder of the project?
-
Marvin Ward Jr almost 6 years
Sorry, I was swamped with work this week. I am just getting back to this. The answer is yes, the jar name is correct.
-
Ramesh Maharjan almost 6 years
What do you mean by correct? Can you share the jar file name with the full path?
-
Marvin Ward Jr almost 6 years
I mean there is no other jar. The only one I have built is here: ../scala_ts/out/artifacts/TimeSeriesFilter_jar/scala-ts.jar. Also, there are no jars in the target folder; they sit in the out folder when built.
-
Ramesh Maharjan almost 6 years
I am talking about ../scala_ts/target/. What's the jar name there? Use that jar; that's what I meant in my answer.
-
Marvin Ward Jr almost 6 years
In general, in a variety of scala applications that have run (in my work environment), there is no information about jar name in the target folder.
-
Ramesh Maharjan almost 6 years
I am looking at your pom file, and it suggests that when you package the jar, it goes to the target folder with the name scala-ts-1.0-SNAPSHOT.jar, with all the updates. That's why I suggest you use that one and try.
-
Marvin Ward Jr almost 6 years
I hear you, but again, that file does not exist. There is no jar called scala-ts-1.0-SNAPSHOT.jar. There is only a jar called scala-ts.jar. You cannot access a jar that does not exist: Error: Unable to access jarfile scala-ts-1.0-SNAPSHOT.jar. The issue isn't accessing the right jar; I think it has something to do with how the jar is built.
-
Ramesh Maharjan almost 6 years
Yeah, I guess so too. Are you building your jar using an artifact or Maven?
-
Marvin Ward Jr almost 6 years
I am indeed. I added an artifact to the project, and built the jar using that artifact configuration.
-
Ramesh Maharjan almost 6 years
Let us continue this discussion in chat.