java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.reloadExistingConfigurations()V


Solution 1

The dependencies between Hadoop and the AWS SDK are very sensitive, and you should stick to the versions that your Hadoop dependency was built with.

The first problem you need to solve is picking a single version of Hadoop: you're currently mixing 2.8.3 and 2.8.0.

When I look at the dependency tree for org.apache.hadoop:hadoop-aws:2.8.0, I see that it is built against version 1.10.6 of the AWS SDK (same for hadoop-aws:2.8.3).

(Screenshot: Maven dependency tree for hadoop-aws)
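
You can reproduce that check yourself with Maven's dependency plugin, run from the module that declares hadoop-aws (the filter on com.amazonaws just trims the output to the AWS SDK artifacts):

    mvn dependency:tree -Dincludes=com.amazonaws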

This mixing of incompatible versions is probably what's causing the problem. So:

  • Choose the version of Hadoop you want to use
  • Include hadoop-aws with the version compatible with that Hadoop version
  • Remove the other Hadoop/AWS dependencies, or only include them with versions matching the one compatible with your Hadoop version (a sketch follows below).
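
For example, here is a minimal sketch of what the relevant dependencies could look like if you settle on Hadoop 2.8.3. The AWS SDK version shown is the 1.10.6 reported by the dependency tree above; double-check it for the Hadoop release you actually pick:

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-aws</artifactId>
        <version>2.8.3</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.8.3</version>
    </dependency>
    <!-- Either omit the AWS SDK entirely and let hadoop-aws pull in the version
         it was built against, or pin it explicitly to that same version -->
    <dependency>
        <groupId>com.amazonaws</groupId>
        <artifactId>aws-java-sdk</artifactId>
        <version>1.10.6</version>
    </dependency>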

Solution 2

In case anybody else is still stumbling on this error... it took me a while to figure out, but check whether your project has a dependency (direct or transitive) on the artifact org.apache.avro:avro-tools. It was brought into my code by a transitive dependency. The problem is that it ships with a copy of org.apache.hadoop.conf.Configuration that is much older than all current versions of Hadoop, so it may end up being the one picked up on the classpath.

In my Scala project (sbt build), I just had to exclude it with

 ExclusionRule("org.apache.avro","avro-tools")

and the error (finally!) disappeared.
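
For a Maven build like the one in the question, the same idea would look roughly like this; the outer dependency is a placeholder for whichever artifact is dragging avro-tools in transitively:

    <dependency>
        <groupId>some.group</groupId>                 <!-- hypothetical: the dependency that pulls in avro-tools -->
        <artifactId>offending-artifact</artifactId>
        <version>1.0</version>
        <exclusions>
            <exclusion>
                <groupId>org.apache.avro</groupId>
                <artifactId>avro-tools</artifactId>
            </exclusion>
        </exclusions>
    </dependency>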

I'm sure the avro-tools developers had a good reason to include a copy of a class that belongs to another artifact (hadoop-common), but I was really surprised to find it there, and it cost me an entire day.


Comments

  • Omkar almost 2 years

    It looks like I am stuck again on running a packaged Spark app jar using spark-submit. Following is my pom file:

    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
        <parent>
            <artifactId>oneview-forecaster</artifactId>
            <groupId>com.dataxu.oneview.forecast</groupId>
            <version>1.0.0-SNAPSHOT</version>
        </parent>
        <modelVersion>4.0.0</modelVersion>
        <artifactId>forecaster</artifactId>
    
    <dependencies>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-scala_${scala.binary.version}</artifactId>
        </dependency>
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
            <!--<scope>provided</scope>-->
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-hive_2.11</artifactId>
            <version>2.2.0</version>
            <!--<scope>provided</scope>-->
        </dependency>
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-aws</artifactId>
            <version>2.8.3</version>
            <!--<scope>provided</scope>-->
        </dependency>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-java-sdk</artifactId>
            <version>1.10.60</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/joda-time/joda-time -->
        <dependency>
            <groupId>joda-time</groupId>
            <artifactId>joda-time</artifactId>
            <version>2.9.9</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>2.8.0</version>
            <!--<scope>provided</scope>-->
        </dependency>
    </dependencies>
    
    <build>
        <sourceDirectory>src/main/scala</sourceDirectory>
        <testSourceDirectory>src/test/scala</testSourceDirectory>
        <plugins>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>${scala-maven-plugin.version}</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>com.dataxu.oneview.forecaster.App</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
                <executions>
                    <execution>
                        <id>make-assembly</id>
                        <phase>package</phase>
                        <goals>
                            <goal>single</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>
    </project>
    

    Following is a simple snippet of code which fetches data from an S3 location and parses it:

    import com.fasterxml.jackson.databind.ObjectMapper
    import com.fasterxml.jackson.module.scala.DefaultScalaModule

    def getS3Data(path: String): Map[String, Any] = {
        println("spark session start.........")
        val spark = getSparkSession()

        // read the file(s) at the S3 path and concatenate all lines into one string
        val configTxt = spark.sparkContext.textFile(path)
            .collect().reduce(_ + _)

        // parse the JSON text into a Scala Map using Jackson's Scala module
        val mapper = new ObjectMapper
        mapper.registerModule(DefaultScalaModule)
        mapper.readValue(configTxt, classOf[Map[String, String]])
    }
    

    When I run it from IntelliJ, everything works fine: the log is clean and looks good. However, when I package it using mvn package and try to run it using spark-submit, it fails at .collect().reduce(_ + _) with the following error:

     "main" java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.reloadExistingConfigurations()V
    at org.apache.hadoop.fs.s3a.S3AFileSystem.addDeprecatedKeys(S3AFileSystem.java:181)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.<clinit>(S3AFileSystem.java:185)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    ...
    

    I don't understand which dependency was not packaged or what the issue might be, since I thought I had set the versions correctly and expected hadoop-aws to pull in all of them.

    Any help will be appreciated.

  • Omkar about 6 years
    I am trying it, will get back asap
  • Omkar about 6 years
    I did remove the hadoop-common dependency and changed the hadoop version to 2.8.0 and aws-java-sdk to 1.10.6. I am getting another error which I am investigating: Exception in thread "main" java.lang.IllegalAccessError: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.<init>(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation
  • ernest_k about 6 years
    Can you update the question with that info as well as with the updated version of your pom file?
  • stevel about 6 years
    It's the hadoop-aws and hadoop-core libs which aren't in sync; they both need to match: 2.8.0, 2.8.3, whatever. Same for the Jackson and Spark versions themselves. They work well, but only if you use the exact same version numbers.