cdh4 hadoop-hbase PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException


Solution 1

Debug procedure: Try running simple Hadoop shell commands.

hadoop fs -ls /

If this shows the HDFS files, then your configuration is correct. If not, the configuration is missing; in that case Hadoop shell commands like -ls refer to the local filesystem instead of HDFS. This can happen when Hadoop is started through Cloudera Manager (CM), which does not necessarily write the configuration into the local conf directory.
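On CDH package installs the active client configuration directory is usually selected through the alternatives system, so a quick way to check which directory is actually in effect (assuming a package-based install; the alternative name hadoop-conf is what CDH normally registers, but verify on your machine):

alternatives --display hadoop-conf         # RHEL/CentOS
update-alternatives --display hadoop-conf  # Debian/Ubuntu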

Check whether the Hadoop filesystem is displayed by the following command (change host and port as needed):

hadoop fs -ls hdfs://host:8020/

If it displays the local filesystem when you pass / as the path, then you should place the configuration files core-site.xml, hdfs-site.xml and mapred-site.xml in the configuration directory. core-site.xml should have the entry for fs.default.name pointing to hdfs://host:port/. In my case the directory is /etc/hadoop/conf.
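For reference, a minimal sketch of the relevant core-site.xml entry (host and port are placeholders for your NameNode; on CDH4 the preferred key is fs.defaultFS, while the older fs.default.name is deprecated but still honored):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://host:8020/</value>
</property>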

See: http://hadoop.apache.org/common/docs/r0.20.2/core-default.html

See if this resolves your issue.

Solution 2

I faced the same problem on 2.0.0-cdh4.1.3 while running MR jobs. It was resolved after adding the following property to mapred-site.xml:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

For running a Hive job:

export HIVE_USER=yarn
Author: Yogesh
Updated on June 04, 2022

Comments

  • Yogesh (almost 2 years ago)

    I have installed the Cloudera CDH4 release and I am trying to run a MapReduce job on it. I am getting the following error:

    2012-07-09 15:41:16 ZooKeeperSaslClient [INFO] Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
    2012-07-09 15:41:16 ClientCnxn [INFO] Socket connection established to Cloudera/192.168.0.102:2181, initiating session
    2012-07-09 15:41:16 RecoverableZooKeeper [WARN] Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
    2012-07-09 15:41:16 RetryCounter [INFO] The 1 times to retry  after sleeping 2000 ms
    2012-07-09 15:41:16 ClientCnxn [INFO] Session establishment complete on server Cloudera/192.168.0.102:2181, sessionid = 0x1386b0b44da000b, negotiated timeout = 60000
    2012-07-09 15:41:18 TableOutputFormat [INFO] Created table instance for exact_custodian
    2012-07-09 15:41:18 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2012-07-09 15:41:18 JobSubmitter [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    2012-07-09 15:41:18 JobSubmitter [INFO] Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs48876562/.staging/job_local_0001
    2012-07-09 15:41:18 UserGroupInformation [ERROR] PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
    Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
        at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
        at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
        at 
    

    I am able to run the sample programs in hadoop-mapreduce-examples-2.0.0-cdh4.0.0.jar. But I get this error when my own job is submitted to the jobtracker. It looks like it is trying to access the local filesystem again (although I have added all the required libraries for job execution to the distributed cache, it still tries to access a local directory). Is this issue related to user privileges?
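    For reference, the path in the exception is resolved against the default filesystem (HDFS in my setup), so a jar referenced by a local path has to exist on HDFS, or be staged there and referenced by its hdfs:// URI when it is added to the distributed cache. A minimal sketch of staging it (the target directory /user/hdfs/lib is just an example):

    hadoop fs -mkdir /user/hdfs/lib
    hadoop fs -put /home/cloudera/yogesh/lib/hbase.jar /user/hdfs/lib/hbase.jar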

    I) Cloudera:~ # hadoop fs -ls hdfs://<MyClusterIP>:8020/ shows -

    Found 8 items
    drwxr-xr-x   - hbase hbase               0 2012-07-04 17:58 hdfs://<MyClusterIP>:8020/hbase
    drwxr-xr-x   - hdfs  supergroup          0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/input
    drwxr-xr-x   - hdfs  supergroup          0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/output
    drwxr-xr-x   - hdfs  supergroup          0 2012-07-06 16:03 hdfs://<MyClusterIP>:8020/tools-lib
    drwxr-xr-x   - hdfs  supergroup          0 2012-06-26 14:02 hdfs://<MyClusterIP>:8020/test
    drwxrwxrwt   - hdfs  supergroup          0 2012-06-12 16:13 hdfs://<MyClusterIP>:8020/tmp
    drwxr-xr-x   - hdfs  supergroup          0 2012-07-06 15:58 hdfs://<MyClusterIP>:8020/user
    

    II) No result for the following:

    hdfs@Cloudera:/etc/hadoop/conf> find . -name '**' | xargs grep "default.name"
    hdfs@Cloudera:/etc/hbase/conf> find . -name '**' | xargs grep "default.name"
    

    Instead, I think with the new APIs we use fs.defaultFS -> hdfs://Cloudera:8020, which I have set properly.

    Although for "fs.default.name" I got entries for hadoop cluster 0.20.2 (non-cloudera cluster)

    cass-hadoop@Pratapgad:~/hadoop/conf> find . -name '**' | xargs grep "default.name"
    ./core-default.xml:  <name>fs.default.name</name>
    ./core-site.xml:  <name>fs.default.name</name>
    

    I think the CDH4 default configuration should add this entry in the respective directory (if it is a bug).

    The command I am using to run my program:

    hdfs@Cloudera:/home/cloudera/yogesh/lib> java -classpath hbase-tools.jar:hbase.jar:slf4j-log4j12-1.6.1.jar:slf4j-api-1.6.1.jar:protobuf-java-2.4.0a.jar:hadoop-common-2.0.0-cdh4.0.0.jar:hadoop-hdfs-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-common-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar:log4j-1.2.16.jar:commons-logging-1.0.4.jar:commons-lang-2.5.jar:commons-lang3-3.1.jar:commons-cli-1.2.jar:commons-configuration-1.6.jar:guava-11.0.2.jar:google-collect-1.0-rc2.jar:google-collect-1.0-rc1.jar:hadoop-auth-2.0.0-cdh4.0.0.jar:hadoop-auth.jar:jackson.jar:avro-1.5.4.jar:hadoop-yarn-common-2.0.0-cdh4.0.0.jar:hadoop-yarn-api-2.0.0-cdh4.0.0.jar:hadoop-yarn-server-common-2.0.0-cdh4.0.0.jar:commons-httpclient-3.0.1.jar:commons-io-1.4.jar:zookeeper-3.3.2.jar:jdom.jar:joda-time-1.5.2.jar com.hbase.xyz.MyClassName
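    For what it's worth, if /etc/hadoop/conf is not on that classpath, the client-side Configuration falls back to its built-in defaults (the local filesystem and the local job runner). A minimal sketch of launching through the hadoop wrapper instead, which puts the conf directory and the Hadoop jars on the classpath (the HADOOP_CLASSPATH value is just an example for the extra application jars):

    export HADOOP_CLASSPATH=/home/cloudera/yogesh/lib/*
    hadoop com.hbase.xyz.MyClassName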

  • Yogesh (almost 12 years ago)
    Ashish, please find the results for your questions added to the parent question.
  • Yogesh (almost 12 years ago)
    This is how I am creating the configuration object: Configuration conf = new Configuration(false); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/core-site.xml")); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/hdfs-site.xml")); conf.addResource(new Path(ConfigReader.HADOOP_ROOT_DIRECTORY + "/conf/core-site.xml")); conf.addResource(new Path(ConfigReader.HADOOP_ROOT_DIRECTORY + "/conf/hdfs-site.xml")); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/hbase-site.xml")); Is this the right approach?
  • Yogesh (almost 12 years ago)
    I think the mapred settings are not set properly (the mapred-site.xml file is missing); that's why the job runs locally by default. Either we need to configure YARN, or set the configuration so that MRv1 jobs run on the jobtracker (a minimal sketch follows below).
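    A minimal mapred-site.xml sketch for the MRv1 case (the jobtracker host and port 8021 are assumptions; adjust for your cluster). Without this entry, mapred.job.tracker defaults to "local" and jobs run in the local job runner:

    <property>
      <name>mapred.job.tracker</name>
      <value>Cloudera:8021</value>
    </property>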