cdh4 hadoop-hbase PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException


Solution 1

Debug procedure: Try running simple Hadoop shell commands.

hadoop fs -ls /

If this shows the HDFS files, then your configuration is correct. If not, the configuration is missing; in that case Hadoop shell commands like -ls refer to the local filesystem instead of HDFS. This can happen when Hadoop is started through Cloudera Manager (CM), which does not necessarily write the configuration into the local conf directory.
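On CDH package installs the active client configuration directory is usually selected through the alternatives system, so a quick way to check which directory is actually in effect (assuming a package-based install; the alternative name hadoop-conf is what CDH normally registers, but verify on your machine):

alternatives --display hadoop-conf         # RHEL/CentOS
update-alternatives --display hadoop-conf  # Debian/Ubuntu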

Check whether the Hadoop filesystem is displayed by the following command (change host and port as needed):

hadoop fs -ls hdfs://host:8020/

If it displays the local filesystem when you pass / as the path, then you should place the configuration files core-site.xml, hdfs-site.xml and mapred-site.xml in the configuration directory. core-site.xml should have the entry for fs.default.name pointing to hdfs://host:port/. In my case the directory is /etc/hadoop/conf.
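For reference, a minimal sketch of the relevant core-site.xml entry (host and port are placeholders for your NameNode; on CDH4 the preferred key is fs.defaultFS, while the older fs.default.name is deprecated but still honored):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://host:8020/</value>
</property>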

See: http://hadoop.apache.org/common/docs/r0.20.2/core-default.html

See if this resolves your issue.

Solution 2

I faced the same problem on 2.0.0-cdh4.1.3 while running MR jobs. It was resolved after adding the following property to mapred-site.xml:

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

For running a Hive job:

export HIVE_USER=yarn
Author: Yogesh
Updated on June 04, 2022

Comments

  • Yogesh (almost 2 years ago)

    I have installed the Cloudera CDH4 release and I am trying to run a MapReduce job on it. I am getting the following error:

    2012-07-09 15:41:16 ZooKeeperSaslClient [INFO] Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
    2012-07-09 15:41:16 ClientCnxn [INFO] Socket connection established to Cloudera/192.168.0.102:2181, initiating session
    2012-07-09 15:41:16 RecoverableZooKeeper [WARN] Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
    2012-07-09 15:41:16 RetryCounter [INFO] The 1 times to retry  after sleeping 2000 ms
    2012-07-09 15:41:16 ClientCnxn [INFO] Session establishment complete on server Cloudera/192.168.0.102:2181, sessionid = 0x1386b0b44da000b, negotiated timeout = 60000
    2012-07-09 15:41:18 TableOutputFormat [INFO] Created table instance for exact_custodian
    2012-07-09 15:41:18 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    2012-07-09 15:41:18 JobSubmitter [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    2012-07-09 15:41:18 JobSubmitter [INFO] Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs48876562/.staging/job_local_0001
    2012-07-09 15:41:18 UserGroupInformation [ERROR] PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
    Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
        at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
        at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
        at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
        at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
        at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
        at 
    

    I am able to run the sample programs in hadoop-mapreduce-examples-2.0.0-cdh4.0.0.jar. But I get this error when my own job is submitted to the jobtracker. It looks like it is trying to access the local filesystem again (although I have added all the required libraries for job execution to the distributed cache, it still tries to access a local directory). Is this issue related to user privileges?
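    For reference, the path in the exception is resolved against the default filesystem (HDFS in my setup), so a jar referenced by a local path has to exist on HDFS, or be staged there and referenced by its hdfs:// URI when it is added to the distributed cache. A minimal sketch of staging it (the target directory /user/hdfs/lib is just an example):

    hadoop fs -mkdir /user/hdfs/lib
    hadoop fs -put /home/cloudera/yogesh/lib/hbase.jar /user/hdfs/lib/hbase.jar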

    I) Cloudera:~ # hadoop fs -ls hdfs://<MyClusterIP>:8020/ shows -

    Found 8 items
    drwxr-xr-x   - hbase hbase               0 2012-07-04 17:58 hdfs://<MyClusterIP>:8020/hbase
    drwxr-xr-x   - hdfs  supergroup          0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/input
    drwxr-xr-x   - hdfs  supergroup          0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/output
    drwxr-xr-x   - hdfs  supergroup          0 2012-07-06 16:03 hdfs://<MyClusterIP>:8020/tools-lib
    drwxr-xr-x   - hdfs  supergroup          0 2012-06-26 14:02 hdfs://<MyClusterIP>:8020/test
    drwxrwxrwt   - hdfs  supergroup          0 2012-06-12 16:13 hdfs://<MyClusterIP>:8020/tmp
    drwxr-xr-x   - hdfs  supergroup          0 2012-07-06 15:58 hdfs://<MyClusterIP>:8020/user
    

    II) No result for the following:

    hdfs@Cloudera:/etc/hadoop/conf> find . -name '**' | xargs grep "default.name"
    hdfs@Cloudera:/etc/hbase/conf> find . -name '**' | xargs grep "default.name"
    

    Instead, I think with the new APIs we use fs.defaultFS -> hdfs://Cloudera:8020, which I have set properly.

    Although for "fs.default.name" I got entries for hadoop cluster 0.20.2 (non-cloudera cluster)

    cass-hadoop@Pratapgad:~/hadoop/conf> find . -name '**' | xargs grep "default.name"
    ./core-default.xml:  <name>fs.default.name</name>
    ./core-site.xml:  <name>fs.default.name</name>
    

    I think the CDH4 default configuration should add this entry in the respective directory (if it is a bug).

    The command I am using to run my program:

    hdfs@Cloudera:/home/cloudera/yogesh/lib> java -classpath hbase-tools.jar:hbase.jar:slf4j-log4j12-1.6.1.jar:slf4j-api-1.6.1.jar:protobuf-java-2.4.0a.jar:hadoop-common-2.0.0-cdh4.0.0.jar:hadoop-hdfs-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-common-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar:log4j-1.2.16.jar:commons-logging-1.0.4.jar:commons-lang-2.5.jar:commons-lang3-3.1.jar:commons-cli-1.2.jar:commons-configuration-1.6.jar:guava-11.0.2.jar:google-collect-1.0-rc2.jar:google-collect-1.0-rc1.jar:hadoop-auth-2.0.0-cdh4.0.0.jar:hadoop-auth.jar:jackson.jar:avro-1.5.4.jar:hadoop-yarn-common-2.0.0-cdh4.0.0.jar:hadoop-yarn-api-2.0.0-cdh4.0.0.jar:hadoop-yarn-server-common-2.0.0-cdh4.0.0.jar:commons-httpclient-3.0.1.jar:commons-io-1.4.jar:zookeeper-3.3.2.jar:jdom.jar:joda-time-1.5.2.jar com.hbase.xyz.MyClassName
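    For what it's worth, if /etc/hadoop/conf is not on that classpath, the client-side Configuration falls back to its built-in defaults (the local filesystem and the local job runner). A minimal sketch of launching through the hadoop wrapper instead, which puts the conf directory and the Hadoop jars on the classpath (the HADOOP_CLASSPATH value is just an example for the extra application jars):

    export HADOOP_CLASSPATH=/home/cloudera/yogesh/lib/*
    hadoop com.hbase.xyz.MyClassName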

  • Yogesh (almost 12 years ago)
    Ashish, please find the results for your questions added to the parent question.
  • Yogesh (almost 12 years ago)
    This is how I am creating the configuration object: Configuration conf = new Configuration(false); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/core-site.xml")); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/hdfs-site.xml")); conf.addResource(new Path(ConfigReader.HADOOP_ROOT_DIRECTORY + "/conf/core-site.xml")); conf.addResource(new Path(ConfigReader.HADOOP_ROOT_DIRECTORY + "/conf/hdfs-site.xml")); conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/hbase-site.xml")); Is this the right approach?
  • Yogesh (almost 12 years ago)
    I think the mapred settings are not set properly (the mapred-site.xml file is missing); that's why the job runs locally by default. Either we need to configure YARN, or set the configuration so that MRv1 jobs run on the jobtracker (a minimal sketch follows below).
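    A minimal mapred-site.xml sketch for the MRv1 case (the jobtracker host and port 8021 are assumptions; adjust for your cluster). Without this entry, mapred.job.tracker defaults to "local" and jobs run in the local job runner:

    <property>
      <name>mapred.job.tracker</name>
      <value>Cloudera:8021</value>
    </property>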