cdh4 hadoop-hbase PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException
Solution 1
Debug procedure: Try running simple Hadoop shell commands.
hadoop fs -ls /
If this shows the HDFS files, then your configuration is correct. If not, the configuration is missing, and Hadoop shell commands like -ls will refer to the local filesystem instead of the Hadoop filesystem.
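The same check can be made from Java. This is only a minimal sketch, assuming the Hadoop 2.x client jars are on the classpath; it prints which filesystem the client configuration resolves to, and file:/// means the cluster configuration was not found:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCheck {
    public static void main(String[] args) throws Exception {
        // Loads core-site.xml (and friends) from the classpath, if present.
        Configuration conf = new Configuration();
        // Prints file:/// when no cluster configuration is found, and
        // something like hdfs://host:8020 when core-site.xml is picked up.
        System.out.println("Default filesystem: " + FileSystem.get(conf).getUri());
    }
}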
This happens if Hadoop is started using CM (Cloudera Manager), which does not explicitly store the configuration in the conf directory.
Check whether the Hadoop filesystem is displayed by the following command (adjust host and port):
hadoop fs -ls hdfs://host:8020/
If it displays the local filesystem when you submit the path as /, then you should place the configuration files hdfs-site.xml and mapred-site.xml in the configuration directory. Also, core-site.xml should have an entry for fs.default.name pointing to hdfs://host:port/. In my case the directory is /etc/hadoop/conf.
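If editing the XML files is not an option, the same effect can be had in client code. A sketch only, assuming the /etc/hadoop/conf path mentioned above and a placeholder host:port:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ExplicitConf {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Load the Cloudera-managed client configuration explicitly.
        conf.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
        conf.addResource(new Path("/etc/hadoop/conf/hdfs-site.xml"));
        // Or set the default filesystem directly (replace host:port with yours):
        // conf.set("fs.defaultFS", "hdfs://host:8020");
        System.out.println(FileSystem.get(conf).getUri());
    }
}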
See: http://hadoop.apache.org/common/docs/r0.20.2/core-default.html
See if this resolves your issue.
Solution 2
I faced the same problem in 2.0.0-cdh4.1.3 while running MR jobs. It was resolved after adding the following property to mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
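The same property can also be set programmatically before submitting the job. A sketch, assuming the job is built through the org.apache.hadoop.mapreduce API:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class YarnSubmit {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Equivalent of the mapred-site.xml property above:
        conf.set("mapreduce.framework.name", "yarn");
        Job job = Job.getInstance(conf, "example");
        // ... configure mapper/reducer/input/output as usual, then submit ...
    }
}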
For running a Hive job:
export HIVE_USER=yarn
Yogesh
Updated on June 04, 2022
Comments
-
Yogesh, almost 2 years ago
I have installed the Cloudera CDH4 release and I am trying to run a MapReduce job on it. I am getting the following error:
2012-07-09 15:41:16 ZooKeeperSaslClient [INFO] Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
2012-07-09 15:41:16 ClientCnxn [INFO] Socket connection established to Cloudera/192.168.0.102:2181, initiating session
2012-07-09 15:41:16 RecoverableZooKeeper [WARN] Possibly transient ZooKeeper exception: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2012-07-09 15:41:16 RetryCounter [INFO] The 1 times to retry after sleeping 2000 ms
2012-07-09 15:41:16 ClientCnxn [INFO] Session establishment complete on server Cloudera/192.168.0.102:2181, sessionid = 0x1386b0b44da000b, negotiated timeout = 60000
2012-07-09 15:41:18 TableOutputFormat [INFO] Created table instance for exact_custodian
2012-07-09 15:41:18 NativeCodeLoader [WARN] Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-07-09 15:41:18 JobSubmitter [WARN] Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2012-07-09 15:41:18 JobSubmitter [INFO] Cleaning up the staging area file:/tmp/hadoop-hdfs/mapred/staging/hdfs48876562/.staging/job_local_0001
2012-07-09 15:41:18 UserGroupInformation [ERROR] PriviledgedActionException as:hdfs (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/cloudera/yogesh/lib/hbase.jar
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:736)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:208)
    at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:71)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:246)
    at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:284)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:355)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1226)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1223)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1223)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1244)
    at
I am able to run the sample programs given in hadoop-mapreduce-examples-2.0.0-cdh4.0.0.jar, but I am getting this error when my job is submitted to the jobtracker. It looks like it is trying to access the local filesystem again (although I have set all the required libraries for job execution in the distributed cache, it is still trying to access the local directory). Is this issue related to user privileges? (One possible workaround is sketched below.)
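One way to rule out the local-path problem is to copy the dependency jars to HDFS first and add the HDFS paths to the job's classpath, so the JobSubmitter's timestamp check runs against the cluster filesystem. This is a hedged sketch, not the asker's actual code; the /tools-lib target directory is borrowed from the listing below, and the exact layout is an assumption:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class SubmitWithDeps {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Copy the local jar up to the cluster (paths are illustrative).
        Path local = new Path("file:///home/cloudera/yogesh/lib/hbase.jar");
        Path onHdfs = new Path("/tools-lib/hbase.jar");
        fs.copyFromLocalFile(local, onHdfs);
        Job job = Job.getInstance(conf, "example");
        // The path is now resolved against fs.defaultFS, not the local disk.
        job.addFileToClassPath(onHdfs);
        // ... configure mapper/reducer/input/output as usual, then submit ...
    }
}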
I)
Cloudera:~ # hadoop fs -ls hdfs://<MyClusterIP>:8020/
shows:
Found 8 items
drwxr-xr-x - hbase hbase 0 2012-07-04 17:58 hdfs://<MyClusterIP>:8020/hbase
drwxr-xr-x - hdfs supergroup 0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/input
drwxr-xr-x - hdfs supergroup 0 2012-07-05 16:21 hdfs://<MyClusterIP>:8020/output
drwxr-xr-x - hdfs supergroup 0 2012-07-06 16:03 hdfs://<MyClusterIP>:8020/tools-lib
drwxr-xr-x - hdfs supergroup 0 2012-06-26 14:02 hdfs://<MyClusterIP>:8020/test
drwxrwxrwt - hdfs supergroup 0 2012-06-12 16:13 hdfs://<MyClusterIP>:8020/tmp
drwxr-xr-x - hdfs supergroup 0 2012-07-06 15:58 hdfs://<MyClusterIP>:8020/user
II) No result for the following:
hdfs@Cloudera:/etc/hadoop/conf> find . -name '**' | xargs grep "default.name"
hdfs@Cloudera:/etc/hbase/conf> find . -name '**' | xargs grep "default.name"
Instead, I think with the new APIs we are using fs.defaultFS --> hdfs://Cloudera:8020, which I have set properly. Although for "fs.default.name" I did get entries on a Hadoop 0.20.2 cluster (a non-Cloudera cluster):
cass-hadoop@Pratapgad:~/hadoop/conf> find . -name '**' | xargs grep "default.name"
./core-default.xml: <name>fs.default.name</name>
./core-site.xml: <name>fs.default.name</name>
I think the CDH4 default configuration should add this entry in the respective directory (if it is a bug).
The command I am using to run my program:
hdfs@Cloudera:/home/cloudera/yogesh/lib> java -classpath hbase-tools.jar:hbase.jar:slf4j-log4j12-1.6.1.jar:slf4j-api-1.6.1.jar:protobuf-java-2.4.0a.jar:hadoop-common-2.0.0-cdh4.0.0.jar:hadoop-hdfs-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-common-2.0.0-cdh4.0.0.jar:hadoop-mapreduce-client-core-2.0.0-cdh4.0.0.jar:log4j-1.2.16.jar:commons-logging-1.0.4.jar:commons-lang-2.5.jar:commons-lang3-3.1.jar:commons-cli-1.2.jar:commons-configuration-1.6.jar:guava-11.0.2.jar:google-collect-1.0-rc2.jar:google-collect-1.0-rc1.jar:hadoop-auth-2.0.0-cdh4.0.0.jar:hadoop-auth.jar:jackson.jar:avro-1.5.4.jar:hadoop-yarn-common-2.0.0-cdh4.0.0.jar:hadoop-yarn-api-2.0.0-cdh4.0.0.jar:hadoop-yarn-server-common-2.0.0-cdh4.0.0.jar:commons-httpclient-3.0.1.jar:commons-io-1.4.jar:zookeeper-3.3.2.jar:jdom.jar:joda-time-1.5.2.jar com.hbase.xyz.MyClassName
-
Yogesh, almost 12 years ago
Ashish, please find the results for your questions added to the parent question.
-
Yogesh, almost 12 years ago
This is how I am creating a Configuration object:
Configuration conf = new Configuration(false);
conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/core-site.xml"));
conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/hdfs-site.xml"));
conf.addResource(new Path(ConfigReader.HADOOP_ROOT_DIRECTORY + "/conf/core-site.xml"));
conf.addResource(new Path(ConfigReader.HADOOP_ROOT_DIRECTORY + "/conf/hdfs-site.xml"));
conf.addResource(new Path(ConfigReader.HBASE_ROOT_DIRECTORY + "/conf/hbase-site.xml"));
Is this the right approach?
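A possibly simpler alternative, assuming /etc/hadoop/conf and /etc/hbase/conf are on the classpath, is to let HBase layer its configuration over Hadoop's instead of adding each file by hand; a sketch:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ConfFromClasspath {
    public static void main(String[] args) {
        // Loads core-site.xml/hdfs-site.xml from the classpath and
        // overlays hbase-site.xml on top.
        Configuration conf = HBaseConfiguration.create();
        System.out.println(conf.get("fs.defaultFS"));
    }
}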
-
Yogesh almost 12 yearsI think mappred settings(mapred-site.xml file is missing ) are not set properly . thats why by default its trying to run job on locally . either we need to configure Yarn or need to set configuration properly so that mrf1 jobs will run on jobtracker