Need assistance with running the WordCount.java provided by Cloudera


Solution 1

I added the input folder to HDFS using the following command:

hadoop dfs -put /usr/lib/hadoop/conf input/

Solution 2

Your input and output paths should be on HDFS; at the very least, the input must be.

Use the following command:

hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount hdfs:/input hdfs:/output

To copy a file from your local Linux filesystem to HDFS, use the following command:

hadoop dfs -copyFromLocal ~/Desktop/input hdfs:/

and check that it arrived using:

hadoop dfs -ls hdfs:/

Hope this will help.
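Putting Solution 2's steps in order, here is a hedged sketch. The jar path and class name come from the question; the `hadoop` invocations are only built as strings and echoed, so the sketch runs without a Hadoop installation:

```shell
# Sketch only: the three Solution 2 commands, in order. They are echoed
# rather than executed so no running cluster is needed to try this out.
jar="$HOME/Desktop/wordcount.jar"
main_class="org.myorg.WordCount"

step1="hadoop dfs -copyFromLocal $HOME/Desktop/input hdfs:/"   # local -> HDFS
step2="hadoop dfs -ls hdfs:/"                                  # verify upload
step3="hadoop jar $jar $main_class hdfs:/input hdfs:/output"   # run the job

printf '%s\n' "$step1" "$step2" "$step3"
```

On a real cluster you would run each command directly instead of echoing it; the key point is that both job arguments are `hdfs:/` URIs, not local paths.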

Solution 3

The error message says that this file does not exist: "hdfs://localhost/home/rushabh/Desktop/input".

Check that the file does exist at the location you've told it to use.

Check that the hostname is correct. You are using "localhost", which almost certainly resolves to a loopback IP address such as 127.0.0.1. That always means "this host" in the context of the machine you are running the code on.

Solution 4

When I tried to run the wordcount MapReduce code, I was getting the following error:

ERROR security.UserGroupInformation: PriviledgedActionException as:hduser cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hduser/wordcount

I was trying to execute the wordcount MapReduce Java code with the input and output paths /user/hduser/wordcount and /user/hduser/wordcount-output. I just prefixed both paths with the 'fs.default.name' value from core-site.xml and it ran perfectly.
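The prefixing step can be sketched as follows. This is a hypothetical illustration: the sample core-site.xml below (including the hdfs://localhost:54310 value) is made up, though `fs.default.name` is the standard Hadoop 1.x property name:

```shell
# Hedged sketch: extract the fs.default.name value from a core-site.xml and
# prefix it onto the relative job paths. The sample XML and its port number
# are fabricated for illustration.
cat > /tmp/core-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
EOF

# Pull out the hdfs:// value (a simple sed scrape, not a real XML parser).
fs_default=$(sed -n 's|.*<value>\(hdfs://[^<]*\)</value>.*|\1|p' /tmp/core-site-sample.xml)

input="$fs_default/user/hduser/wordcount"
output="$fs_default/user/hduser/wordcount-output"
echo "$input"
echo "$output"
```

With the prefix in place, Hadoop resolves both paths against HDFS instead of falling back to the local `file:` scheme seen in the error message.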

Solution 5

The error clearly states that your input path is local. Please point the input path at something on HDFS rather than on the local machine. My guess is that

hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount ~/Desktop/input
~/Desktop/output

needs to be changed to

hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount <hdfs-input-dir>
<hdfs-output-dir>

NOTE: To run a MapReduce job, the input directory must be on HDFS, not the local filesystem.

Hope this helps.
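The note above comes down to the URI scheme of the path. A hedged sketch of that check (the helper name is made up for illustration; real Hadoop resolves scheme-less paths against fs.default.name, which defaults to the local filesystem when unset):

```shell
# Hypothetical helper: classify a job path by its URI scheme, mirroring the
# distinction Hadoop draws. Scheme-less paths fall back to fs.default.name,
# which is the local filesystem if that property is not configured.
is_hdfs_path() {
  case "$1" in
    hdfs:*) return 0 ;;   # explicit HDFS URI
    *)      return 1 ;;   # local or scheme-less path
  esac
}

is_hdfs_path "hdfs:/input" && echo "hdfs"
is_hdfs_path "$HOME/Desktop/input" || echo "local"
```

This is why `~/Desktop/input` in the question produced "Input path does not exist": the shell expanded it to a local absolute path, which the job then looked up on the default filesystem.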

Author: anonymous123

Updated on June 05, 2022

Comments

  • anonymous123, about 2 years ago

    Hey guys, I am trying to run the WordCount.java example provided by Cloudera. I ran the command below and got the exception shown beneath it. Do you have any suggestions on how to proceed? I have gone through all the steps provided by Cloudera.

    Thanks in advance.

    hadoop jar ~/Desktop/wordcount.jar org.myorg.WordCount ~/Desktop/input
    ~/Desktop/output
    

    Error:

    ERROR security.UserGroupInformation: PriviledgedActionException
    as:root (auth:SIMPLE)
    cause:org.apache.hadoop.mapred.InvalidInputException: Input path does
    not exist: hdfs://localhost/home/rushabh/Desktop/input
    Exception in thread "main"
    org.apache.hadoop.mapred.InvalidInputException: Input path does not
    exist: hdfs://localhost/home/rushabh/Desktop/input
            at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:194)
            at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:205)
            at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:977)
            at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:969)
            at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
            at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
            at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:416)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
            at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
            at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:807)
            at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1248)
            at org.myorg.WordCount.main(WordCount.java:55)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:616)
            at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
    
  • Xiaokun, over 8 years ago
    “Hadoop dfs” was deprecated and now it’s done purely with “hdfs dfs”.
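    Following that note, the deprecated `hadoop dfs` invocations in the answers above map one-for-one onto `hdfs dfs`. A sketch of the renaming (commands are echoed, not executed, so no cluster is needed):

    ```shell
    # Sketch: rewrite the deprecated `hadoop dfs` prefix used in the answers
    # above to the current `hdfs dfs` form. Pure string rewriting; nothing
    # here talks to a cluster.
    for old in "hadoop dfs -put" "hadoop dfs -copyFromLocal" "hadoop dfs -ls"; do
      new=$(echo "$old" | sed 's/^hadoop dfs/hdfs dfs/')
      echo "$old  ->  $new"
    done
    ```

    The subcommands and their arguments are unchanged; only the `hadoop dfs` prefix becomes `hdfs dfs`.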