Hadoop Error - All data nodes are aborting


You seem to be hitting the open file handle limit of your user. This is a pretty common issue, and in most cases it can be fixed by increasing the ulimit values (the default is usually 1024, which is easily exhausted by multi-output jobs like yours).
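
A quick way to confirm this is to check the current limits and how many file descriptors the DataNode processes are actually holding. A minimal sketch, assuming the DataNode runs as the hdfs user (adjust the user name to your installation):

    # current soft and hard limits for open files, as the user running the DataNode
    ulimit -Sn
    ulimit -Hn

    # count file descriptors currently held by that user's processes
    lsof -u hdfs | wc -l

    # or inspect a running DataNode process directly
    cat /proc/$(pgrep -f DataNode | head -1)/limits | grep 'open files'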

You can follow this short guide to increase it: http://blog.cloudera.com/blog/2009/03/configuration-parameters-what-can-you-just-ignore/ (see the section "File descriptor limits").
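
On most Linux distributions the raised limit can be made persistent in /etc/security/limits.conf, roughly as sketched below. The user name (hdfs) and the value (65536) are only illustrative; pick whatever matches your installation and the guide above:

    # /etc/security/limits.conf -- raise the open-file limit for the HDFS user
    hdfs    soft    nofile    65536
    hdfs    hard    nofile    65536

The new limits only apply to fresh login sessions, so log in again (or restart the Hadoop daemons) and verify with ulimit -n.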

Answered by Harsh J - https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/kJRUkVxmfhw

Comments

  • Sravan Kumar, almost 2 years ago

    I am using Hadoop 2.3.0. Sometimes when I execute a MapReduce job, the error below is displayed.

    14/08/10 12:14:59 INFO mapreduce.Job: Task Id : attempt_1407694955806_0002_m_000780_0, Status : FAILED
    Error: java.io.IOException: All datanodes 192.168.30.2:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1023)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:838)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:483)
    


    When I try to check the log files for these failed tasks, the log folder for the task is empty.

    I am not able to understand the reason behind this error. Could someone please let me know how to resolve it? Thanks for your help.