There are 0 datanode(s) running and no node(s) are excluded in this operation

ubuntu hadoop amazon-ec2 hdfs hadoop2


Solution 1

Two steps worked for me.

STEP 1: Stop Hadoop and clean the temp files as hduser

sudo rm -R /tmp/*

You may also need to delete and recreate /app/hadoop/tmp (I mostly needed this when changing the Hadoop version from 2.2.0 to 2.7.0):

sudo rm -r /app/hadoop/tmp
sudo mkdir -p /app/hadoop/tmp
sudo chown hduser:hadoop /app/hadoop/tmp
sudo chmod 750 /app/hadoop/tmp

STEP 2: Format the namenode

hdfs namenode -format

Now I can see the DataNode in the jps output:

hduser@prayagupd:~$ jps
19135 NameNode
20497 Jps
19477 DataNode
20447 NodeManager
19902 SecondaryNameNode
20106 ResourceManager

Solution 2

I had the same problem after an improper shutdown of the node. I also checked the web UI, and the datanode was not listed there.

It works now after deleting the files from the datanode folder and restarting the services:

stop-all.sh

rm -rf /usr/local/hadoop_store/hdfs/datanode/*

start-all.sh
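
A likely reason this works (an assumption, since the answer does not say so): each namenode format generates a new clusterID, and a datanode whose storage directory still records the old one typically fails to register with an "Incompatible clusterIDs" error. The sketch below compares the clusterID line in the two current/VERSION files before anything is deleted. The function names cluster_id and same_cluster are made up for illustration. On a multi-node cluster the two directories live on different machines, so run cluster_id on each host and compare the values by hand.

```shell
# Print the clusterID recorded in a Hadoop storage directory's VERSION file.
# $1 is the storage directory (the one that contains "current/").
cluster_id() {
    sed -n 's/^clusterID=//p' "$1/current/VERSION"
}

# Succeed only when both storage directories record the same, non-empty clusterID.
# $1 is the namenode directory, $2 is the datanode directory.
same_cluster() {
    nn_id=$(cluster_id "$1")
    dn_id=$(cluster_id "$2")
    echo "namenode: $nn_id"
    echo "datanode: $dn_id"
    [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}
```

For example, with the paths used on this page: same_cluster /usr/local/hadoop_store/hdfs/namenode /usr/local/hadoop_store/hdfs/datanode || echo "clusterID mismatch"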

Solution 3

@Learner,
I had this problem of datanodes not being shown in the Namenode's web UI. I solved it with these steps on Hadoop 2.4.1.

Do this on all the nodes (master and slaves):

1. Remove all temporary files (by default in /tmp): sudo rm -R /tmp/*
2. Try connecting to every node through ssh with ssh username@host, and install the master's key on each node with ssh-copy-id -i ~/.ssh/id_rsa.pub username@host so that the master can log in to the slaves without a password (not doing so might be the reason connections are refused).
3. Format the namenode with hadoop namenode -format (hdfs namenode -format on Hadoop 2, where the hadoop namenode form is deprecated) and try restarting the daemons.

Solution 4

In my case, the firewalld service was running with its default configuration, which does not allow communication between the nodes. My Hadoop cluster was a test cluster, so I simply stopped the service. If your servers are in production, you should allow the Hadoop ports in firewalld instead of running:

service firewalld stop
chkconfig firewalld off
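
To open ports instead of disabling the firewall, something like the following might work. It is a minimal sketch: open_hadoop_ports is a hypothetical helper that only prints firewall-cmd commands so they can be reviewed before being applied, and the right port list depends on your own configuration files.

```shell
# Print (do not run) the firewall-cmd calls that would open the given TCP ports.
# Each argument is one port number; the final line reloads firewalld so the
# permanent rules take effect.
open_hadoop_ports() {
    for port in "$@"; do
        echo "firewall-cmd --permanent --add-port=${port}/tcp"
    done
    echo "firewall-cmd --reload"
}
```

For example, open_hadoop_ports 9000 50070 8031 | sudo sh would apply it for the namenode RPC port, the web UI port, and the resource tracker port that appear in the question. The datanode and YARN daemons listen on further ports that would need the same treatment.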

Solution 5

I had the same error. My user did not have permission on the HDFS storage directories, so I granted it:

chmod 777 /usr/local/hadoop_store/hdfs/namenode
chmod 777 /usr/local/hadoop_store/hdfs/datanode
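
Mode 777 lets every local user modify the storage directories. A narrower alternative, sketched here under the assumption that the daemons run as hduser in group hadoop (the account used in Solution 1), is to find the directories the current user cannot write to and fix ownership only there. check_hdfs_dirs is a hypothetical helper name.

```shell
# Report, for each directory given, whether it exists and is writable by the
# current user. Returns non-zero if any directory fails the check.
check_hdfs_dirs() {
    rc=0
    for dir in "$@"; do
        if [ -d "$dir" ] && [ -w "$dir" ]; then
            echo "ok: $dir"
        else
            echo "not writable: $dir"
            rc=1
        fi
    done
    return "$rc"
}
```

Run as the Hadoop user, for example: check_hdfs_dirs /usr/local/hadoop_store/hdfs/namenode /usr/local/hadoop_store/hdfs/datanode || sudo chown -R hduser:hadoop /usr/local/hadoop_store/hdfs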


Author by

Learner

Updated on April 14, 2020

Question

I have set up a multi-node Hadoop cluster. The NameNode and Secondary NameNode run on the same machine, and the cluster has only one DataNode. All the nodes are configured on Amazon EC2 machines.

Following are the configuration files on the master node:

masters
54.68.218.192 (public IP of the master node)

slaves
54.68.169.62 (public IP of the slave node)

core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>

mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>

These are the configuration files on the datanode:

core-site.xml

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://54.68.218.192:10001</value>
</property>
</configuration>

mapred-site.xml

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>54.68.218.192:10002</value>
</property>
</configuration>

hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>

Running jps on the Namenode gives the following:

5696 NameNode
6504 Jps
5905 SecondaryNameNode
6040 ResourceManager

and jps on the datanode:

2883 DataNode
3496 Jps
3381 NodeManager

which to me seems right.

Now when I try to run a put command:

hadoop fs -put count_inputfile /test/input/

It gives me the following error:

put: File /count_inputfile._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 0 datanode(s) running and no node(s) are excluded in this operation.

The logs on the datanode say the following:

hadoop-datanode log:
INFO org.apache.hadoop.ipc.Client: Retrying connect to server:      54.68.218.192/54.68.218.192:10001. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

yarn-nodemanager log:

INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

The NameNode web UI (port 50070) shows that there are 0 live nodes and 0 dead nodes, and that DFS used is 100%.

I have also disabled IPv6.

A few websites suggested that I should also edit the /etc/hosts files. I have edited those as well, and they look like this:

127.0.0.1 localhost
172.31.25.151 ip-172-31-25-151.us-west-2.compute.internal
172.31.25.152 ip-172-31-25-152.us-west-2.compute.internal

Why am I still getting the error?
