Hadoop datanode fails to start throwing org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage

Solution 1

  • stop the datanode
  • remove the in_use.lock file from the dfs data dir location
  • start the datanode again

It should work just fine.
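
For reference, here is a minimal shell sketch of these steps, assuming a Hadoop 1.x layout where the dfs data dir is /app/hadoop/tmp/dfs/data (as in the question's log) and hadoop-daemon.sh is on the PATH; adjust the paths to your own setup:

    # stop the datanode so nothing is holding the storage directory
    hadoop-daemon.sh stop datanode

    # remove the stale lock file left behind in the dfs data dir
    rm /app/hadoop/tmp/dfs/data/in_use.lock

    # start the datanode again; it should recreate in_use.lock cleanly
    hadoop-daemon.sh start datanode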

Solution 2

Also add the following two properties to your hdfs-site.xml file:

    <property>
        <name>dfs.name.dir</name>
        <value>/some_path</value>
    </property>

    <property>
        <name>dfs.data.dir</name>
        <value>/some_path</value>
    </property>

Their default location is under /tmp, which is why you lose data on each restart.
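
As a sketch, the directories these properties point to must exist and be writable by the user that runs Hadoop before you restart; the paths /var/hadoop/name and /var/hadoop/data and the user hduser below are only placeholders for your own values:

    # hypothetical persistent locations outside /tmp (substitute your own paths)
    sudo mkdir -p /var/hadoop/name /var/hadoop/data

    # the user that runs the datanode (assumed here to be hduser) must own them
    sudo chown -R hduser:hadoop /var/hadoop/name /var/hadoop/data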

Solution 3

I was facing a similar issue, but I read a post saying that dfs.name.dir and dfs.data.dir should be different from each other. I had set the two to the same directory, and changing them to different values fixed my issue.
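
A quick way to check for this, as a sketch that assumes the configuration lives at $HADOOP_HOME/conf/hdfs-site.xml:

    # print both properties and their values; they should name different directories
    grep -A 1 -E 'dfs\.(name|data)\.dir' $HADOOP_HOME/conf/hdfs-site.xml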

Comments

  • gc5 almost 2 years

    I have some problems trying to start a datanode in Hadoop; from the log I can see that the datanode is started twice (partial log follows):

    2012-05-22 16:25:00,369 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting DataNode
    STARTUP_MSG:   host = master/192.168.0.1
    STARTUP_MSG:   args = []
    STARTUP_MSG:   version = 1.0.1
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
    ************************************************************/
    2012-05-22 16:25:00,375 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
    /************************************************************
    STARTUP_MSG: Starting DataNode
    STARTUP_MSG:   host = master/192.168.0.1
    STARTUP_MSG:   args = []
    STARTUP_MSG:   version = 1.0.1
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
    ************************************************************/
    2012-05-22 16:25:00,490 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
    2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
    2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
    2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
    2012-05-22 16:25:00,512 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
    2012-05-22 16:25:00,523 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
    2012-05-22 16:25:00,523 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
    2012-05-22 16:25:00,524 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
    2012-05-22 16:25:00,722 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
    2012-05-22 16:25:00,724 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
    2012-05-22 16:25:00,727 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
    2012-05-22 16:25:00,729 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
    2012-05-22 16:20:15,894 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /app/hadoop/tmp/dfs/data. The directory is already locked.
    2012-05-22 16:20:16,008 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Cannot lock storage /app/hadoop/tmp/dfs/data. The directory is already locked.
            at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:602)
            at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:455)
            at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:111)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:385)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
            at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)
    

    I've searched online and found this question, but I didn't override anything in conf/hdfs-site.xml (shown below), so Hadoop should be using default values that (as described here) cannot cause any lock failure. This is my conf/hdfs-site.xml:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>2</value>
        <description>Default block replication.
        The actual number of replications can be specified when the file is created.
        The default is used if replication is not specified in create time.
        </description>
      </property>
    </configuration>
    

    This is my conf/core-site.xml:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    
    <!-- Put site-specific property overrides in this file. -->
    
    <configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/app/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
      </property>
    
      <property>
        <name>fs.default.name</name>
        <value>hdfs://master:54310</value>
        <description>The name of the default file system.  A URI whose
        scheme and authority determine the FileSystem implementation.  The
        uri's scheme determines the config property (fs.SCHEME.impl) naming
        the FileSystem implementation class.  The uri's authority is used to
        determine the host, port, etc. for a filesystem.</description>
      </property>
    </configuration>
    

    This is the content of hadoop/conf/slaves:

    master
    slave
    
    • Chris White almost 12 years
      Can you confirm that the user under which Hadoop is running has write permission to the /app/hadoop/tmp/dfs/data folder, and that this folder exists?
    • gc5 almost 12 years
      Yes, the user is the owner of /app/hadoop/tmp/dfs/data, has write permissions, and that folder exists.
    • Chris White almost 12 years
      OK, does a file called in_use.lock already exist in that folder?
    • gc5 almost 12 years
      No, it does not. Retrying to start with start-dfs.sh, it now throws java.io.IOException: Incompatible namespaceIDs in /app/hadoop/tmp/dfs/data, which I am (hopefully) able to resolve using michael-noll.com/tutorials/… I also noticed that when the datanode crashed, it had been started twice within 200 ms. If necessary I'll post the full log.
    • Chris White almost 12 years
      Well, running two instances side by side would result in the lock error message.
    • gc5 almost 12 years
      I've updated the partial log with the part where the datanode is started twice.
    • Chris White almost 12 years
      Can you post your $HADOOP_HOME/conf/slaves file too?