Why does my yarn application not have logs even with logging enabled?


Solution 1

yarn application -list

will list only the applications that are in the SUBMITTED, ACCEPTED, or RUNNING state.

Log aggregation collects each container's logs and moves them to the directory configured by yarn.nodemanager.remote-app-log-dir only after the application has completed. Refer to the description of the yarn.log-aggregation-enable property in the YARN documentation.

So the applicationId listed by the command has not completed yet, and its logs have not been collected. That is why, when trying to access the logs of a running application, the response is

hdfs://<ip address>/var/log/hadoop-yarn/path/to/application/  does not have any log files

You can run the same command, yarn logs -applicationId <application ID>, to view the logs once the application has completed.

To list all the FINISHED applications, use

yarn application -list -appStates FINISHED

Or to list all the applications

yarn application -list -appStates ALL
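
Putting this together, a rough end-to-end sketch (the application ID and output file name are made-up placeholders, not values from the question):

yarn application -list -appStates ALL                               # confirm the application shows up as FINISHED
yarn logs -applicationId application_1234567890123_0001 > app.log   # fetch the aggregated logs
less app.log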

Solution 2

In Hadoop 2.3.2 and higher, you can make log aggregation run periodically on running jobs (hourly, with the value below) by adding this configuration to yarn-site.xml:

<property>
    <name>yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds</name>
    <value>3600</value>
</property>

See this for further details: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_yarn_resource_mgt/content/ref-375ff479-e530-46d8-9f96-8b52dadb5183.1.html
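
Since yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds is a NodeManager setting, the NodeManagers need to be restarted for it to take effect. A sketch, assuming the same service layout used in the question's comments:

sudo service hadoop-yarn-nodemanager restart
# after the first roll interval has elapsed, partial logs of a still-running
# application should be retrievable with:
yarn logs -applicationId <application ID>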

Solution 3

Enable Log Aggregation

Log aggregation is configured in the yarn-site.xml file. Setting the yarn.log-aggregation-enable property to true turns on log aggregation, so that each container's logs are collected once the application finishes.

<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
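
A couple of related properties are often set alongside it in yarn-site.xml. The values below are only an illustrative sketch showing the usual defaults, not recommendations:

<property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <!-- HDFS directory the aggregated logs are written to -->
    <value>/tmp/logs</value>
</property>
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <!-- how long to keep aggregated logs; -1 keeps them indefinitely -->
    <value>-1</value>
</property>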

Solution 4

The logs were probably saved under another appOwner. You can try specifying the application owner in your command:

yarn logs -appOwner <app owner> -applicationId <application ID>
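
If you are not sure which user owns the application, one way to find out (a sketch, assuming the application still appears in the ResourceManager) is to read the owning user from the application report and pass it to -appOwner:

yarn application -status <application ID>    # the report includes a "User" field
yarn logs -appOwner <app owner> -applicationId <application ID>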

Comments

  • makansij
    makansij almost 2 years

    I have enabled logs in the yarn-site.xml file, and I restarted YARN by doing:

    sudo service hadoop-yarn-resourcemanager restart
    sudo service hadoop-yarn-nodemanager restart
    

    I ran my application, and I see the application ID in yarn application -list. So I run yarn logs -applicationId <application ID>, and I get the following:

    hdfs://<ip address>/var/log/hadoop-yarn/path/to/application/  does not have any log files
    

    Do I need to change some other configuration? Or am I accessing the logs the wrong way?

    Thank you.

  • makansij
    makansij about 7 years
    The above is exactly what I have in my yarn-site.xml file. What more can I do?
  • makansij
    makansij about 7 years
    What more can I do? @AniMenon
  • Ani Menon
    Ani Menon about 7 years
    I am not sure what else could be the problem. Just go to the YARN Resource Manager UI and check if your job is there in the list of all jobs.
  • Kanagaraj Dhanapal
    Kanagaraj Dhanapal about 7 years
    hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/…, please refer to this link under "Configurations for NodeManager".
  • makansij
    makansij about 7 years
    huh interesting. that's nice
  • Averell
    Averell over 5 years
    I have this parameter configured already, but still no logs for running jobs.
  • JMess
    JMess almost 5 years
    Can you comment on how to view the logs while the application is still in one of the pre-aggregation phases? Also, if the job has automatic retries, how can we differentiate between runs?
  • franklinsijo
    franklinsijo almost 5 years
    You should be able to see the running container logs in the Application Master UI.
  • JMess
    JMess almost 5 years
    Thank you franklinsijo. In cases where the logs are too large for a browser to display, I go to the container's node and look in $HADOOP_HOME/logs. Also, regarding my earlier question: container IDs will be different between retries.
  • franklinsijo
    franklinsijo almost 5 years
    For each task, you should be able to fetch the logs based on their attempt ids.