View worker / executor logs in Spark UI since 1.0.0+
Solution 1
These answers document how to find the logs from the command line and from the web UI.
Where are logs in Spark on YARN?
For the UI, on an edge node, look in /etc/hadoop/conf/yarn-site.xml for the YARN ResourceManager web URI (yarn.resourcemanager.webapp.address).
Or use command line:
yarn logs -applicationId <application ID> [OPTIONS]
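Putting the two together, here is a minimal sketch. The file contents and application ID below are hypothetical stand-ins; on a real edge node you would point the extraction at /etc/hadoop/conf/yarn-site.xml and use your own application ID.

```shell
# Sample yarn-site.xml standing in for /etc/hadoop/conf/yarn-site.xml;
# the hostname and port here are made up for illustration.
cat > yarn-site-sample.xml <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>rm-host.example.com:8088</value>
  </property>
</configuration>
EOF

# Grab the <value> line that follows the webapp.address property name
# and strip the XML tags to get host:port.
rm_ui=$(grep -A1 'yarn.resourcemanager.webapp.address' yarn-site-sample.xml \
  | sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p')
echo "YARN ResourceManager UI: http://$rm_ui"
# prints: YARN ResourceManager UI: http://rm-host.example.com:8088

# With an application ID (hypothetical here), fetch the aggregated logs:
#   yarn logs -applicationId application_1466538259632_0001
```

Opening the printed URL in a browser shows the ResourceManager's application list, from which each application's logs are reachable.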
Solution 2
Depending on your YARN NodeManager log-aggregation configuration, the Spark job logs are aggregated automatically. The runtime logs can usually be found as follows:
Spark Master Log
If you're running with yarn-cluster, go to the YARN Scheduler web UI. You can find the Spark Master log there; the "Logs" button on the job description page shows the content.
With yarn-client, the driver runs inside your spark-submit process, so what you see on the console is the driver log, provided log4j.properties is configured to write to stderr or stdout.
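For reference, a minimal log4j.properties along the lines of Spark's default template, which keeps the driver log on stderr (the log level and pattern are illustrative; adjust to taste):

```properties
# Route everything to a console appender writing to stderr
log4j.rootCategory=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```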
Spark Executor Log
Search for "executorHostname" in the driver logs. See the comments for more detail.
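As a sketch of that search: the driver log records which host each executor container was launched on, so grepping for it yields the hostnames to chase. The log line format and IDs below are hypothetical stand-ins for what a real driver log contains.

```shell
# Hypothetical driver-log fragment; real driver logs contain similar
# lines announcing where each executor container was launched.
cat > driver-sample.log <<'EOF'
16/06/21 18:30:12 INFO YarnAllocator: Launching container container_1466538259632_0001_01_000002 on host worker-3.example.com
EOF

# Find which hosts are running executors:
grep -o 'on host [^ ]*' driver-sample.log
# prints: on host worker-3.example.com

# With log aggregation enabled, fetch that container's log (cluster
# command, shown for reference; older YARN versions also require
# -nodeAddress):
#   yarn logs -applicationId application_1466538259632_0001 \
#     -containerId container_1466538259632_0001_01_000002
```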
Updated on June 18, 2022
Comments
-
samthebest almost 2 years
In 0.9.0, viewing the worker logs was simple: they were one click away from the Spark UI home page.
Now (1.0.0+) I cannot find them. Furthermore, the Spark UI stops working when my job crashes! This is annoying: what is the point of a debugging tool that only works when your application does not need debugging? According to http://apache-spark-user-list.1001560.n3.nabble.com/Viewing-web-UI-after-fact-td12023.html I need to find out what my master URL is, but I don't know how to; Spark doesn't spit out this information at startup. All it says is:
... -Dspark.master=\"yarn-client\" ...
and obviously
http://yarn-client:8080
doesn't work. Some sites talk about how, now in YARN, finding logs has been super obfuscated: rather than just being on the UI, you have to log in to the boxes to find them. Surely this is a massive regression and there has to be a simpler way?? How am I supposed to find out what the master URL is? How can I find my worker (now called executor) logs?
-
samthebest over 9 years: Please could you expand on "Search for "executorHostname" in driver logs"? Suppose I find the hostnames for my executors, which I do know; how do I then view the logs???
-
suztomo over 9 years: Check the location: yarn.nodemanager.log-dirs determines where the container logs are stored on the node while the containers are running. The default is ${yarn.log.dir}/userlogs. hortonworks.com/blog/…
-
samthebest over 9 years: Yes, I'm aware that I can ssh into each box, find the actual files, and read them. I want to know how to read the logs in a web UI, just like I could in 0.9.0. It seems like a major regression to make me ssh into boxes to find logs.
-
suztomo over 9 years: If yarn.nodemanager.log-dirs is under yarn.log.dir, then you can read the logs via the NodeManager's web UI in the same way as you read the NodeManager's own log.
-
samthebest over 9 years: How do I find the NodeManager's web UI URL? I guess I just have to ask my DevOps team what they have configured it to, right? Or is there a self-service way to find out, given one can ssh into the box?
-
suztomo over 9 years: Yes > ask my DevOps team