How can I debug Hadoop MapReduce?
Since you are processing big data, the volume of your tracing messages can be huge, and that can become a problem in itself. It's useful to consider alternatives to "System.out.println"-style logging:
- use Counters
- write logs to HDFS using MultipleOutputs
The best thing about Counters and MultipleOutputs is that you can access them programmatically. In the case of MultipleOutputs, you can even run a map/reduce job over the logs to extract statistics from them.
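As a rough illustration of the Counters approach, here is a minimal sketch of a mapper that classifies each input line and bumps a counter instead of printing. The class name `DebugCountingMapper`, the `DebugCounter` enum, and the assumption that a good record is a tab-separated key/value pair are all hypothetical and only there for the example; adapt them to your own record format.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class DebugCountingMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Hypothetical counter names; Hadoop groups them under the enum's class name.
    public enum DebugCounter { EMPTY_LINES, MALFORMED_RECORDS, GOOD_RECORDS }

    private static final IntWritable ONE = new IntWritable(1);
    private final Text outKey = new Text();

    // Pure helper so the classification logic can be checked without a cluster.
    // Assumption: a well-formed record is exactly "key<TAB>value".
    public static DebugCounter classify(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return DebugCounter.EMPTY_LINES;
        }
        if (trimmed.split("\t").length != 2) {
            return DebugCounter.MALFORMED_RECORDS;
        }
        return DebugCounter.GOOD_RECORDS;
    }

    @Override
    protected void map(LongWritable offset, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        DebugCounter kind = classify(line);

        // Instead of printing, record what happened in a job-wide counter.
        context.getCounter(kind).increment(1);

        if (kind == DebugCounter.GOOD_RECORDS) {
            outKey.set(line.trim().split("\t")[0]);
            context.write(outKey, ONE);
        }
    }
}
```

Once the job has finished, the driver can read a total with something like `job.getCounters().findCounter(DebugCountingMapper.DebugCounter.MALFORMED_RECORDS).getValue()`. The counters are also printed in the job summary on the client console, so this kind of "debug output" stays small no matter how large the input is.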
Another alternative to debugging in a production environment is unit testing: MiniMRCluster will help you run your map-reduce jobs inside unit tests.
Gabriel H
Updated on September 20, 2020
Comments
Gabriel H over 3 years
I'm trying to build a MapReduce job.
It runs to completion but presents weird data at the end.
When I try to debug it using System.out.println("debug data"), nothing shows on screen.
Using the Java logging API to write an external log file, or trying to print to the screen with log.severe("log data") or the log4j logger method log.info("log data"), doesn't work either.
Nothing works; the only time I see my debug messages is when there is an exception in the MapReduce job.
How can this be fixed so that I can see my debug messages, either in a file or on the screen?