Best practices for using Markers in SLF4J/Logback

java logging slf4j logback markers

68,521

Solution 1

First, as @darioo said:

MDC is used for associating multiple events with few "entities"
[Markers] are used for "special" events that you want to have filtered from usual ones

So your assertion that You want to use MDC for this. Markers are for highlighting "special" events--filtering, if you will--rather than "slicing". For example, you might slice based on a particular user, but filter based on any unexpected exceptions. In this case, you would create a User MDC dimension and an UnexpectedException Marker.

But this apparently doesn't address the question you had in mind. You are "rather referring to the more general level of how would one set up logging around using markers consistently." So let's address that:

MDC is for slicing and dicing, and Markers are for filtering. These activities are carried out during testing and in production. As such, you need to decide which dimensions you expect may be useful to slice the log data by, and which cases it might be useful to filter it against, when testing/production comes around. Each dimension gets an MDC dimension. Each case gets a Marker. It's as simple as that.

The developers don't need to make any decisions here. A single person or team should decide, at design time, what sort of slicing, dicing and filtering needs to be supported. This should be informed by imagining what sort of analysis tasks one expects they may be asked to perform.

This same person or team should decide on the naming convention. It's entirely arbitrary. Choose something that's aesthetically pleasing, self-descriptive (most important), and specific enough to be unlikely to conflict with later additions. Hyphens vs. underscores is exceedingly nitpicky and alarmingly beside the point, but note it may be less confusing for ESL employees to read underscores (at least compared to CamelCase); at the same time, this reportedly annoys some developers due to the awkwardness of reaching the requisite keys.

As far as deciding on a policy, this just means defining in which cases a given Marker or MDC dimension needs to be employed. Keep this tight (centralized, deliberate), but allow for feedback from developers if they feel the set of dimensions and markers are insufficient for the task at hand. Revise/add dimensions and/or attributes as appropriate.

Understand this policy will almost necessarily be project-specific. Not every project needs the same sort of logging analysis. Picture some nightmare scenarios. Then imagine how you would like to be able to analyze the logs in that scenario. You probably don't want to have to write a complicated script to try and track which message belongs to which context, and which state is which at which time, right? Encode whatever such information is necessary as dimensions and Markers, and save yourself some of the hassle if something does go wrong.

Solution 2

First, MDC.

MDC is really useful in an environment where you have one "entity" that is associated with some behaviour. A typical example: user interacting with a web application. So, let's say you have many users messing around with your web app. Using MDC, you can easily track them without too much hassle. Simplified example:

...[Sandy][abcd] clicked on "change profile"
...[Joe][1234] clicked on "weather reports"
...[Joe][1234] clicked on "Europe"
...[Sandy][abcd] clicked on "logout"
...[Joe][1234] clicked on "logout"
...[Sandy][efgh] logged in

Here, you're using MDC in two places: for username and for session ID. This way, you can easily grep one user's session to see everything they've been doing.

Second, markers.

Markers are usually used for "special" circumstances, such as sending an email to an administrator for some seriously critical errors. Not all errors always fall in the same category; some have to be dealt in an appropriate way.

Or, when a user quits from your service, it usually goes to an INFO log, but you can also use a marker for such instances, if you want events such as this one to go in a separate log file, so you can monitor it more easily for statistical gathering of users quitting.

Rule of thumb:

MDC is used for associating multiple events with few "entities"
markers are used for "special" events that you want to have filtered from usual ones

Solution 3

Markers can be used to color or mark a single log statement. What you do with these colors, i.e. markers, is entirely up to you. However, two patterns seem to be common (the first more common than the second) for marker usage.

Triggering: Some appender could be instructed to take an action in the presence of a certain marker. For example, SMTPAppender can be configured to send an email whenever a logging event is marked with the NOTIFY_ADMIN marker regardless of the log level. See marker-based triggering in the logback documentation. You may also combine log levels and markers for triggering.
Filtering: You could for example color/mark all your persistence related logs (in various and multiple class files) with the color "DB". You could then filter for "DB": disable logging except for log statements marked with DB. See the chapter on filters in the logback documentation for more information (search for MarkerFilter).

Solution 4

Just as an addendum, if you are using logstash and have json logging enabled, there's another potential use of Marker - for logging variables to associate with a specific log message. This is more consistent and easier to parse than including it in the message body. Very useful, if it suits your use-case.

See details here:

https://github.com/logstash/logstash-logback-encoder#loggingevent_custom_event

Solution 5

The advantage of MDCs is that they are informed in the thread: let's take the case of a method for which a log is sent at the end. Throughout the method and the call of the sub-methods, you can fill in the MDC with information collected during the program. When the log is launched at the end, it contains the MDC and all the information that we could have put there With the appropriate pattern, we can retrieve the information from the MDC.

On the other hand, the marker is directly associated with the log.

View more solutions

68,521

Author by

Roland Tepp

I'm a developer, (surprise - eh!) For the last 5 years or so I've been professionally involved in writing Java in and with Eclipse (using big E both - as an IDE and as a platform) From the past, I still harbour some softness for Python language and web (client side) technologies like HTML, CSS and Javascript. Occasionally dusting off my knowledge of these and keeping them from completely rusting away... In my more distant youth I've also dabbled in wide range of languages and technologies, starting from BASIC, followed up bu Pascal, bits and pieces of C/C++ (enough to shoot myself in a foot), just a tad bit of assembler (just enough to know that is not the kind of programming I would love to do for any extended period of time), various of dialects of SQL, perl and bunch of stuff that don't even measuer up to being mentioned...

Updated on August 28, 2020

Comments

Roland Tepp over 3 years

We are using SLF4J+Logback combination at our project for a while now and are quite happy with it, but our logging strategy is fairly simple, using straightforward class based loggers and no fancy stuff like MDC or Markers.

What I want to know is if anybody in the community actually uses these features and how they are used to improve logging/filtering.

I am specially interested in where, why and how would one use^[1] Markers for logging. They strike me as a pretty neat feature for adding semantic context into the logging - e.g. while a class may be handling multiple concerns, one may use task/concern specific markers to discriminate log statements.

What may be the best practices, conventions or strategies for creating and using markers in logging.

Update: I guess, what I am really after is not so much why to use markers, but rather the how part — is there some good practices of naming markers (e.g. using plain text with spaces or dash/underscore/punctuation delimited keyword style names), should there be some sort of pool of "standard names", naming stuff based on the business functions. The questions I can probably figure out for myself, but if I want to use these features systematically and introduce them to a team of developers, it makes sense to have some formalizeable set of guidelines around...

^[1] - By asking how to use markers I am not really asking how to use API (it is really quite straight forward) - I am rather referring to the more general level of how would one set up logging around using markers consistently
Roland Tepp over 13 years

This is a good answer, but it only covers one possible use case for using markers. The way I see this, logging framework features (like MDC and Markers) exist to provide more metadata for later slicing and dicing of the logs (like the admin email or separate logging cases you mentioned). What I guess, I was after was how exactly to create markers (should there be a "standard pool" of markers, are there some naming conventions to keep in mind, etc.)
darioo over 13 years

@Roland: I've added some examples, but I cannot add all examples since they are, by definition, limitless. If you understand the motivation and reason for markers, then using them is limited only by your imagination, and common sense.
Ceki over 11 years

Great answer. I'd argue that MDC which is a thread based data structure can also be used for filtering.
DerMike about 9 years

Great answer. But what is an ESL Employee ?
user359996 about 9 years

Thank you. English as a second language.
LarryW about 2 years

I'd add that the appropriate pattern is usually "%X" with no braces afterward, so you get the entire MDC and don't need to update the pattern layout every time some piece of code, perhaps in a library, adds or removes something to/from the MDC.