Translate

Sunday, February 10, 2013

Effective Logging in Java/JEE


What is MDC?
MDC stands for Mapped Diagnostic Context. It helps you to distinguish inter-leaving logs from multiple sources. Let me explain in detail. When we have multiple user-requests coming in for a given servlet, each request of an user is serviced using a thread. This leaves multiple users logging to the same log file and the log statements get inter-mixed. Now, to filter out logs of a particular user, we need to append the user-id to the log statements so that we can grep(search) them in the log file, to make some sense of it.
An obvious way of logging, is to append the user-id in the log statements i.e. log.info(userId+” logged something “);
A non-invasive way of logging is to use MDC. With MDC, you put the user-id in a context-map which is attached to the thread (of each user request) by the logger.
MDC is thread-safe and uses a Map internally to store the context information.[Courtesy : Kalyan Dabburi]
How to use MDC?
a. Configure the information, which needs to be logged (user-id in this case) in the log4j.xml as part of ConversionPattern.
log4j.appender.consoleAppender.layout.ConversionPattern
= %d %i - %m - %X{user-id}%n
b. In your respective class, before you start processing the user request, place the actual user-id in the context(MDC).
MDC.put("user-id","SKRS786");
c. Remove the context information from MDC at the end of the processing.
MDC.remove("user-id");
 
References :
 
What is NDC ? Which one to use MDC or NDC?
NDC stands for Nested Diagnostic Context. It is a stack-based implementation of attaching context information. For all purposes, use MDC over NDC, as MDC is memory efficient. For a detailed comparison, click here.

NDC vs MDC - Which one should I use?


The NDC and MDC log4j classes are used to store program/application contextual information that can then be used when logging messages. The NDC class name is org.apache.log4j.NDC. "NDC" stands for "Nested Diagnostic Context". The MDC class name is org.apache.log4j.MDC. "MDC" stands for "Mapped Diagnostic Context". NDC has been part of the log4j framework longer than MDC. If you haven't already, you may want to review the javadoc information for each class.

NDC


The "Nested Diagonostic Context" implements a "stack" onto which context information can be pushed and popped (ie "nested"). The context is stored per thread, so different threads can have different context information. When a program entered section "A" of its code, it could use NDC.push() to put the string "A" into the context. When it exited section "A", it would then NDC.pop() to remove "A" from the context. As you can see, you can continue to push/pop contexts. It is up to the application to make sure that the proper NDC.pop() call is made for each NDC.push().
When a message is logged, the current contents of the NDC are attached to it, and can be displayed in the log messages by using the '%x' option in PatternLayout. In this way, information specific to the context of a particular thread can be displayed in the log.
The beauty of this is that the logger sending the message does not have any clue about the context or contents of the NDC, and it doesn't need to. But appenders and filters can use the NDC information in the log message to affect the routing and display of log message. Besides the '%x' option in  PatternLayout , a new log4j filter for v1.3 (see org.apache.log4j.filters.NDCMatchFilter  in the current cvs) will accept or deny a log message based on the contents of the NDC information.

MDC


The "Mapped Diagnostic Context" implements a "map" into which key/value pair information can be stored. Just like NDC, the context is stored per thread. Values are stored by key name. Each thread could use the same key name but have different stored values. Values are stored/retreived/removed by using the familiar pattern of MDC.put(), MDC.get(), and MDC.remove() methods.
When a message is logged, the current contents of the MDC are attached to it, and can be displayed in the log messages by using the '%X' option in PatternLayout. More than one MDC value can be displayed in a single log message.
Just as with NDC, appenders and filters can use the MDC information attached to a log message for display and routing. Log4j v1.3 will contain a filter based on the contents of the MDC (see org.apache.log4j.filters.MDCMatchFilter  in the current cvs).

Which one to use?


Now that you have some idea of how the NDC and MDC store context information, it should be straight forward to choose which one to use. If nested/stack like information is important when logging information, use NDC. If key/value pair information is more appropriate, use MDC.

Known Gotchas


  • MDC requires JDK 1.2 or later. It is not compatible with JDK 1.1, unlike NDC which is.
  • NDC use can lead to memory leaks if you do not periodically call the NDC.remove() method. The current NDC implementation maintains a static hard link to the thread for which it is storing context. So, when the thread is released by its creator, the NDC maintains the link and the thread (and its related memory) is not released and garbage collected like one might expect. NDC.remove() fixes this by periodically checking the threads referenced by NDC and releasing the references of "dead" threads. But, you have to write your code to call NDC.remove().
So, give both NDC and MDC a try. Write some test code to set various values and log messages to see how the output changes. NDC and MDC are powerful tools for logging that no log4j user should be ignorant of.
Which logging framework to use? Log4J or SLF4J or logback?
For all new application development, use logbacklogback is a run-time implementation of SLF4J. If you have an existing application with Log4J, it is still worth-while to switch to logback. For a detailed explanation, click here.
Reasons to prefer logback over log4j
Logback brings a very large number of improvements over log4j, big and small. They are too many to enumerate exhaustively. Nevertheless, here is a non-exhaustive list of reasons for switching to logback from log4j. Keep in mind that logback is conceptually very similar to log4j as both projects were founded by the same developer. If you are already familiar with log4j, you will quickly feel at home using logback. If you like log4j, you will probably love logback.

Faster implementation

Based on our previous work on log4j, logback internals have been re-written to perform about ten times faster on certain critical execution paths. Not only are logback components faster, they have a smaller memory footprint as well.

Extensive battery of tests

Logback comes with a very extensive battery of tests developed over the course of several years and untold hours of work. While log4j is also tested, logback takes testing to a completely different level. In our opinion, this is the single most important reason to prefer logback over log4j. You want your logging framework to be rock solid and dependable even under adverse conditions.

logback-classic speaks SLF4J natively

Since the Logger class in logback-classic implements the SLF4J API natively, you incur zero overhead when invoking an SLF4J logger with logback-classic as the underlying implementation. Moreover, since logback-classic strongly encourages the use of SLF4J as its client API, if you need to switch to log4j or to j.u.l., you can do so by replacing one jar file with another. You will not need to touch your code logging via the SLF4J API. This can drastically reduce the work involved in switching logging frameworks.

Extensive documentation

Logback ships with detailed and constantly updated documentation.

Configuration files in XML or Groovy

The traditional way of configuring logback is via an XML file. Most of the examples in the documentation use this XML syntax. However, as of logback version 0.9.22, configuration files written in Groovy are also supported. Compared to XML, Groovy-style configuration is more intuitive, consistent and has a shorter syntax.

Automatic reloading of configuration files

Logback-classic can automatically reload its configuration file upon modification. The scanning process is fast, contention-free, and dynamically scales to millions of invocations per second spread over hundreds of threads. It also plays well within application servers and more generally within the JEE environment as it does not involve the creation of a separate thread for scanning.

Graceful recovery from I/O failures

Logback's FileAppender and all its sub-classes, including RollingFileAppender, can gracefully recover from I/O failures. Thus, if a file server fails temporarily, you no longer need to restart your application just to get logging working again. As soon as the file server comes back up, the relevant logback appender will transparently and quickly recover from the previous error condition.

Automatic removal of old log archives

By setting the maxHistory property of TimeBasedRollingPolicy or SizeAndTimeBasedFNATP, you can control the maximum number of archived files. If your rolling policy calls for monthly rollover and you wish to keep one year's worth of logs, simply set the maxHistory property to 12. Archived log files older than 12 months will be automatically removed.

Automatic compression of archived log files

RollingFileAppender can automatically compress archived log files during rollover. Compression always occurs asynchronously so that even for large log files, your application is not blocked for the duration of the compression.

Prudent mode

In prudent mode, multiple FileAppender instances running on multiple JVMs can safely write to the same log file. With certain limitations, prudent mode extends to RollingFileAppender.

Lilith

Lilith is a logging and access event viewer for logback. It is comparable to log4j's chainsaw, except that Lilith is designed to handle large amounts of logging data without flinching.

Conditional processing of configuration files

Developers often need to juggle between several logback configuration files targeting different environments such as development, testing and production. These configuration files have substantial parts in common, differing only in a few places. To avoid duplication, logback supports conditional processing of configuration files with the help of <if><then> and <else> elements so that a single configuration file can adequately target several environments.

Filters

Logback comes with a wide array of filtering capabilities going much further than what log4j has to offer. For example, let's assume that you have a business-critical application deployed on a production server. Given the large volume of transactions processed, logging level is set to WARN so that only warnings and errors are logged. Now imagine that you are confronted with a bug that can be reproduced on the production system but remains elusive on the test platform due to unspecified differences between those two environments (production/testing).
With log4j, your only choice is to lower the logging level to DEBUG on the production system in an attempt to identify the problem. Unfortunately, this will generate large volume of logging data, making analysis difficult. More importantly, extensive logging can impact the performance of your application on the production system.
With logback, you have the option of keeping logging at the WARN level for all users except for the one user, say Alice, who is responsible for identifying the problem. When Alice is logged on, she will be logging at level DEBUG while other users can continue to log at the WARN level. This feat can be accomplished by adding 4 lines of XML to your configuration file. Search for MDCFilter in the relevant section of the manual.

SiftingAppender

SiftingAppender is an amazingly versatile appender. It can be used to separate (or sift) logging according to any given runtime attribute. For example, SiftingAppender can separate logging events according to user sessions, so that the logs generated by each user go into distinct log files, one log file per user.

Stack traces with packaging data

When logback prints an exception, the stack trace will include packaging data. Here is a sample stack trace generated by the logback-demo web-application.
14:28:48.835 [btpool0-7] INFO  c.q.l.demo.prime.PrimeAction - 99 is not a valid value
java.lang.Exception: 99 is invalid
  at ch.qos.logback.demo.prime.PrimeAction.execute(PrimeAction.java:28) [classes/:na]
  at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:431) [struts-1.2.9.jar:1.2.9]
  at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:236) [struts-1.2.9.jar:1.2.9]
  at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:432) [struts-1.2.9.jar:1.2.9]
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) [servlet-api-2.5-6.1.12.jar:6.1.12]
  at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502) [jetty-6.1.12.jar:6.1.12]
  at ch.qos.logback.demo.UserServletFilter.doFilter(UserServletFilter.java:44) [classes/:na]
  at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115) [jetty-6.1.12.jar:6.1.12]
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361) [jetty-6.1.12.jar:6.1.12]
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417) [jetty-6.1.12.jar:6.1.12]
  at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) [jetty-6.1.12.jar:6.1.12]
From the above, you can recognize that the application is using Struts version 1.2.9 and was deployed under jetty version 6.1.12. Thus, stack traces will quickly inform the reader about the classes intervening in the exception but also the package and package versions they belong to. When your customers send you a stack trace, as a developer you will no longer need to ask them to send you information about the versions of packages they are using. The information will be part of the stack trace. See "%xThrowable" conversion word for details.
This feature can be quite helpful to the point that some users mistakenly consider it a feature of their IDE.

Logback-access, i.e. HTTP-access logging with brains, is an integral part of logback

Last but not least, the logback-access module, part of the logback distribution, integrates with Servlet containers such as Jetty or Tomcat to provide rich and powerful HTTP-access log functionality. Since logback-access was part of the initial design, all the logback-classic features you love are available in logback-access as well.

In summary

We have listed a number of reasons for preferring logback over log4j. Given that logback builds upon on our previous work on log4j, simply put, logback is just a better log4j.
To understand the evolution of logging in Java and JEE world, refer to this article by Micheal Andrews.