
Sunday, February 10, 2013

Effective Logging in Java/JEE


What is MDC?
MDC stands for Mapped Diagnostic Context. It helps you distinguish interleaved logs from multiple sources. Let me explain in detail. When multiple user requests come in for a given servlet, each request is serviced by its own thread. This leaves multiple users logging to the same log file, and the log statements get intermixed. To filter out the logs of a particular user, we need to append the user-id to the log statements so that we can grep (search) for them in the log file and make sense of them.
An obvious way of logging is to append the user-id to each log statement, i.e. log.info(userId + " logged something");
A non-invasive way of logging is to use MDC. With MDC, you put the user-id in a context map which the logger attaches to the thread (of each user request).
MDC is thread-safe and uses a Map internally to store the context information. (Courtesy: Kalyan Dabburi)
How to use MDC?
a. Configure the information to be logged (the user-id in this case) in your log4j configuration as part of the ConversionPattern (shown here in properties syntax):
log4j.appender.consoleAppender.layout.ConversionPattern = %d %t - %m - %X{user-id}%n
b. In your respective class, before you start processing the user request, place the actual user-id in the context (MDC):
MDC.put("user-id", "SKRS786");
c. Remove the context information from MDC at the end of the processing:
MDC.remove("user-id");
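Putting the three steps together, here is a minimal sketch of a servlet filter that scopes the MDC value to a single request (the UserIdLoggingFilter class name and the resolveUserId() helper are hypothetical placeholders for your own authentication lookup):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import org.apache.log4j.MDC;

public class UserIdLoggingFilter implements Filter {

    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        // step (b): the key must match %X{user-id} in the ConversionPattern
        MDC.put("user-id", resolveUserId(req));
        try {
            chain.doFilter(req, res); // every log statement on this thread sees the value
        } finally {
            MDC.remove("user-id");    // step (c): always clean up, even on exceptions
        }
    }

    // hypothetical helper: look up the authenticated user for this request
    private String resolveUserId(ServletRequest req) {
        return "SKRS786";
    }

    public void init(FilterConfig config) { }
    public void destroy() { }
}

The try/finally is important: without it, an exception during request processing would leave a stale user-id attached to the pooled thread.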
 
What is NDC? Which one to use, MDC or NDC?
NDC stands for Nested Diagnostic Context. It is a stack-based implementation of attaching context information. For most purposes, use MDC over NDC, as MDC is more memory efficient. A detailed comparison is reproduced below.

NDC vs MDC - Which one should I use?


The NDC and MDC log4j classes are used to store program/application contextual information that can then be used when logging messages. The NDC class name is org.apache.log4j.NDC. "NDC" stands for "Nested Diagnostic Context". The MDC class name is org.apache.log4j.MDC. "MDC" stands for "Mapped Diagnostic Context". NDC has been part of the log4j framework longer than MDC. If you haven't already, you may want to review the javadoc information for each class.

NDC


The "Nested Diagonostic Context" implements a "stack" onto which context information can be pushed and popped (ie "nested"). The context is stored per thread, so different threads can have different context information. When a program entered section "A" of its code, it could use NDC.push() to put the string "A" into the context. When it exited section "A", it would then NDC.pop() to remove "A" from the context. As you can see, you can continue to push/pop contexts. It is up to the application to make sure that the proper NDC.pop() call is made for each NDC.push().
When a message is logged, the current contents of the NDC are attached to it, and can be displayed in the log messages by using the '%x' option in PatternLayout. In this way, information specific to the context of a particular thread can be displayed in the log.
The beauty of this is that the logger sending the message does not have any clue about the context or contents of the NDC, and it doesn't need to. But appenders and filters can use the NDC information in the log message to affect the routing and display of log messages. Besides the '%x' option in PatternLayout, a new log4j filter for v1.3 (see org.apache.log4j.filters.NDCMatchFilter in the current CVS) will accept or deny a log message based on the contents of the NDC information.
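As a minimal sketch of the push/pop discipline described above (the class and section names are made up for illustration):

import org.apache.log4j.Logger;
import org.apache.log4j.NDC;

public class NdcExample {
    private static final Logger log = Logger.getLogger(NdcExample.class);

    public void sectionA() {
        NDC.push("A");
        try {
            log.info("entered A");  // with %x in the pattern: context "A"
            sectionB();
        } finally {
            NDC.pop();              // balance the push, even on exceptions
        }
    }

    private void sectionB() {
        NDC.push("B");
        try {
            log.info("inside B");   // with %x in the pattern: context "A B"
        } finally {
            NDC.pop();
        }
    }
}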

MDC


The "Mapped Diagnostic Context" implements a "map" into which key/value pair information can be stored. Just like NDC, the context is stored per thread. Values are stored by key name. Each thread could use the same key name but have different stored values. Values are stored/retreived/removed by using the familiar pattern of MDC.put(), MDC.get(), and MDC.remove() methods.
When a message is logged, the current contents of the MDC are attached to it, and can be displayed in the log messages by using the '%X' option in PatternLayout. More than one MDC value can be displayed in a single log message.
Just as with NDC, appenders and filters can use the MDC information attached to a log message for display and routing. Log4j v1.3 will contain a filter based on the contents of the MDC (see org.apache.log4j.filters.MDCMatchFilter  in the current cvs).

Which one to use?


Now that you have some idea of how the NDC and MDC store context information, it should be straightforward to choose which one to use. If nested/stack-like information is important when logging, use NDC. If key/value pair information is more appropriate, use MDC.

Known Gotchas


  • MDC requires JDK 1.2 or later. It is not compatible with JDK 1.1, unlike NDC, which is.
  • NDC use can lead to memory leaks if you do not periodically call the NDC.remove() method. The current NDC implementation maintains a static hard link to the thread for which it is storing context. So, when the thread is released by its creator, the NDC keeps the link and the thread (and its related memory) is not released and garbage collected as one might expect. NDC.remove() fixes this by periodically checking the threads referenced by the NDC and releasing the references of "dead" threads. But you have to write your code to call NDC.remove(), as shown below.
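A minimal sketch of this defensive idiom for a worker thread (the Worker class is a made-up example):

import org.apache.log4j.NDC;

public class Worker implements Runnable {
    public void run() {
        NDC.push("worker-" + Thread.currentThread().getName());
        try {
            // ... do the work, logging as usual ...
        } finally {
            NDC.pop();
            NDC.remove(); // release this thread's NDC storage so it can be GC'd
        }
    }
}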
So, give both NDC and MDC a try. Write some test code to set various values and log messages to see how the output changes. NDC and MDC are powerful tools for logging that no log4j user should be ignorant of.
Which logging framework to use? Log4J or SLF4J or logback?
For all new application development, use logback. logback is a run-time implementation of SLF4J. If you have an existing application with Log4J, it is still worthwhile to switch to logback. A detailed explanation follows below.
Reasons to prefer logback over log4j
Logback brings a very large number of improvements over log4j, big and small. They are too many to enumerate exhaustively. Nevertheless, here is a non-exhaustive list of reasons for switching to logback from log4j. Keep in mind that logback is conceptually very similar to log4j as both projects were founded by the same developer. If you are already familiar with log4j, you will quickly feel at home using logback. If you like log4j, you will probably love logback.

Faster implementation

Based on our previous work on log4j, logback internals have been re-written to perform about ten times faster on certain critical execution paths. Not only are logback components faster, they have a smaller memory footprint as well.

Extensive battery of tests

Logback comes with a very extensive battery of tests developed over the course of several years and untold hours of work. While log4j is also tested, logback takes testing to a completely different level. In our opinion, this is the single most important reason to prefer logback over log4j. You want your logging framework to be rock solid and dependable even under adverse conditions.

logback-classic speaks SLF4J natively

Since the Logger class in logback-classic implements the SLF4J API natively, you incur zero overhead when invoking an SLF4J logger with logback-classic as the underlying implementation. Moreover, since logback-classic strongly encourages the use of SLF4J as its client API, if you need to switch to log4j or to j.u.l., you can do so by replacing one jar file with another. You will not need to touch your code logging via the SLF4J API. This can drastically reduce the work involved in switching logging frameworks.
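For illustration, a minimal sketch of code written purely against the SLF4J API (the OrderService class is a made-up example); which framework actually writes the output is decided only by the jars on the classpath:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class OrderService {
    private static final Logger log = LoggerFactory.getLogger(OrderService.class);

    public void placeOrder(String orderId) {
        // compiles and runs unchanged whether logback, log4j (via an
        // adapter jar) or j.u.l. is the backing implementation
        log.info("Order {} placed", orderId);
    }
}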

Extensive documentation

Logback ships with detailed and constantly updated documentation.

Configuration files in XML or Groovy

The traditional way of configuring logback is via an XML file. Most of the examples in the documentation use this XML syntax. However, as of logback version 0.9.22, configuration files written in Groovy are also supported. Compared to XML, Groovy-style configuration is more intuitive, consistent and has a shorter syntax.

Automatic reloading of configuration files

Logback-classic can automatically reload its configuration file upon modification. The scanning process is fast, contention-free, and dynamically scales to millions of invocations per second spread over hundreds of threads. It also plays well within application servers and more generally within the JEE environment as it does not involve the creation of a separate thread for scanning.
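A minimal configuration sketch enabling this behavior (the attribute values shown are examples):

<configuration scan="true" scanPeriod="60 seconds">
    <!-- appenders and loggers as usual -->
</configuration>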

Graceful recovery from I/O failures

Logback's FileAppender and all its sub-classes, including RollingFileAppender, can gracefully recover from I/O failures. Thus, if a file server fails temporarily, you no longer need to restart your application just to get logging working again. As soon as the file server comes back up, the relevant logback appender will transparently and quickly recover from the previous error condition.

Automatic removal of old log archives

By setting the maxHistory property of TimeBasedRollingPolicy or SizeAndTimeBasedFNATP, you can control the maximum number of archived files. If your rolling policy calls for monthly rollover and you wish to keep one year's worth of logs, simply set the maxHistory property to 12. Archived log files older than 12 months will be automatically removed.

Automatic compression of archived log files

RollingFileAppender can automatically compress archived log files during rollover. Compression always occurs asynchronously so that even for large log files, your application is not blocked for the duration of the compression.
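A configuration sketch combining this with the maxHistory setting from the previous point: the .gz suffix in fileNamePattern enables compression, and twelve monthly archives are kept (the file names are examples):

<appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <!-- monthly rollover; archives are compressed because of .gz -->
        <fileNamePattern>app.%d{yyyy-MM}.log.gz</fileNamePattern>
        <!-- keep one year of archives -->
        <maxHistory>12</maxHistory>
    </rollingPolicy>
    <encoder>
        <pattern>%d %-5level [%thread] %logger{0} - %m%n</pattern>
    </encoder>
</appender>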

Prudent mode

In prudent mode, multiple FileAppender instances running on multiple JVMs can safely write to the same log file. With certain limitations, prudent mode extends to RollingFileAppender.
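A minimal sketch (the file name is an example):

<appender name="SHARED" class="ch.qos.logback.core.FileAppender">
    <file>shared.log</file>
    <!-- several JVMs may append to shared.log safely -->
    <prudent>true</prudent>
    <encoder>
        <pattern>%d %-5level %m%n</pattern>
    </encoder>
</appender>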

Lilith

Lilith is a logging and access event viewer for logback. It is comparable to log4j's Chainsaw, except that Lilith is designed to handle large amounts of logging data without flinching.

Conditional processing of configuration files

Developers often need to juggle between several logback configuration files targeting different environments such as development, testing and production. These configuration files have substantial parts in common, differing only in a few places. To avoid duplication, logback supports conditional processing of configuration files with the help of <if><then> and <else> elements so that a single configuration file can adequately target several environments.
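A sketch of what such a file can look like (the "env" property name is a made-up example; evaluating conditions requires the Janino library on the classpath):

<configuration>
    <if condition='property("env").equals("prod")'>
        <then>
            <root level="WARN"/>
        </then>
        <else>
            <root level="DEBUG"/>
        </else>
    </if>
    <!-- appenders shared by both environments go here -->
</configuration>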

Filters

Logback comes with a wide array of filtering capabilities going much further than what log4j has to offer. For example, let's assume that you have a business-critical application deployed on a production server. Given the large volume of transactions processed, logging level is set to WARN so that only warnings and errors are logged. Now imagine that you are confronted with a bug that can be reproduced on the production system but remains elusive on the test platform due to unspecified differences between those two environments (production/testing).
With log4j, your only choice is to lower the logging level to DEBUG on the production system in an attempt to identify the problem. Unfortunately, this will generate a large volume of logging data, making analysis difficult. More importantly, extensive logging can impact the performance of your application on the production system.
With logback, you have the option of keeping logging at the WARN level for all users except for the one user, say Alice, who is responsible for identifying the problem. When Alice is logged on, she will be logging at level DEBUG while other users continue to log at the WARN level. This feat can be accomplished by adding 4 lines of XML to your configuration file (search for MDCFilter in the relevant section of the manual); a sketch follows.
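Assuming the application puts the user name into the MDC under the key "username" (both the key and the value here are examples), the TurboFilter would look like this:

<turboFilter class="ch.qos.logback.classic.turbo.MDCFilter">
    <MDCKey>username</MDCKey>
    <Value>alice</Value>
    <OnMatch>ACCEPT</OnMatch>
</turboFilter>

With this TurboFilter in place, events originating from Alice's requests are accepted regardless of the WARN level set on the loggers, while everyone else remains filtered as before.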

SiftingAppender

SiftingAppender is an amazingly versatile appender. It can be used to separate (or sift) logging according to any given runtime attribute. For example, SiftingAppender can separate logging events according to user sessions, so that the logs generated by each user go into distinct log files, one log file per user.
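A configuration sketch of the per-user case (it assumes each request thread puts the user into the MDC under the key "userid"; names and patterns are examples):

<appender name="SIFT" class="ch.qos.logback.classic.sift.SiftingAppender">
    <discriminator>
        <key>userid</key>
        <defaultValue>unknown</defaultValue>
    </discriminator>
    <sift>
        <!-- one nested appender instance is created per distinct userid -->
        <appender name="FILE-${userid}" class="ch.qos.logback.core.FileAppender">
            <file>${userid}.log</file>
            <encoder>
                <pattern>%d %-5level %m%n</pattern>
            </encoder>
        </appender>
    </sift>
</appender>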

Stack traces with packaging data

When logback prints an exception, the stack trace will include packaging data. Here is a sample stack trace generated by the logback-demo web-application.
14:28:48.835 [btpool0-7] INFO  c.q.l.demo.prime.PrimeAction - 99 is not a valid value
java.lang.Exception: 99 is invalid
  at ch.qos.logback.demo.prime.PrimeAction.execute(PrimeAction.java:28) [classes/:na]
  at org.apache.struts.action.RequestProcessor.processActionPerform(RequestProcessor.java:431) [struts-1.2.9.jar:1.2.9]
  at org.apache.struts.action.RequestProcessor.process(RequestProcessor.java:236) [struts-1.2.9.jar:1.2.9]
  at org.apache.struts.action.ActionServlet.doPost(ActionServlet.java:432) [struts-1.2.9.jar:1.2.9]
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:820) [servlet-api-2.5-6.1.12.jar:6.1.12]
  at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502) [jetty-6.1.12.jar:6.1.12]
  at ch.qos.logback.demo.UserServletFilter.doFilter(UserServletFilter.java:44) [classes/:na]
  at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1115) [jetty-6.1.12.jar:6.1.12]
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:361) [jetty-6.1.12.jar:6.1.12]
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417) [jetty-6.1.12.jar:6.1.12]
  at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230) [jetty-6.1.12.jar:6.1.12]
From the above, you can tell that the application is using Struts version 1.2.9 and was deployed under Jetty version 6.1.12. Thus, stack traces quickly inform the reader not only about the classes involved in the exception but also about the packages and package versions they belong to. When your customers send you a stack trace, as a developer you will no longer need to ask them for information about the versions of the packages they are using; it will be part of the stack trace. See the "%xThrowable" conversion word for details.
This feature can be quite helpful to the point that some users mistakenly consider it a feature of their IDE.

Logback-access, i.e. HTTP-access logging with brains, is an integral part of logback

Last but not least, the logback-access module, part of the logback distribution, integrates with Servlet containers such as Jetty or Tomcat to provide rich and powerful HTTP-access log functionality. Since logback-access was part of the initial design, all the logback-classic features you love are available in logback-access as well.

In summary

We have listed a number of reasons for preferring logback over log4j. Given that logback builds upon our previous work on log4j, simply put, logback is just a better log4j.
To understand the evolution of logging in Java and JEE world, refer to this article by Micheal Andrews.

Wednesday, November 28, 2012

10 Tips for Proper Application Logging

Source: http://www.javacodegeeks.com/2011/01/10-tips-proper-application-logging.html

Our latest JCP partner, Tomasz Nurkiewicz, has submitted a number of posts describing the basic principles of proper application logging. I found them quite interesting, thus I decided to aggregate them in a more compact format and present them to you. So, here are his suggestions for clean and helpful logs:
(NOTE: The original posts have been slightly edited to improve readability)
1) Use the appropriate tools for the job
Many programmers seem to forget how important logging an application's behavior and its current activity is. When somebody puts:
log.info("Happy and carefree logging");
happily somewhere in the code, he probably doesn’t realize the importance of application logs during maintenance, tuning and failure identification. Underestimating the value of good logs is a terrible mistake.
In my opinion, SLF4J is the best logging API available, mostly because of a great pattern substitution support:
1log.debug("Found {} records matching filter: '{}'", records, filter);
In Log4j you would have to use:
1log.debug("Found " + records + " records matching filter: '" + filter + "'");
This is not only longer and less readable, but also inefficient because of the extensive use of string concatenation. SLF4J adds a nice {} substitution feature. Also, because string concatenation is avoided and toString() is not called if the logging statement is filtered, there is no need for isDebugEnabled() anymore. BTW, have you noticed the single quotes around the filter string parameter?
SLF4J is just a façade. As an implementation I would recommend the Logback framework, already advertised above, instead of the well-established Log4J. It has many interesting features and, unlike Log4J, is actively developed.
The last tool to recommend is Perf4J. To quote their motto:
Perf4J is to System.currentTimeMillis() as log4j is to System.out.println()
I’ve added Perf4J to one existing application under heavy load and seen it in action in a few others. Both administrators and business users were impressed by the nice graphs produced by this simple utility. Also, we were able to discover performance flaws in no time. Perf4J itself deserves its own article, but for now just check their Developer Guide.
Additionally, note that Ceki Gülcü (founder of the Log4J, SLF4J and Logback projects) suggested a simple approach to get rid of commons-logging dependency (see his comment).

2) Don’t forget, logging levels are there for you
Every time you make a logging statement, you think hard about which logging level is appropriate for this type of event, don’t you? Somehow 90% of programmers never pay attention to logging levels, simply logging everything on the same level, typically INFO or DEBUG. Why? Logging frameworks have two major benefits over System.out, i.e. categories and levels. Both allow you to selectively filter logging statements, either permanently or just while diagnosing an issue. If you really can’t see the difference, print this table and look at it every time you start typing “log.” in your IDE:
ERROR – something terribly wrong has happened and must be investigated immediately. No system can tolerate items logged on this level. Example: NPE, database unavailable, mission-critical use case cannot be continued.
WARN – the process might be continued, but take extra caution. Actually I always wanted to have two levels here: one for obvious problems where a work-around exists (for example: “Current data unavailable, using cached values”) and a second (call it ATTENTION) for potential problems and suggestions. Example: “Application running in development mode” or “Administration console is not secured with a password”. The application can tolerate warning messages, but they should always be justified and examined.
INFO – an important business process has finished. In an ideal world, an administrator or advanced user should be able to understand INFO messages and quickly find out what the application is doing. For example, if an application is all about booking airplane tickets, there should be only one INFO statement per ticket, saying “[Who] booked a ticket from [Where] to [Where]”. Another definition of an INFO message: each action that changes the state of the application significantly (database update, external system request).
DEBUG – developers’ stuff. I will discuss later what sort of information deserves to be logged.
TRACE – very detailed information, intended only for development. You might keep trace messages for a short period of time after deployment on the production environment, but treat these log statements as temporary ones that should or might eventually be turned off. The distinction between DEBUG and TRACE is the most difficult, but if you add a logging statement and remove it after the feature has been developed and tested, it should probably be on the TRACE level.
The list above is just a suggestion; you can create your own set of instructions to follow, but it is important to have some. My experience is that everything is usually logged without filtering (at least in application code), but having the ability to quickly filter logs and extract the information with the proper level of detail might be a life-saver.
The last thing worth mentioning is the infamous is*Enabled() condition. Some put it before every logging statement:
if(log.isDebugEnabled())
    log.debug("Place for your commercial");
Personally, I find this idiom to be just clutter that should be avoided. The performance improvement (especially with the SLF4J pattern substitution discussed previously) seems irrelevant and smells like premature optimization. Also, can you spot the duplication? There are very rare cases when having an explicit condition is justified – when we can prove that constructing the logging message is expensive. In other situations, just do your job of logging and let the logging framework do its job (filtering).
3) Do you know what you are logging?
Every time you issue a logging statement, take a moment and have a look at what exactly will land in your log file. Read your logs afterwards and spot malformed sentences. First of all, avoid NPEs like this:
1log.debug("Processing request with id: {}", request.getId());
Are you absolutely sure that request is not null here?
Another pitfall is logging collections. Suppose you fetched a collection of domain objects from the database using Hibernate and carelessly log it like this:
log.debug("Returning users: {}", users);
SLF4J will call toString() only when the statement is actually printed, which is quite nice. But if it does… an out-of-memory error, the N+1 select problem, thread starvation (logging is synchronous!), a lazy initialization exception, or completely filled log storage – any of these might occur.
It is a much better idea to log, for example, only the ids of the domain objects (or even just the size of the collection). But building a collection of ids from a collection of objects that have a getId() method is unbelievably difficult and cumbersome in Java. Groovy has a great spread operator (users*.id); in Java we can emulate it using the Commons BeanUtils library:
1log.debug("Returning user ids: {}", collect(users, "id"));
where the collect() method can be implemented as follows (using Commons Collections and Commons BeanUtils):

import org.apache.commons.beanutils.BeanToPropertyValueTransformer;
import org.apache.commons.collections.CollectionUtils;

public static Collection collect(Collection collection, String propertyName) {
    return CollectionUtils.collect(collection, new BeanToPropertyValueTransformer(propertyName));
}
The last thing to mention is the improper implementation or usage of toString(). First, create a toString() for each class that appears anywhere in logging statements, preferably using ToStringBuilder (but not its reflective counterpart). Second, watch out for arrays and non-typical collections. Arrays and some unusual collections might not have a toString() implementation that calls toString() on each item. Use the Arrays#deepToString JDK utility method. And read your logs often to spot incorrectly formatted messages.
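As a small illustration of the array pitfall (the class name is made up; the printed hashes will vary):

import java.util.Arrays;

public class ArrayLoggingDemo {
    public static void main(String[] args) {
        String[][] data = {{"a", "b"}, {"c", "d"}};
        System.out.println(data);                      // [[Ljava.lang.String;@1b6d3586
        System.out.println(Arrays.toString(data));     // inner arrays still print as hashes
        System.out.println(Arrays.deepToString(data)); // [[a, b], [c, d]]
    }
}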
4) Avoid side effects
Logging statements should have no, or only minimal, impact on the application’s behavior. Recently a friend of mine gave an example of a system that threw Hibernate’s LazyInitializationException only when running in one particular environment. As you’ve probably guessed from the context, a logging statement caused a lazily initialized collection to be loaded while the session was attached. In that environment the logging level was raised, so the collection was no longer initialized. Think how long it would take you to find that bug without knowing this context!
Another side effect is slowing the application down. Quick answer: if you log too much or improperly use toString() and/or string concatenation, logging has a performance side effect. How big? Well, I have seen a server restarting every 15 minutes because of thread starvation caused by excessive logging. Now this is a side effect! From my experience, a few hundred MiB is probably the upper limit of how much you can log to disk per hour.
Of course, if the logging statement itself fails and causes the business process to terminate due to an exception, this is also a huge side effect. I have seen such a construct used to avoid this:
try {
    log.trace("Id=" + request.getUser().getId() + " accesses " + manager.getPage().getUrl().toString());
} catch(NullPointerException e) {}
This is real code, but please make the world a bit better place and don’t ever do it.
5) Be concise and descriptive
Each logging statement should contain both data and description. Consider the following examples:
1log.debug("Message processed");
2log.debug(message.getJMSMessageID());
3
4log.debug("Message with id '{}' processed", message.getJMSMessageID());
Which log would you like to see while diagnosing a failure in an unknown application? Believe me, all the examples above are almost equally common. Another anti-pattern:
if(message instanceof TextMessage)
    //...
else
    log.warn("Unknown message type");
Was it so hard to include the actual message type, message id, etc. in the warning string? I know something went wrong, but what? What was the context?
A third anti-pattern is the “magic log”. Real-life example: most programmers in the team knew that three ampersands followed by an exclamation mark, followed by a hash, followed by a pseudorandom alphanumeric string in the log means “Message with XYZ id received”. Nobody bothered to change the log; somebody simply hit the keyboard and chose a unique “&&&!#” string so that he could easily find it himself.
As a consequence, the whole log file looks like a random sequence of characters. Somebody might even consider the file to be a valid Perl program. Instead, a log file should be readable, clean and descriptive. Don’t use magic numbers; log values, numbers and ids, and include their context. Show the data being processed and show its meaning. Show what the program is actually doing. Good logs can serve as great documentation of the application code itself.
Did I mention not to log passwords and any personal information? Don’t!
6) Tune your pattern
The logging pattern is a wonderful tool that transparently adds a meaningful context to every logging statement you make. But you must consider very carefully which information to include in your pattern. For example, logging the date when your logs roll every hour is pointless, as the date is already included in the log file name. On the contrary, without logging the thread name you would be unable to track any process using logs when two threads work concurrently – the logs will overlap. This might be fine in single-threaded applications, but those are almost dead nowadays.
From my experience, the ideal logging pattern should include (apart from the logged message itself): the current time (without the date, with millisecond precision), the logging level, the name of the thread, the simple logger name (not fully qualified) and the message. In Logback it looks something like this:
1<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
2    <encoder>
3        <pattern>%d{HH:mm:ss.SSS} %-5level [%thread][%logger{0}] %m%n</pattern>
4    </encoder>
5</appender>
You should never include the file name, class name or line number, although it’s very tempting. I have even seen empty log statements issued from the code:
log.info("");
because the programmer assumed that the line number would be part of the logging pattern and he knew that “if an empty logging message appears in the 67th line of the file (in the authenticate() method), it means that the user is authenticated”. Besides, logging the class name, method name and/or line number has a serious performance impact.
A somewhat more advanced feature of logging frameworks is the concept of the Mapped Diagnostic Context. MDC is simply a map managed on a thread-local basis. You can put any key-value pair in this map, and from then on every logging statement issued from this thread is going to have this value attached as part of the pattern.
7) Log method arguments and return values
When you find a bug during development, you typically run a debugger trying to track down the potential cause. Now imagine for a while that you can’t use a debugger. For example, the bug manifested itself in a customer environment a few days ago and everything you have is logs. Would you be able to find anything in them?
If you follow the simple rule of logging each method’s input and output (arguments and return values), you don’t even need a debugger any more. Of course, you must be reasonable, but every method that accesses an external system (including a database), blocks, waits, etc. should be considered. Simply follow this pattern:
public String printDocument(Document doc, Mode mode) {
    log.debug("Entering printDocument(doc={}, mode={})", doc, mode);
    String id = doPrint(doc, mode); // doPrint() stands in for the lengthy printing operation
    log.debug("Leaving printDocument(): {}", id);
    return id;
}
Because you log both the beginning and the end of each method invocation, you can manually discover inefficient code and even detect possible causes of deadlocks and starvation – simply by looking for an “entering” without a corresponding “leaving”. If your methods have meaningful names, reading the logs will be a pleasure. Also, analyzing what went wrong is much simpler, since at each step you know exactly what has been processed. You can even use a simple AOP aspect to log a wide range of methods in your code. This reduces code duplication, but be careful, since it may lead to an enormous amount of huge logs.
You should consider the DEBUG or TRACE levels as best suited for these types of logs. And if you discover that some methods are called too often and logging them might harm performance, simply decrease the logging level for that class or remove the logs completely (maybe leaving just one for the whole method invocation?). But it is always better to have too many logging statements than too few. Treat logging statements with the same respect as unit tests – your code should be covered with logging routines as it is with unit tests. No part of the system should remain completely unlogged. Remember, sometimes observing the logs rolling by is the only way to tell whether your application is working properly or hanging forever.
8) Watch out for external systems
This is a special case of the previous tip: if you communicate with an external system, consider logging every piece of data that goes out from your application and comes in. Period. Integration is a tough job, and diagnosing problems between two applications (think two different vendors, environments, technology stacks and teams) is particularly hard. Recently, for example, we’ve discovered that logging the full message contents, including SOAP and HTTP headers, in Apache CXF web services is extremely useful during integration and system testing.
This is a big overhead, and if performance is an issue, you can always disable logging. But what is the point of having a fast but broken application that no one can fix? Be extra careful when integrating with external systems and be prepared to pay that cost. If you are lucky and all your integration is handled by an ESB, then the bus is probably the best place to log every incoming request and response. See, for example, Mule’s log-component.
Sometimes the amount of information exchanged with external systems makes it unacceptable to log everything. On the other hand, during testing, and for short periods of time on production (for example when something wrong is happening), we would like to have everything saved in the logs and are ready to pay the performance cost. This can be achieved by carefully using logging levels. Just take a look at the following idiom:
1Collection<Integer> requestIds = //...
2if(log.isDebugEnabled())
3    log.debug("Processing ids: {}", requestIds);
4else
5    log.info("Processing ids size: {}", requestIds.size());
If this particular logger is configured to log DEBUG messages, it will print the whole contents of the requestIds collection. But if it is configured to print INFO messages, only the size of the collection will be output. If you are wondering why I skipped the isInfoEnabled() condition, go back to tip #2. One thing worth mentioning is that the requestIds collection must not be null here. It would be logged correctly as null if DEBUG were enabled, but a big fat NullPointerException will be thrown if the logger is configured to INFO. Remember my lesson about side effects in tip #4?
9) Log exceptions properly
First of all, avoid logging exceptions and let your framework or container (whatever it is) do it for you. There is one, ahem, exception to this rule: if you throw exceptions from some remote service (RMI, EJB remote session bean, etc.) that is capable of serializing exceptions, make sure all of them are available to the client (i.e. are part of the API). Otherwise the client will receive a NoClassDefFoundError: SomeFancyException instead of the “true” error.
Logging exceptions is one of the most important roles of logging, but many programmers tend to treat logging as a way to handle the exception. They sometimes return a default value (typically null, 0 or an empty string) and pretend that nothing has happened. Other times they first log the exception, then wrap it and throw it back:
1log.error("IO exception", e);
2throw new MyCustomException(e);
This construct will almost always print the same stack trace twice, because something will eventually catch MyCustomException and log its cause. Log, or wrap and throw back (which is preferable); never do both, otherwise your logs will be confusing.
But what if we really DO want to log the exception? For some reason (because we don’t read APIs and documentation?), about half of the logging statements I see are wrong. Quick quiz: which of the following log statements will log the NPE properly?
try {
    Integer x = null;
    ++x;
} catch (Exception e) {
    log.error(e);        //A
    log.error(e, e);        //B
    log.error("" + e);        //C
    log.error(e.toString());        //D
    log.error(e.getMessage());        //E
    log.error(null, e);        //F
    log.error("", e);        //G
    log.error("{}", e);        //H
    log.error("{}", e.getMessage());        //I
    log.error("Error reading configuration file: " + e);        //J
    log.error("Error reading configuration file: " + e.getMessage());        //K
    log.error("Error reading configuration file", e);        //L
}
Surprisingly, only G and preferably L are correct! A and B don’t even compile in SLF4J, and the others discard stack traces and/or print improper messages. For example, E will not print anything, as an NPE typically doesn’t provide any exception message, and the stack trace won’t be printed either. Remember, the first argument is always the text message: write something about the nature of the problem. Don’t include the exception message, as it will be printed automatically after the log statement, preceding the stack trace. But in order for that to happen, you must pass the exception itself as the second argument.
10) Logs easy to read, easy to parse
There are two groups of receivers particularly interested in your application logs: human beings (you might disagree, but programmers belong to this group as well) and computers (typically shell scripts written by system administrators). Logs should be suitable for both of these groups. If someone looking over your shoulder at your application logs sees an unreadable wall of cryptic text, then you probably have not followed my tips. Logs should be readable and easy to understand, just like the code should be.
On the other hand, if your application produces half a GB of logs each hour, no man and no graphical text editor will ever manage to read them entirely. This is where the old-school grep, sed and awk come in handy. If possible, try to write logging messages in such a way that they can be understood both by humans and by computers, e.g. avoid formatting numbers, use patterns that can easily be recognized by regular expressions, etc. If that is not possible, print the data in two formats:
1log.debug("Request TTL set to: {} ({})", new Date(ttl), ttl);
2// Request TTL set to: Wed Apr 28 20:14:12 CEST 2010 (1272478452437)
3
4final String duration = DurationFormatUtils.formatDurationWords(durationMillis, true, true);
5log.info("Importing took: {}ms ({})", durationMillis, duration);
6//Importing took: 123456789ms (1 day 10 hours 17 minutes 36 seconds)
Computers will appreciate the “ms since the 1970 epoch” time format, while people will be delighted to see the “1 day 10 hours 17 minutes 36 seconds” text. BTW, take a look at DurationFormatUtils, a nice tool.