Friday, 18 April 2014

How Java logging frameworks determine caller class and method name

This post describes how java logging frameworks (Log4j, for example) know a point (concrete class and method name), where logging statement has been invoked.

I decided to write this post, because quite often I communicate with developers, who do not understant what is the cost of writing caller information to logs, and why such high cost is imposed.

Brief Log4J configuration and usage overview

Logger is identified by its name. Common myth here, is that logger is somehow connected with the class it is declared in. The myth is based on the signature of the factory method:

private static final Logger log = LogManager.getLogger(MyService.class);

Under the hood, however, the method just performs a call to logger repository, passing clazz.getName() (i.e. of type String) as parameter.

Message format in Log4j is controlled by Layout. Usually PatternLayout is used. Important point, is that several placeholders are supported.

Let's take a look at those, that confuse developers the most

  • Category (%c) is the name of the logger
  • Caller class (%C) is the class of the caller

Consider the following example

package com.my.company.name;
public class MyService {

    private static final Logger log1 = LogManager.getLogger(MyService.class);
    private static final Logger log2 = LogManager.getLogger("blah-blah");

    public void f() {
        log1.debug("I'm log1");
        log2.debug("I'm log2");
    }
}

If call to f() is performed and logger is configured with pattern %c - %m%n, log will then contain

com.my.company.name.MyService - I'm log1
blah-blah - I'm log2

Indeed, if pattern is %C - %m%n, then log will contain

com.my.company.name.MyService - I'm log1
com.my.company.name.MyService - I'm log2

Despite of being named blah-blah, log2 somehow determined the caller class name.

How does it work?

JVM keeps track of calls that have been performed in every thread. In brief it is required for JVM to know, where to return after method execution. This information is usually seen if form of stacktrace.

So, the general idea is to get stacktrace and find a StackTraceElement in it, that corresponds to the caller.

Let's take a look at simple example

public class Example {
    public static void main(String[] args) { a(); }
    static void a() { getCallerInfo(); }
    static void getCallerInfo() { b(); }
    static void b() { throw new RuntimeException(); }
}

Method b() throws an exception, it reaches the very top without being caught and its stacktrace gets printed to System.err:

Exception in thread "main" java.lang.RuntimeException
 at Example.b(Example.java:7)
 at Example.getCallerInfo(Example.java:6)
 at Example.a(Example.java:5)
 at Example.main(Example.java:4)

To know a place, from where getCallerInfo() has been called, we have to find an element in stacktrace that is below this method. That's a(). Trivial.

It should be clear now, why having caller in logs is so expensive: each logging statement makes JVM to fill in stacktrace.

Obtaining stacktrace

There are few ways to get stacktrace in Java: Thread.getStackTrace() and Throwable.getStackTrace(). Both methods return an array of StackTraceElement.

Throwable.getStackTrace()

Usage is as simple as

new Throwable().getStackTrace();
Note, that you don't even have to throw a newly created Throwable. Stacktrace gets filled in when you create a Throwable instance. Moreover, you can reuse same Throwable instance calling its fillInStacktrace() to refresh stacktrace when needed (of course, keeping in mind possible race conditions when shared across threads).

Under the hood, native method Throwable.fillInStackTrace(int dummy) gets called eventually.

Thread.getStackTrace()

Usage is

Thread.currentThread().getStackTrace();
Under the hood, though, this code will create new Exception and obtain stacktrace from it.

Flow is a bit different and more complicated, if method is invoked on a Thread object that doesn't represent current thread. Though, this is not a topic of this post.

So, obviously, the first way is preferable just because the latter one doesn't provide any benefits and does exactly the same thing eventually. Also, Thread.getStackTrace() additionally checks, whether the Thread instance represents current thread, and this is not free (I measured Thread.getStackTrace() method to be approximately 10% slower than Throwable.getStackTrace() on stacktraces of <10 elements).

Parsing stacktrace

As I said, it's a trivial task. StackTraceElement has declaringClass and methodName properties. If we have, say, my.company.CallerUtil, that creates new Throwable for us in its method m(), then we should traverse through stacktrace and find the StackTraceElement that is below the one with

"m".equals(methodName) & "my.company.CallerUtil".equals(declaringClass)
If CallerUtil has only that one exposed method and Throwable instance is created right in that method, then "m".equals(methodName) is redundant and we can check only class name for equality.

Here is a simple straightforward implementation of an utility to determine a caller:

public class CallerUtil {

    public final StackTraceElement getCallerOf(Class clazz) {
        StackTraceElement[] stackTrace = new Throwable().getStackTrace();
        // index of StackTraceElement, where clazz appears
        int i = getSelfElementIndex(stackTrace, clazz);
        return stackTrace[i + 1];
    }

    public final StackTraceElement getDirectCaller() {
        StackTraceElement[] stackTrace = new Throwable().getStackTrace();
        // index of StackTraceElement, where our util appears
        int i = getSelfElementIndex(stackTrace, this.getClass());
        return stackTrace[i + 1];
    }

    private int getSelfElementIndex(StackTraceElement[] stackTrace, Class clazz) {
        int i;
        for (i = 0; i < stackTrace.length; i++) {
            StackTraceElement el = stackTrace[i];
            if (el.getClassName().equals(clazz.getName())) {
                break;
            }
        }
        return i;
    }
}
What about logging frameworks?

I believe, that the general idea is very simple and more detailed explanations are not needed. Log4j implementation details are here:

  • org.apache.log4j.spi.LocationInfo
  • org.apache.log4j.helpers.PatternParser.LocationPatternConverter


No comments :

Post a Comment