If you've been profiling C applications, you might know the distinction between time and space profilers. A "time profiler" measures the execution paths of your application on the method level whereas a "space profiler" gives you insight into the development of the heap, such as which methods allocate most memory. Recently, more and more applications are multi-threaded and thread profilers have been developed to analyze thread synchronization issues.
Most of these traditional profilers are "post-mortem" profilers where the profiling wrapper or profiling agent code writes out a data file when the profiled application exits. For an interactive profiler, it makes sense to compare and correlate data from all three domains, so JProfiler combines time, space and thread profilers in a single application.
A profiler must have some means to collect the data it displays. Profiling data can come from an interface in the execution environment or it can be generated by instrumenting the application itself.
One of the most basic and common profilers, the Unix shell command time, acts as a wrapper to the profiled executable and retrieves post-mortem information about the process from the kernel. Profilers for native applications on Microsoft Windows can attach to running applications and receive available debug information to calculate their profiling data. These are examples of interfaces in the execution environment where the binary of your application is not modified by the profiler.
The gprof Unix profiler (part of Unix since 4.2BSD in 1983) can be hooked into the compilation process by specifying an additional argument to the compiler (-pg). In this way, profiling code is added to your application. When the application exits, a data file is written to disk that contains call trees and execution times to be viewed with the gprof application. gprof is an example of a profiler that instruments your application.
JProfiler takes a mixed approach. It uses the profiling interface of the JVM and instruments classes at load time for tasks where the profiling interface of the JVM doesn't provide any data or adequate performance.
The profiling interface of the JVM is intended for profiling agents that are written in C or C++. If you open the include directory in your JDK, you will see a number of files with the extension .h. Those are the header files that tell a C/C++ library about the interface that is offered by the JVM. The basis for all communication between a native library and the JVM is the Java Native Interface (JNI), defined in jni.h.
The JNI allows Java code to call methods in the native library and vice versa. From Java code, you can use the System.load() call to load a native library into the same memory space. When you call a method whose declaration contains the "native" modifier, such as public native String getName();, a function in the list of loaded native libraries is searched for. The required name pattern of the corresponding C function contains the package, the class and the method of the declaration in Java code. JNI also defines how Java data types are represented in a C/C++ library. When the native C function is called, it gets a "JNI environment" interface as an additional parameter. With this environment interface, it can call Java methods, convert between C and Java data types, and perform other JVM-specific operations such as creating Java threads and synchronizing on a Java monitor.
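The name pattern mentioned above can be illustrated with a small Java sketch that derives the C function name for a native method. It follows the escaping rules of the JNI specification for non-overloaded methods only (overloaded methods additionally get an encoded signature appended, which is not handled here). The class name com.example.Person is a hypothetical example, not something taken from JProfiler.

```java
// Sketch: derive the C function name that the JVM looks up for a native method.
// Handles the basic JNI escapes; overloaded methods are not covered.
class JniNames {

    // Escapes one name component according to the JNI naming rules.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');                  // package separator
            } else if (c == '_') {
                sb.append("_1");                 // literal underscore
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')) {
                sb.append(c);                    // plain ASCII letters and digits
            } else {
                // any other character becomes _0xxxx (lower-case hex UTF-16 unit)
                sb.append(String.format("_0%04x", (int) c));
            }
        }
        return sb.toString();
    }

    // className is fully qualified, e.g. "com.example.Person" (hypothetical).
    static String functionName(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        // prints Java_com_example_Person_getName
        System.out.println(functionName("com.example.Person", "getName"));
    }
}
```

A native library that implements getName() for this hypothetical class would therefore have to export a C function with exactly that name.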
Until Java 1.5, the JVM offered an ad-hoc profiling interface for tool vendors, the Java Virtual Machine Profiling Interface (JVMPI). The JVMPI was not standardized and its behavior varied considerably across different JVMs. In addition, the JVMPI was not able to run with modern garbage collectors and had problems when profiling very large heaps. With Java 1.5, the JVM Tool Interface (JVMTI) was added to the Java platform to overcome these problems. JProfiler supports both JVMPI and JVMTI. The interfaces are defined in jvmpi.h and jvmti.h. They utilize the JNI for communication with the JVM, but provide an additional interface to configure profiling options. JVMPI and JVMTI are event-based systems. The profiling agent library can register handler functions for different events. It can then enable or disable selected events.
Disabling events is important for reducing the overhead of the profiler. For example, in JProfiler, object allocation recording is switched off by default. When you switch on allocation recording in the GUI, the profiling agent tells the JVMPI/JVMTI interface that the event for object allocations should be enabled. If a lot of objects are created, this can produce a considerable overhead, both in the JVM itself and in the profiling agent, which has to perform bookkeeping operations for each event. During the startup phase of an application server, a lot of objects are created that you're most likely not interested in. Consequently, it's a good idea to leave object allocation recording switched off during that time. It increases the performance of the profiled application and reduces clutter in the generated data. The same goes for the measurement of method calls, called "CPU profiling" in JProfiler.
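The idea of registering handlers and then enabling or disabling individual events can be modeled in a few lines of Java. This is only a simplified, hypothetical model of an event-based interface: the class EventModel and its event names are invented for illustration and are not the JVMPI or JVMTI API, which is a C interface.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical model of an event-based profiling interface (not the JVMTI API).
class EventModel {
    enum EventType { OBJECT_ALLOC, METHOD_ENTRY, MONITOR_WAIT }

    private final Map<EventType, Consumer<String>> handlers = new EnumMap<>(EventType.class);
    private final Set<EventType> enabled = EnumSet.noneOf(EventType.class);

    // The agent registers one handler per event type.
    void register(EventType type, Consumer<String> handler) {
        handlers.put(type, handler);
    }

    // The agent switches delivery of an event type on or off at runtime.
    void setEnabled(EventType type, boolean on) {
        if (on) {
            enabled.add(type);
        } else {
            enabled.remove(type);
        }
    }

    // Called by the simulated JVM; disabled events never reach the handler,
    // so they cause no bookkeeping work in the agent.
    void fire(EventType type, String detail) {
        if (!enabled.contains(type)) {
            return;
        }
        Consumer<String> handler = handlers.get(type);
        if (handler != null) {
            handler.accept(detail);
        }
    }

    public static void main(String[] args) {
        EventModel jvm = new EventModel();
        int[] allocations = {0};
        jvm.register(EventType.OBJECT_ALLOC, detail -> allocations[0]++);

        jvm.fire(EventType.OBJECT_ALLOC, "java.lang.String");   // dropped: recording is off
        jvm.setEnabled(EventType.OBJECT_ALLOC, true);           // "allocation recording" switched on
        jvm.fire(EventType.OBJECT_ALLOC, "java.lang.String");   // counted
        System.out.println("recorded allocations: " + allocations[0]);
    }
}
```

In this model, events fired while recording is off cost almost nothing, which mirrors why leaving allocation recording disabled during startup reduces overhead.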
The JVMPI/JVMTI interface offers the following types of event:
Whenever you use the synchronized keyword or call Object.wait(), the JVM uses Java monitors. Events that concern these monitors, such as trying to enter a monitor, entering a monitor, exiting a monitor or waiting on a monitor, are reported to the profiling agent. From this data, the deadlock graph and the monitor contention views are generated in JProfiler.
Some information, such as references between objects and the data in objects, is not available from the events that the JVMPI/JVMTI fires. To get exhaustive information on all objects on the heap, the profiling agent can trigger a "heap dump". This command is invoked when you take a snapshot in the heap walker. The heap dump is performed differently for JVMPI and JVMTI. The JVMPI packs all the objects on the heap and the references between them into a single byte array and passes it to the profiling agent. That byte array is then parsed by the profiler and converted to an internal representation. Naturally, the memory requirements of this operation are huge: first, the heap is essentially duplicated in the byte array, then the profiling agent must parse it and translate it to data structures. In order to reduce the peak memory requirement, JProfiler saves the byte array to a temporary file on disk, releases the array and parses the contents of the temporary file. When profiling an application that maxes out the available physical memory, taking a heap dump can crash the JVM, simply because not enough physical memory is available to allocate the huge required regions of memory. With JVMTI (Java 1.5 and later), the situation is much improved: JProfiler can enumerate all existing references in the heap and build up its own data structures.
Unlike a JNI library that you load and invoke from Java code, the profiling agent has to be activated at the very beginning of the JVM startup. This is achieved by adding the special JVM parameter -Xrunjprofiler for Java <= 1.4.2 (JVMPI) or -agentpath:[path to jprofilerti library] for Java >= 1.5.0 (JVMTI) to the java command line. The -Xrun or -agentpath: part tells the JVM that a JVMPI/JVMTI profiling agent should be loaded, and the remaining characters of the parameter constitute the name of the native library. The canonical name of a native library depends on the platform. For a base name of jprofiler, the library name is jprofiler.dll on Microsoft Windows, libjprofiler.so on Linux and most Unix variants, and libjprofiler.dylib on Mac OS X.
Parameters can be passed to the native profiling library by appending a colon for the JVMPI or an equal sign for the JVMTI to the profiling interface VM parameter and placing the parameter string after it. If you pass -Xrunjprofiler:port=10000 or -agentpath:[path to jprofilerti library]=port=10000 on the Java command line, the parameter port=10000 will be passed to the profiling agent.
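The following Java sketch illustrates the two conventions just described: the platform-specific library name and the splitting of an -agentpath argument into library path and parameter string. System.mapLibraryName() is a standard Java call; the class AgentArgs, the example path /opt/jprofiler/libjprofilerti.so and the assumption that several options are separated by commas are hypothetical and only serve the illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of how an -agentpath argument decomposes; names and paths are illustrative.
class AgentArgs {

    // Splits "-agentpath:<library path>=<options>" at the first '=' after the prefix.
    // Returns { libraryPath, options }; options is empty if none were given.
    static String[] splitAgentPath(String jvmArg) {
        String prefix = "-agentpath:";
        if (!jvmArg.startsWith(prefix)) {
            throw new IllegalArgumentException("not an -agentpath argument: " + jvmArg);
        }
        String rest = jvmArg.substring(prefix.length());
        int eq = rest.indexOf('=');
        return eq < 0
                ? new String[] { rest, "" }
                : new String[] { rest.substring(0, eq), rest.substring(eq + 1) };
    }

    // Parses the option string handed to the agent.
    // Assumption: options are comma-separated key=value pairs.
    static Map<String, String> parseOptions(String options) {
        Map<String, String> result = new LinkedHashMap<>();
        if (options == null || options.isEmpty()) {
            return result;
        }
        for (String part : options.split(",")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                result.put(part, "");            // flag without a value
            } else {
                result.put(part.substring(0, eq), part.substring(eq + 1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Platform-dependent file name for the base name "jprofiler"
        System.out.println(System.mapLibraryName("jprofiler"));

        String[] parts = splitAgentPath("-agentpath:/opt/jprofiler/libjprofilerti.so=port=10000");
        System.out.println("library: " + parts[0]);
        System.out.println("port:    " + parseOptions(parts[1]).get("port"));
    }
}
```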
If the JVM cannot load the specified native library, it quits with an error message. If it succeeds in loading the library, it calls a special function in the library to give the profiling agent a chance to initialize itself.
Unlike basic profilers that collect data and write out a data file to disk, advanced profilers can display the profiling data at runtime. Although it would be possible to start the GUI directly from the profiling agent, it would be a bad idea to do so, since the profiled process would be disturbed by the secondary application and remote profiling would not be possible. Because of this, the JProfiler GUI is started separately and runs in a separate JVM. The communication between the profiling agent and the GUI is via a TCP/IP network socket. This is also the case if you start applications in JProfiler that are configured as "local" sessions.
In order to profile successfully, it's important to choose the right profiling parameters, especially the filters that limit the extent of the recorded call tree. Since this information is required at startup, the profiling agent stops the JVM and waits for a connection from the GUI where these parameters are configured. Once the connection has been established, the profiled application is allowed to start up.
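The startup sequence described above, where the agent blocks until the GUI has connected and delivered its settings, can be sketched with plain Java sockets. This is a simplified model under stated assumptions: the real agent is a native library, and the class StartupHandshake, the one-line text protocol and the setting filters=com.example.* are all invented for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical model of the agent/GUI startup handshake over TCP.
class StartupHandshake {

    // Agent side: block until a GUI connects, then read one line of settings.
    static String awaitSettings(ServerSocket server) throws IOException {
        try (Socket gui = server.accept();
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(gui.getInputStream(), StandardCharsets.UTF_8))) {
            return in.readLine();
        }
    }

    // GUI side: connect to the agent and send the profiling settings.
    static void sendSettings(int port, String settings) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port);
             Writer out = new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8)) {
            out.write(settings + "\n");
            out.flush();
        }
    }

    // Runs both sides in one process; returns the settings the agent received.
    static String demo() {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            int port = server.getLocalPort();   // ephemeral port chosen by the OS
            Thread gui = new Thread(() -> {
                try {
                    sendSettings(port, "filters=com.example.*");
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            gui.start();
            String settings = awaitSettings(server);   // the "JVM" is held here
            gui.join();
            return settings;
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String settings = demo();
        // Only now would the profiled application be allowed to start.
        System.out.println("starting application with " + settings);
    }
}
```

Because the agent side does not return from awaitSettings() before the settings have arrived, nothing of the profiled application runs without the configured filters in place.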
The recorded profiling data resides in the internal data structures of the profiling agent. Only a small part of the recorded data is actually transferred to the GUI. For example, if you open the call tree or the back-traces in the hotspots views, only the next few levels are transferred from the agent to the GUI. If the entire call tree were transferred to the GUI, potentially large amounts of data would have to be transmitted through the socket. This would make the profiled process slower, and remote profiling between different computers would not be feasible. In essence, you could say that the profiling agent keeps a database of the recorded profiling data while the GUI is a client that sends user-initiated queries to the database.
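The "database and client" idea can be sketched as a query that copies only a limited number of levels below a call tree node. The class CallTreeQuery, its Node type and the method names in the example tree are hypothetical. JProfiler's internal data structures are not described in this text, so this only shows the principle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model: the agent holds the full call tree and answers
// depth-limited queries from the GUI.
class CallTreeQuery {

    static final class Node {
        final String method;
        final long millis;
        final List<Node> children = new ArrayList<>();

        Node(String method, long millis) {
            this.method = method;
            this.millis = millis;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Agent-side query: copy the subtree below root, at most maxDepth levels deep.
    static Node truncate(Node root, int maxDepth) {
        Node copy = new Node(root.method, root.millis);
        if (maxDepth > 0) {
            for (Node child : root.children) {
                copy.children.add(truncate(child, maxDepth - 1));
            }
        }
        return copy;
    }

    // Number of nodes in a tree, i.e. roughly how much data would be transferred.
    static int size(Node node) {
        int count = 1;
        for (Node child : node.children) {
            count += size(child);
        }
        return count;
    }

    public static void main(String[] args) {
        Node full = new Node("main", 100)
                .add(new Node("service", 80)
                        .add(new Node("query", 60)
                                .add(new Node("parse", 20))));
        Node firstLevels = truncate(full, 1);
        System.out.println("nodes held by agent: " + size(full));          // 4
        System.out.println("nodes sent to GUI:   " + size(firstLevels));   // 2
    }
}
```

When the user expands a node in the GUI, a further query for that node would fetch the next levels, so the amount of data on the socket stays proportional to what is actually displayed.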