zoukankan      html  css  js  c++  java
  • DTrace to Troubleshoot Java Native Memory Problems

    How to Use DTrace to Troubleshoot Java Native Memory Problems on Oracle Solaris 11

    Hands-On Labs of the System Admin and Developer Community of OTN

    by Wang Yu, Vincent Liu, and Gary Wang

    This lab will introduce the basic concepts of DTrace and provide exercises for using DTrace to identify common Java native memory problems, such as direct buffer overflow, Inflater/Deflater memory leak, and thread stack overflow.

    Published February 2014


    Want to comment on this article? Post the link on Facebook's OTN Garage page.  Have a similar article to share? Bring it up on Facebook or Twitter and let's discuss.

    Table of Contents
    Introduction
    Intended Audience
    System Requirements
    Prerequisites
    Notes for Users
    Installation and Configuration (15 Minutes)
    Exercise 1: Introduction to DTrace (30 Minutes)
    Exercise 2: Identifying and Solving "Inflater/Deflater Memory Leak" Problems (30 Minutes)
    Exercise 3: Identifying and Solving "Cannot Create Native Threads" Problems (40 Minutes)
    Exercise 4: Identifying and Solving "Direct ByteBuffer Memory Leak" Problems (40 Minutes)
    Summary
    See Also
    About the Authors

    Introduction

    DTrace is a comprehensive dynamic tracing framework for the Oracle Solaris operating system. DTrace provides a powerful infrastructure for troubleshooting kernel and application problems in production environments. It allows administrators and developers to concisely answer arbitrary questions about the behavior of user applications and operating systems.

    Java native memory leaks are one of the hardest technical issues to solve. JProfiler, VisualVM, heap dump, and thread dump tools have limited capabilities for handling this kind of issue. Google Performance Tools are very popular for identifying issues caused by native memory allocation using malloc. These tools provide a high-performance multithreaded malloc() implementation, with checkpoints for collecting needed information.

    The DTrace facility provided by Oracle Solaris 11 is able to detect and troubleshoot various Java native memory problems. Moreover, DTrace is easy to learn and lightweight, and it can be safely used in a production environment.

    This lab will introduce the basic concepts of DTrace and provide exercises for using DTrace to identify common Java native memory problems, such as direct buffer overflow, Inflater/Deflater memory leak, and thread stack overflow. It will take you approximately 155 minutes to finish this lab.

    Intended Audience

    This hands-on lab assumes you have some basic knowledge about the following technologies:

    • Java language programming
    • Oracle Solaris or a similar UNIX or Linux OS

    System Requirements

    Ensure your system meets the following requirements:

    • Operating system: Solaris 11.1
    • Memory: 2 GB
    • Available disk space: 20 GB

    Prerequisites

    Please download the following software.

    Then go here for instructions on how to create the directory structure, .d files, .java files, and build.xml files used in the lab exercises.

    Notes for Users

    Be aware of the following:

    • For this lab, the GNOME desktop environment is preferred over Oracle Solaris 11 (with desktop packages installed).
    • In order to open a terminal window in GNOME, right-click any point on the background of the desktop, and selectOpen Terminal in the pop-up menu.
    • The following source code editors are provided by the Oracle Solaris 11 desktop environment:

      • vi (type vi in a terminal window)
      • gedit (type gedit in a terminal window)

    Installation and Configuration (15 Minutes)

    In this section, we are going to prepare our hands-on lab environment. Before you start, make sure you have downloaded the required software, as described in the Prerequisites section.

    Install JDK 7

    Use the following procedure to install JDK 7 on Oracle Solaris.

    1. Open a command-line console, and move the jdk-7u45-solaris-i586.gz (or later) file you downloaded to the directory where you want to install the JDK.
    2. Uncompress the file, for example using a command similar to the following:

      $ tar -xvfz jdk-7u45-solaris-i586.gz
      

      This command creates a directory named jdk1.7.0_45 in which the JDK is installed.

    3. Add <JDK_7_Installation>/bin to your PATH variable:

      For example, for an Oracle Solaris system, type the following (if you are using the bash shell), but replace<JDK_7_Installation> with the directory that corresponds to your environment:

      $ export PATH=<JDK_7_Installation>/bin:$PATH
      
    4. Add <JDK_7_Installation> to your JAVA_HOME variable:

      For example, for an Oracle Solaris system, type the following (if you are using the bash shell), but replace<JDK_7_Installation> with the directory that corresponds to your environment:

      $ export JAVA_HOME=<JDK_7_Installation>
      

      Note: The variable setting above affects only the current console; putting it in your profile file ($HOME/.profile for the bash shell) will make this variable available for all the consoles.

    Install Ant

    Use the following procedure to install Ant on Oracle Solaris.

    1. Open a command-line console and move the apache-ant-1.9.2-bin.zip (or later) file you downloaded to the directory where you want to install Ant.
    2. Unzip the file, for example, by typing a command similar to the following:

      $ unzip apache-ant-1.9.2-bin.zip
      

      This unzip command creates a directory called apache-ant-1.9.2 into which Ant is installed.

    3. Add <ant_Installation>/bin to your PATH variable, and add <ant_Installation> to your ANT_HOME variable:

      For example, for an Oracle Solaris system, type the following (if you are using the bash shell), but replace<ant_Installation> with the directory that corresponds to your environment:

      $ export PATH=<ant_Installation>/bin:$PATH
      $ export ANT_HOME=<ant_Installation>
      

      Note: The variable settings above affect only the current console; putting them in your profile file ($HOME/.profile for the bash shell) will make these variables available for all the consoles.

    Verify the Installations

    1. Open a command-line console, and type the jps command. If you see a Java process list similar to the following, the JDK is installed successfully.

      $ jps
      1935 Jps
      1859 App
      
    2. Open a command-line console, and type the ant command. If you see output similar to the following, Ant is installed successfully.

      $ ant
      Buildfile: build.xml does not exist!
      Build failed
      

    Exercise 1: Introduction to DTrace (30 Minutes)

    In this exercise, we are going to learn basic concepts about D language. The D script files used in this exercise are located at <lab_root>/dtrace_javanative/DScripts.

    Background Information: What Is DTrace?

    DTrace is a comprehensive dynamic tracing facility that is built into Oracle Solaris and can be used by administrators and developers on live production systems to examine the behavior of both user programs and the operating system itself. DTrace enables you to explore your system to understand how it works, track down performance problems across many layers of software, or locate the cause of aberrant behavior. DTrace lets you create your own custom programs to dynamically instrument the system and provide immediate, concise answers to arbitrary questions you can formulate using the DTrace D programming language.

    A D program source file consists of one or more probe clauses that describe the instrumentation to be enabled by DTrace. Each probe clause has the following general form:

    probe descriptions
    / predicates /
    {
         action statements;
    }
    

    Exercise 1, Step 1: Write a 'Hello, world!' Program

    1. After logging in, click the right mouse button on the desktop and choose Open Terminal to bring up a terminal window.
    2. DTrace execution requires root user privilege, so switch to root before trying DTrace commands:

      ouser@solaris11:~$ su
      Password: 
      root@solaris11:~#
      
    3. Type dtrace to print the different options for the DTrace command:

      root@solaris11:~# dtrace
      Usage: dtrace [-32|-64] [-aACeFGhHlqSvVwZ] [-b bufsz] [-c cmd] [-D name[=def]]
          [-I path] [-L path] [-o output] [-p pid] [-s script] [-U name]
          [-x opt[=val]] [-X a|c|s|t]
      
          [-P provider [[ predicate ] action ]]
          [-m [ provider: ] module [[ predicate ] action ]]
          [-f [[ provider: ] module: ] func [[ predicate ] action ]]
          [-n [[[ provider: ] module: ] func: ] name [[ predicate ] action ]]
          [-i probe-id [[ predicate ] action ]] [ args ... ]
      
          predicate -> '/' D-expression '/'
             action -> '{' D-statements '}'
      
          -32 generate 32-bit D programs and ELF files
          -64 generate 64-bit D programs and ELF files
      
          -a  claim anonymous tracing state
          -A  generate driver.conf(4) directives for anonymous tracing
          -b  set trace buffer size
          -c  run specified command and exit upon its completion
          -C  run cpp(1) preprocessor on script files
          -D  define symbol when invoking preprocessor
          -e  exit after compiling request but prior to enabling probes
          -f  enable or list probes matching the specified function name
          -F  coalesce trace output by function
          -G  generate an ELF file containing embedded dtrace program
          -h  generate a header file with definitions for static probes
          -H  print included files when invoking preprocessor
          -i  enable or list probes matching the specified probe id
          -I  add include directory to preprocessor search path
          -l  list probes matching specified criteria
          -L  add library directory to library search path
          -m  enable or list probes matching the specified module name
          -n  enable or list probes matching the specified probe name
          -o  set output file
          -p  grab specified process-ID and cache its symbol tables
          -P  enable or list probes matching the specified provider name
          -q  set quiet mode (only output explicitly traced data)
          -s  enable or list probes according to the specified D script
          -S  print D compiler intermediate code
          -U  undefine symbol when invoking preprocessor
          -v  set verbose mode (report stability attributes, arguments)
          -V  report DTrace API version
          -w  permit destructive actions
          -x  enable or modify compiler and tracing options
          -X  specify ISO C conformance settings for preprocessor
          -Z  permit probe descriptions that match zero probes
      
    4. Now let's try to write a "Hello, world!" program using D language:

      ouser@solaris11:~# dtrace -n 'BEGIN {trace("hello,world");exit(0);}'
      dtrace: description 'BEGIN ' matched 1 probe
       CPU     ID                    FUNCTION:NAME
         0      1                           :BEGIN   hello,world
      

      Alternatively, you can compose a script like the <lab_root>/dtrace_javanative/DScripts/ex1-hello.d file:

      BEGIN 
      {
          trace("hello, world");
          exit(0);
      }
      

      Then run the script file:

      root@solaris11:~# dtrace -s ex1-hello.d
      dtrace: script 'ex1-hello.d' matched 1 probe
       CPU     ID                    FUNCTION:NAME
         0      1                           :BEGIN   hello,world
      

    Exercise 1, Step 2: Explore Probes

    A probe is a location or activity to which DTrace can bind a request to perform a set of actions, such as recording a stack trace, a time stamp, or the argument to a function. Probes are like programmable sensors scattered all over your Oracle Solaris system in interesting places. If you want to figure out what's going on, you use DTrace to program the appropriate sensors to record the information that is of interest to you.

    1. Let's see what kinds of probes there are in Oracle Solaris 11.1 by running the following command: 

      root@solaris11:~# dtrace -l | more
         ID   PROVIDER            MODULE                          FUNCTION NAME
          1     dtrace                                                     BEGIN
          2     dtrace                                                     END
          3     dtrace                                                     ERROR
          4 javascript2010         libxul.so __1cCjsHExecute6FpnJJSContext_pnIJSScript_
      rnIJSObject_pnCJSFValue__b_ execute-done
          5 javascript2010         libxul.so __1cCjsNExecuteKernel6FpnJJSContext_pnIJSS
      cript_rnIJSObject_rknCJSFValue_n0ALExecuteType_pn0AKStackFrame_p8_b_ execute-done
          6 javascript2010         libxul.so __1cCjsHExecute6FpnJJSContext_pnIJSScript_
      rnIJSObject_pnCJSFValue__b_ execute-start
          7 javascript2010         libxul.so __1cCjsNExecuteKernel6FpnJJSContext_pnIJSS
      cript_rnIJSObject_rknCJSFValue_n0ALExecuteType_pn0AKStackFrame_p8_b_ execute-start
          8 nfsmapid1010          nfsmapid                  cb_update_domain daemon-domain
          9 kerberos995    mech_krb5.so.1                         k5_mk_rep krb_ap_rep-make
         10 kerberos995    mech_krb5.so.1                       krb5_rd_rep krb_ap_rep-read
         11 kerberos995    mech_krb5.so.1              krb5_mk_req_extended krb_ap_req-make
         12 kerberos995    mech_krb5.so.1                rd_req_decoded_opt krb_ap_req-read
         13 kerberos995    mech_krb5.so.1                     krb5_mk_ncred krb_cred-make
         14 kerberos995    mech_krb5.so.1                krb5_rd_cred_basic krb_cred-read
         15 kerberos995    mech_krb5.so.1                     krb5_mk_error krb_error-make
         16 kerberos995    mech_krb5.so.1                     krb5_rd_error krb_error-read
         17 kerberos995    mech_krb5.so.1               krb5_encode_kdc_rep krb_kdc_rep-make
      --More--
      

      Every probe in DTrace has two names: a unique integer ID and a human-readable string name that is composed of four parts, shown as separate columns in the dtrace output. When writing out the full human-readable name of a probe, write all four parts of the name separated by colons like this:

      provider:module:function:name
      
    2. Type the following command to determine how many probes are available:

      root@solaris11:~# dtrace -l | wc -l
         81183
      

      Please note that the number of probes varies depending on your OS and the software you have installed.

    3. In order to locate a DTrace probe, we can also use the following options to filter the list: 

      • -P for provider
      • -m for module
      • -f for function
      • -n for name

      For instance, if you want to list all the probes from the syscall provider, run the following command:

      root@solaris11:~# dtrace -l -P syscall | more
         ID   PROVIDER            MODULE                          FUNCTION NAME
       7067    syscall                                               nosys entry
       7068    syscall                                               nosys return
       7069    syscall                                               rexit entry
       7070    syscall                                               rexit return
       7071    syscall                                                read entry
       7072    syscall                                                read return
       7073    syscall                                               write entry
       7074    syscall                                               write return
       7075    syscall                                               close entry
       7076    syscall                                               close return
       7077    syscall                                              linkat entry
       7078    syscall                                              linkat return
       7079    syscall                                           symlinkat entry
       7080    syscall                                           symlinkat return
       7081    syscall                                               chdir entry
       7082    syscall                                               chdir return
       7083    syscall                                               gtime entry
       7084    syscall                                               gtime return
       7085    syscall                                                 brk entry
       7086    syscall                                                 brk return
       7087    syscall                                               lseek entry
      --More--
      

    Exercise 1, Step 3: Explore Predicates and Actions

    Predicates are expressions enclosed between slashes (/ and /) that are evaluated at probe-firing time to determine whether the associated actions should be executed. Predicates are the primary conditional construct used for building more complex control flow in a D program. You can omit the predicate section of the probe clause entirely for any probe, in which case the actions are always executed when the probe fires.

    Probe actions are described by a list of statements separated by semicolons (;) and enclosed between braces ({ and }). If you want to note only that a particular probe fired on a particular CPU without tracing any data or performing any additional actions, you can specify an empty set of braces with no statements inside.

    1. Now you will use a script to list all system calls, excluding write, for all the processes running on your current system. See the following code in <lab_root>/dtrace_javanative/DScripts/ex1-predicate-1.d:
      syscall:::
      /probefunc!="write"/
      {
          printf("probefunc:%s, pid:%d, execname:%s
      ",probefunc, pid, execname);
      }
      
    2. Run ex1-predicate-1.d. You might get a result similar to the following. Press Ctrl-c to stop this program.

      root@solaris11:~# dtrace -qs ex1-predicate-1.d
      ^C
      probefunc:ioctl, pid:2084, execname:dtrace
      probefunc:ioctl, pid:2084, execname:dtrace
      probefunc:ioctl, pid:2084, execname:dtrace
      probefunc:ioctl, pid:2084, execname:dtrace
      probefunc:portfs, pid:1829, execname:nautilus
      probefunc:portfs, pid:1829, execname:nautilus
      probefunc:portfs, pid:1829, execname:nautilus
      probefunc:portfs, pid:1829, execname:nautilus
      probefunc:fstatat64, pid:1829, execname:nautilus
      probefunc:fstatat64, pid:1829, execname:nautilus
      probefunc:clock_gettime, pid:1829, execname:nautilus
      probefunc:clock_gettime, pid:1829, execname:nautilus
      probefunc:recv, pid:1829, execname:nautilus
      probefunc:recv, pid:1829, execname:nautilus
      probefunc:clock_gettime, pid:1829, execname:nautilus
      probefunc:clock_gettime, pid:1829, execname:nautilus
      probefunc:pollsys, pid:1829, execname:nautilus
      probefunc:mmap, pid:2084, execname:dtrace
      probefunc:mmap, pid:2084, execname:dtrace
      probefunc:lwp_park, pid:2084, execname:dtrace
      probefunc:recv, pid:1871, execname:VBoxClient
      probefunc:recv, pid:1871, execname:VBoxClient
      probefunc:nanosleep, pid:1871, execname:VBoxClient
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:nanosleep, pid:1871, execname:VBoxClient
      probefunc:recv, pid:1871, execname:VBoxClient
      probefunc:recv, pid:1871, execname:VBoxClient
      probefunc:nanosleep, pid:1871, execname:VBoxClient
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      
    3. In order to discover all of the no-write system calls caused by Java, we need to update the original file to be like ex1-predicate-2.d, as shown below:

      syscall:::
      /probefunc!="write" && execname=="java"/
      {
          printf("probefunc:%s, pid:%d, execname:%s
      ",probefunc, pid, execname);
      }
      
    4. Run ex1-predicate-2.d. You will get results similar to this:

      root@solaris11:~# dtrace -qs ex1-predicate-2.d
      ^C
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_wait, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      probefunc:lwp_cond_signal, pid:1844, execname:java
      

    Exercise 1, Step 4: Explore Aggregations

    When instrumenting the system to answer performance-related questions, it is useful to consider how data can be aggregated to answer a specific question rather than thinking in terms of data gathered by individual probes. For example, if you wanted to know the number of system calls by user ID, you would not necessarily care about the data collected at each system call. You simply want to see a table of user IDs and system calls. In DTrace, you will find it is very easy to achieve this by using aggregations.

    In D language, the syntax for an aggregation is the following:

    @name[ keys ] = aggfunc ( args );
    

    where:

    • name is the name of the aggregation.
    • keys is a comma-separated list of D expressions.
    • aggfunc is a DTrace aggregating function.
    • args is a comma-separated list of arguments appropriate for the aggregating function.
    1. In this example, we'd like to know how many system calls are invoked within five seconds. Here's the code in the<lab_root>/dtrace_javanative/DScripts/ex1-aggregation-1.d file.

      syscall:::entry
      {
         @counts["syscall numbers"]=count();
      }
      tick-5sec
      {
          exit(0);
      }
      
    2. Run ex1-aggregation-1.d, which will exit in five seconds; you will see results similar to this:

      root@solaris11:~# dtrace -s ex1-aggregation-1.d
      dtrace: script 'ex1-aggregation-1.d' matched 215 probes
       CPU     ID                    FUNCTION:NAME
         0  81183                       :tick-5sec 
      
        syscall numbers                                                3420
      
    3. To take a further step to check which application is making the most system calls, we need to update the original file to be like ex1-aggregation-2.d, as shown below:

      syscall:::entry
      {
          @counts[execname]=count();
      }
      tick-5sec
      {
          exit(0);
      }
      
    4. Run the ex1-aggregation-2.d file, which will give you results like this:

      root@solaris11:~# dtrace -s ex1-aggregation-2.d
      dtrace: script 'ex1-aggregation-2.d' matched 215 probes
       CPU     ID                    FUNCTION:NAME
         0  81183                       :tick-5sec 
      
        automountd                                                        1
        dhcpagent                                                         3
        nwam-manager                                                      4
        devfsadm                                                          6
        gnome-settings-d                                                  7
        isapython2.6                                                      8
        sendmail                                                         10
        metacity                                                         11
        xscreensaver                                                     12
        gnome-power-mana                                                 15
        updatemanagernot                                                 16
        VBoxService                                                      37
        VBoxClient                                                      354
        gnome-terminal                                                  361
        Xorg                                                            393
        java                                                            596
        dtrace                                                         1573
      
    5. The following example, ex1-aggregation-3.d, displays the average time spent in the write system call, organized by process name. This example uses the avg aggregating function, specifying time stamp - self->ts as the argument. The example averages the wall clock time spent in the system call:

      syscall:::entry
      {
          self->ts=timestamp;
      }
      syscall:::return
      /self->ts/
      {
          @time[execname]=avg(timestamp-self->ts);
          self->ts=0;
      }
      

      About self->: Thread-Local Variables
      DTrace provides the ability to declare variable storage that is local to each operating system thread, as well as the global variables mentioned previously. Thread-local variables are useful when we want to enable a probe and mark every thread that fires the probe with some tags or data. In a D program, we can use thread-local variables to share a common name but refer to different data storage associated with different threads. Thread-local variables are referenced by applying the -> operator to the special identifier self.
    6. Run the ex1-aggregation-3.d file. You need to wait for several seconds, and then press Ctrl-c. A result similar to Listing 1 will be shown:

      root@solaris11:~# dtrace -s ex1-aggregation-3.d
      dtrace: script 'ex1-aggregation-3.d' matched 428 probes
      ^C
      
        fmd                                                           18894
        nscd                                                          25965
        utmpd                                                         28892
        ssh-agent                                                     37801
        gconfd-2                                                      79348
        nwam-manager                                                  79479
        hald                                                          88817
        hald-addon-acpi                                               95590
        dtrace                                                      3307590
        VBoxClient                                                 13981578
        Xorg                                                       15435316
        gnome-terminal                                             20763581
        java                                                       40776641
        gnome-power-mana                                           58499707
        xscreensaver                                              184487447
        updatemanagernot                                          241129100
        sendmail                                                  280901476
        VBoxService                                               358442957
        gnome-settings-d                                          392415577
        metacity                                                  409615593
        isapython2.6                                              502919529
        dhcpagent                                                 800092245
        devfsadm                                                  914232179
      

      Listing 1

      The average time records shown in Listing 1 are displayed in nanoseconds.

    7. If you want to know the distribution of system call duration from different applications, you need to use the quantizeaggregation function in ex1-aggregation-4.d:

      syscall:::entry
      {
          self->ts=timestamp;
      }
      syscall:::return
      /self->ts/
      {
          @time[execname]=quantize(timestamp-self->ts);
          self->ts=0;
      }
      
    8. Run ex1-aggregation-4.d and wait for several seconds. You should get a result like Listing 2: 

      root@solaris11:~# dtrace -s ex1-aggregation-4.d
      dtrace: script 'ex1-aggregation-4.d' matched 428 probes
      ^C
      
        nwam-manager                                      
                 value  ------------- Distribution ------------- count    
                 16384 |                                         0        
                 32768 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 3        
                 65536 |                                         0        
      
        sendmail                                          
                 value  ------------- Distribution ------------- count    
                   512 |                                         0        
                  1024 |@@@@                                     1        
                  2048 |@@@@@@@@@@@@@                            3        
                  4096 |                                         0        
                  8192 |@@@@                                     1        
                 16384 |@@@@                                     1        
                 32768 |@@@@@@@@@@@@@                            3        
                 65536 |                                         0        
      
        gnome-power-mana                                  
                 value  ------------- Distribution ------------- count    
                  1024 |                                         0        
                  2048 |@@@@@@@@@@@                              4        
                  4096 |@@@@@@@@@                                3        
                  8192 |@@@@@@@@@                                3        
                 16384 |                                         0        
                 32768 |                                         0        
                 65536 |                                         0        
                131072 |@@@                                      1        
                262144 |@@@                                      1        
                524288 |                                         0        
               1048576 |@@@                                      1        
               2097152 |                                         0        
               4194304 |                                         0        
               8388608 |                                         0        
              16777216 |                                         0        
              33554432 |                                         0        
              67108864 |                                         0        
             134217728 |                                         0        
             268435456 |@@@                                      1        
             536870912 |                                         0  
          ...
      

      Listing 2

      Note that the rows for the frequency distribution are always power-of-two values. Each row indicates the count of the number of elements greater than or equal to the corresponding value, but less than the next larger row value. For example, the output in Listing 2 shows that sendmail had three system calls taking between 2,048 nanoseconds and 4,096 nanoseconds, inclusive.

    Exercise 2: Identifying and Solving "Inflater/Deflater Memory Leak" Problems (30 Minutes)

    Java Native Interface (JNI) is used to call native code from Java. This native code can allocate native OS resources and also create Java objects. The memory allocated by the JNI code is not managed by the GC (Garbage Collection) threads, so it should be freed by the native code itself. Therefore, memory leaks caused by JNI are particularly nasty and hard to find out.

    In JDK, the java.util.zip.Inflater(Deflater) class provides support for general-purpose decompression using the popular zlib compression library. The Inflater(Deflater) class passes its input and output buffers directly tozlib after acquiring them via the GetPrimitiveArrayCritical JNI call. If these classes are not used properly, there will be native memory leaks that cannot be found by common profiling tools.

    In this exercise, we will learn how to use a simple DTrace script to trace the malloc native syscalls at run time, in order to identify the root cause.

    The D-script files used in this exercise are located in <lab_root>/dtrace_javanative/DScripts. The Java source code files used in this exercise are located in <lab_root>/dtrace_javanative/exercises/ex2.

    Background Information

    Check the following Java classes, which compress and uncompress data by using java.util.zip.Deflater andjava.util.zip.Inflater. The source files (Compresser.java and Decompresser.java) are located in<lab_root>/dtrace_javanative/ exercises/ex2/src/nativememory/compresser.

    package nativememory.compresser;
    
    import java.util.zip.Deflater;
    
    public class Compresser {
        private Deflater deflater;
        
        public Compresser(){
            deflater = new Deflater();
        }
        
        public byte[] compress(byte[] input) {
            byte[] output = new byte[8000];
            deflater.setInput(input);
            deflater.finish();
            int compressedDataLength = deflater.deflate(output);
            //deflater.end();
            byte[] result = new byte[compressedDataLength];
            System.arraycopy(output, 0, result, 0, compressedDataLength);
            return result;
        }
    }
    
    package nativememory.compresser;
    
    import java.util.zip.DataFormatException;
    import java.util.zip.Inflater;
    
    public class Decompresser {
        private Inflater inflater;
        
        public Decompresser(){
            inflater = new Inflater();
        }
        
        public byte[] decompress(byte[] input) throws DataFormatException {
            byte[] output = new byte[8000];
            inflater.setInput(input);
            int uncompressedDataLength = inflater.inflate(output);
            //inflater.end();
            byte[] result = new byte[uncompressedDataLength];
            System.arraycopy(output, 0, result, 0, uncompressedDataLength);
            return result;
        }
    }
    

    In this exercise, we will have a running thread which does a loop of work compressing and decompressing data in the background. The InflatorWorkingObject.java file is located in <lab_root>/dtrace_javanative/ exercises/ex2/src/nativememory.

    public class InflatorWorkingObject extends DefaultWorkingObject{
    
        @Override
        public void loopWork() {
            try {
                // Encode a long String into bytes
                Random generator = new Random();
                int messageID = generator.nextInt(10000);
                String inputString = "This is the data with mmessageID: "+messageID;
                byte[] input = inputString.getBytes("UTF-8");
                byte[] result_compress;
                byte[] result_uncompress;
                
                //compress
                Compresser compresser = new Compresser();
                result_compress = compresser.compress(input);
                System.out.println("Data compressed!");
                //decompress
                Decompresser decompresser = new Decompresser();
                result_uncompress = decompresser.decompress(result_compress); 
                
                // Decode the bytes into a String
                String outputString = new String(result_uncompress,"UTF-8");
                System.out.println("Data Decompressed: "+outputString);
                Thread.sleep(200);
                
            } catch (InterruptedException ex) {
            } catch (java.io.UnsupportedEncodingException ex) {
            } catch (java.util.zip.DataFormatException ex) {
            }
        }
    }
    

    Exercise 2, Step 1: Run a Data Compressing/Decompressing Program that Has Native Memory Leaks

    1. Go to the directory <lab_root>/dtrace_javanative/ exercises/ex2, and execute the ant command as follows:

      ouser@solaris11:~$ cd dtrace_javanative/exercises/ex2
      ouser@solaris11:~/dtrace_javanative/exercises/ex2$ ant
      
    2. The ant command runs a multithread program that simulates multiple users compressing and uncompressing data. After running for a while, the program produces output similar to the following. You can stop this program by pressing the Enter key at any time, and you can rerun it by entering the ant command. 

      Note: In order to finish the following steps, you must keep this program running.

      [java] Data Decompressed: This is the data with mmessageID: 8036
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 6029
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 4046
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 7587
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 6213
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 205
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 5249
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 2687
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 4588
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 1971
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 3686
      [java] Data compressed!
      [java] Data Decompressed: This is the data with mmessageID: 3259
      [java] Data compressed!
      

    Exercise 2, Step 2: Monitor the Program with JDK Tools

    1. Use the jps tool to find the process ID (pid) for this Java program; you will get output like Listing 3:

      ouser@solaris11:~$ jps -v
      1839 App -Djava.security.policy=/usr/share/vpanels/java.policy
      2010 OOW -DDescription=OOWDtraceNativeLab_ex2 -Xms256m -Xmx256m
      2011 Jps -Dapplication.home=/home/gary/sf_OOW/jdk1.7.0_25 -Xms8m
      2008 Launcher -Dant.home=/home/gary/sf_OOW/apache-ant-1.9.2 -Dant.library.dir=/home/gary/sf_OOW/apache-ant-1.9.2/lib
      

      Listing 3

      By using jps -v, we can find each Java processes with its process ID and startup arguments. From the output in Listing 3, we can tell that the process ID of the above Java program on my machine is 2010 (it might different on your machine), because it has a name of OOW with other arguments. Then we can pick it out from other Java processes.

    2. Use the prstat -p <pid> system command to monitor the memory usage of this process:

      PID  USERNAME  SIZE   RSS STATE   PRI NICE      TIME  CPU PROCESS/NLWP       
      2010 gary      723M  209M sleep    59    0   0:00:05 4.8% java/11
      

      Listing 4

      From the output of prstat in Listing 4, we can see that the process with PID 2010 is consuming more than 700 MB of virtual memory, which is still increasing.

    3. Use the jmap -heap <pid> command to monitor the Java heap usage of this program. (You need to use your process ID to replace <pid>):

      Heap Usage:
      PS Young Generation
      Eden Space:
         capacity = 71630848 (68.3125MB)
         used     = 45988288 (43.85784912109375MB)
         free     = 25642560 (24.45465087890625MB)
         64.20179194304666% used
      From Space:
         capacity = 8912896 (8.5MB)
         used     = 0 (0.0MB)
         free     = 8912896 (8.5MB)
         0.0% used
      To Space:
         capacity = 8912896 (8.5MB)
         used     = 0 (0.0MB)
         free     = 8912896 (8.5MB)
         0.0% used
      tenured generation:
         capacity = 178978816 (170.6875MB)
         used     = 0 (0.0MB)
         free     = 178978816 (170.6875MB)
         0.0% used
      Perm Generation:
         capacity = 12582912 (12.0MB)
         used     = 1838216 (1.7530593872070312MB)
         free     = 10744696 (10.246940612792969MB)
         14.60882822672526% used
      
      690 interned Strings occupying 41984 bytes.
      

      Listing 5

      From the output of jmap in Listing 5, we can see that process with PID 2010 has consumed 68.3 MB (Eden) + 8.5 MB (From) + 8.5 MB (To) + 170.69 MB (tenured) + 12 MB (Perm) of memory, which is less than the memory size (723 MB) detected by prstat. So there must be some native memory leaks in this Java process.

    Exercise 2, Step 3: Write a DTrace Program to Identify the Problem

    1. We know that malloc is the most common way to allocate memory from the system. For this reason, we can first use DTrace to monitor malloc usage to find a clue. In Oracle Solaris 11, malloc is integrated in standard C library functions. So we consider using the pid provider to analyze the behavior of the process. 

      About the pid Provider
      The DTrace pid provider allows you to trace the internal execution of processes such as a Java application or a database. It can trace the entry and return of any function in a user process as well as any instruction specified by an absolute address or function offset. The pid provider is so powerful that you're very likely to use it in your daily work.

      The pid provider actually defines a class of providers. Each process can potentially have its own associated pidprovider. A process with an ID of 123, for example, would be traced by using the pid123 provider. Usually we use a variable such as $target or $1 as a parameter to define the provider dynamically.

      First we will determine how frequently malloc is being called by this Java process. As shown in Listing 6, the script<lab_root>/dtrace_javanative/DScripts/ex2.d calculates the number of malloc and free calls by the process:

      #!/usr/sbin/dtrace -qs
      
      pid$1::malloc:entry
      {
         printf("process(%d) is calling malloc %d times
      ",$1,self->malloc++);
      }
      
      pid$1::free:entry
      {
         printf("process(%d) is calling %d times
      ",$1,self->free++);
      }
      

      Listing 6

      Note$1 is the first argument for running the script, $2 is the second argument, and so on.

    2. Switch to the root user before trying DTrace commands:

      ouser@solaris11:~$ su
      Password: 
      root@solaris11:~#
      
    3. Execute ex2.d with the Java PID. You will get a result like the result shown in Listing 7. Type Ctrl-c to stop the script.

      root@solaris11: ./ex2.d 2010
      process(2010) is calling malloc 0 times
      process(2010) is calling malloc 1 times
      process(2010) is calling malloc 2 times
      process(2010) is calling malloc 3 times
      process(2010) is calling malloc 4 times
      process(2010) is calling malloc 5 times
      process(2010) is calling malloc 6 times
      process(2010) is calling malloc 7 times
      process(2010) is calling malloc 8 times
      process(2010) is calling malloc 9 times
      process(2010) is calling malloc 10 times
      process(2010) is calling malloc 11 times
      process(2010) is calling malloc 12 times
      process(2010) is calling malloc 13 times
      

      Listing 7

      We can see in Listing 7 that most of the output is about malloc calls. Given more time, you might see only a few freecalls.

    4. In order to see more clearly, we can run ex2.1.d, using an aggregate function and the profile provider to observe the result every two seconds. Listing 8 is the source code in which the profile provider is used.

      #!/usr/sbin/dtrace -qs
      pid$1:libc:malloc:entry
      {
          @malloc_count["number of malloc"] = count() ;
      }
      
      pid$1:libc:free:entry
      {
          @free_count["number of free"] = count() ;
      }
      
      profile:::tick-2sec
      {
          printa(@malloc_count) ;
          printa(@free_count) ;
          trunc(@malloc_count) ;
          trunc(@free_count) ;
      }
      

      Listing 8

      About profile-n Probes
      profile-n probe fires every fixed interval on every CPU at high interrupt level. The probe's firing interval is denoted by the value of n: the interrupt source will fire n times per second. n may also have an optional suffix, in which case nis interpreted to be in the units denoted by the suffix. Valid suffixes and the units they denote are as follows:

      Suffix Time Units
      nsec or ns nanoseconds
      usec or us microseconds
      msec or ms milliseconds
      sec or s seconds
      min or m minutes
      hour or h hours
      day or d days
      hz hertz (frequency per second)

      In ex2.1.dprobe profile:::tick-2sec fires every two seconds, prints the number of malloc and free calls recorded during the past two seconds, and clears the buffer using the trunc function.

    5. Run ex2.1.d with the Java PID. You will get a result similar to the output shown in Listing 9:

      root@solaris11:~/DScripts# ./ex2.1.d 2010
      
        number of malloc                                                312
      
        number of malloc                                                304
      
        number of malloc                                                304
      
        number of malloc                                                304
      
        number of malloc                                                304
      
          ^C
      

      Listing 9

      The output in Listing 9 shows that the process with PID 2010 is issuing about 300 malloc calls every two seconds, and there is no free call during this period. (Note that if @free_count is 0dtrace will not allocate buffer for it, so it is not displayed.)

      Now we can say that this should account for the increasing usage of virtual memory for this Java process. Then how we can find the cause and fix the problem?

    6. The ustack action in DTrace can be used to trace the process's stack. So we can use ustack to discover the code path that executes the malloc call. Running ex2.2.d will print out the process's thread stack when malloc is called.
      #!/usr/sbin/dtrace -qs
      
      pid$1:libc:malloc:entry
      {
          @stack_count[ustack(4)] = count() ;
      }
      

      Please note that the input argument of ustack denotes stack frame depth. Here we use 4, which is enough to locate the statements that make malloc calls. You can try different numbers in order to get further information.

    7. Run ex2.2d with the Java PID for a few seconds, and then type Ctrl-c to stop the script. You will see a result similar to Listing 10:

      root@solaris11:~/DScripts# ./ex2.2.d 2010
      ^C
      
                    libc.so.1`malloc
                    libzip.so`zcalloc+0x26
                    libzip.so`deflateInit2_+0x19b
                    libzip.so`Java_java_util_zip_Deflater_init+0x5d
                    145
      
                    libc.so.1`malloc
                    libzip.so`zcalloc+0x26
                    libzip.so`deflateInit2_+0x1b5
                    libzip.so`Java_java_util_zip_Deflater_init+0x5d
                    145
      
                    libc.so.1`malloc
                    libzip.so`zcalloc+0x26
                    libzip.so`deflateInit2_+0x1cf
                    libzip.so`Java_java_util_zip_Deflater_init+0x5d
                    145
      
                    libc.so.1`malloc
                    libzip.so`zcalloc+0x26
                    libzip.so`inflateInit2_+0x88
                    libzip.so`Java_java_util_zip_Inflater_init+0x4f
                    145
      

      Listing 10

      In Listing 10, we can tell that the Inflater and Deflater classes in the java.util.zip package are makingmalloc calls. This is a well-known problem with the Inflater and Deflater classes. Now we can locate the root cause in the Java source code.

      About Inflater and Deflater Memory Usage
      The Inflater and Deflater classes pass their input and output buffers directly to the zlib compression library after acquiring them via the GetPrimitiveArrayCritical JNI call. Both of these classes have finalize()methods that release c-heap buffers allocated by native code. 

      However, the finalizable objects can be discovered only by GC, which occurs when there is pressure on the Java heap. The c-heap memory allocated by the Inflater and Deflater classes will not trigger GC, so the c-heap may be filled up before GC detects and enqueues unreachable finalizable objects.

      The same buffers can be freed before Inflater and Deflater objects are finalized by calling end(). So the preferred way to use Deflater and Inflater is to call the end() method after use in order to avoid native memory leaks.

    Exercise 2, Step 4: Modify the Java Program and Run It Again

    1. Stop the Java program by pressing the Enter key in the console that was started in Exercise 2, Step 1.
    2. Use an editor to open the files named Compresser.java and Decompresser.java, which are located in<lab_root>/dtrace_javanative/ exercises/ex2/src/nativememory/compresser.
    3. Add in the Java statements highlighted in bold in Listing 11:

      package nativememory.compresser;
      
      import java.util.zip.Deflater;
      
      public class Compresser {
          private Deflater deflater;
          
          public Compresser(){
              deflater = new Deflater();
          }
          
          public byte[] compress(byte[] input) {
              byte[] output = new byte[8000];
              deflater.setInput(input);
              deflater.finish();
              int compressedDataLength = deflater.deflate(output);
              deflater.end();
              byte[] result = new byte[compressedDataLength];
              System.arraycopy(output, 0, result, 0, compressedDataLength);
              return result;
          }
      }
      
      
      package nativememory.compresser;
      
      import java.util.zip.DataFormatException;
      import java.util.zip.Inflater;
      
      public class Decompresser {
          private Inflater inflater;
          
          public Decompresser(){
              inflater = new Inflater();
          }
          
          public byte[] decompress(byte[] input) throws DataFormatException {
              byte[] output = new byte[8000];
              inflater.setInput(input);
              int uncompressedDataLength = inflater.inflate(output);
              inflater.end();
              byte[] result = new byte[uncompressedDataLength];
              System.arraycopy(output, 0, result, 0, uncompressedDataLength);
              return result;
          }
      }
      

      Listing 11

    4. Repeat Exercise 2, Step 1 through Step 3 to see the difference.

    Exercise 3: Identifying and Solving "Cannot Create Native Threads" Problems (40 Minutes)

    One of the common native memory problems of Java EE production systems is OutOfMemoryError: unable to create new native thread, thrown by Java HotSpot VM when it is unable to create new Java threads.

    Java HotSpot VM memory is split into three memory spaces:

    • The Java heap
    • The PermGen (permanent generation) space
    • The native memory

    These three parts of the memory space compete with each other for the limited amount of memory. If the Java heap and PermGen occupy too much space, there will be little left for the native memory. Please note that the maximum amount of virtual memory space for a 32-bit Java Virtual Machine (JVM) process is 4 GB (perhaps much lower on Windows or Linux).

    Java threads require native memory from the OS. If this space is not big enough, the OS will refuse any further native thread and memory allocation and "unable to create new native thread" errors will be thrown. In this exercise, we will use a DTrace script to detect the native memory leaks caused by Java thread creation.

    The D-script files used in this exercise are located in <lab_root>/dtrace_javanative/DScripts. The Java source code files used in this exercise are located in <lab_root>/dtrace_javanative/exercises/ex3.

    Background Information

    The ThreadMemoryWorkingObject Java class shown in Listing 12 does a loop of work accepting a business request and processing it in a running thread.

    Refer to<lab_root>/dtrace_javanative/exercises/ex3/src/nativememory/ThreadMemoryWorkingObject.java.

    public class ThreadMemoryWorkingObject extends DefaultWorkingObject{
    
        @Override
        public void loopWork() {
            try {
                BusinessRequest request = RequestAccept.accept();
                RequestProcess.process(request);
                Thread.sleep(200);
            } catch (InterruptedException ex) {
              ex.printStackTrace();
            }
        }
    }
    

    Listing 12

    The RequestProcess class shown in Listing 13 is a business request processor that processes each request by starting a dedicated thread.

    Refer to <lab_root>/exercises/ex2/src/nativememory/threadcreation/RequestProcess.java.

    public class RequestProcess {
        static public void process(final BusinessRequest request) {
            new Thread(new Runnable() {
                public void run() {
                    System.out.println(request.getRequestString());
                    try {
                        Thread.sleep(300000);
                    } catch (Exception e) {
                        System.err.println(e);
                    }
                }
            }).start();
        }
    }
    

    Listing 13

    Exercise 3, Step 1: Run a Request Processing Program that Causes Native Memory Leaks

    1. Go to the directory of <lab_root>/dtrace_javanative/exercises/ex3, and execute the ant command, as follows:

      ouser@solaris11:~$ cd dtrace_javanative/exercises/ex3
      ouser@solaris11:~/dtrace_javanative/exercises/ex3$ ant
      
    2. The ant command runs a program consisting of the Java classes mentioned in Listing 12 and Listing 13. This program accepts and processes incoming requests. The processing logic is to create a dedicated Java thread for each request, which continuously generates new Java threads and releases them later. 

      After running for a while (about one minute, depending on your environment), the program might generate the output shown in Listing 14 or something similar.

      You can stop this program by pressing the Enter key at any time, and you can restart it by running ant again.

      [java] requestID is: 6768
      [java] requestID is: 7425
      [java] requestID is: 2676
      [java] requestID is: 1701
      [java] requestID is: 1624
      [java] requestID is: 7856
      [java] requestID is: 7230
      [java] requestID is: 2313
      [java] requestID is: 3956
      [java] requestID is: 9852
      [java] Exception in thread "pool-1-thread-2" java.lang.OutOfMemoryError: unable to create new native thread
      [java]     at java.lang.Thread.start0(Native Method)
      [java]     at java.lang.Thread.start(Thread.java:693)
      [java]     at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:949)
      [java]     at java.util.concurrent.ThreadPoolExecutor.processWorkerExit(ThreadPoolExecutor.java:1017)
      [java]     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1163)
      [java]     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      [java]     at java.lang.Thread.run(Thread.java:724)
      

      Listing 14

      From the output in Listing 14, you can see the OutOfMemoryError exception thrown by the JVM. Now, let's try to find out what caused the error.

    Exercise 3, Step 2: Monitor the Program with JDK Tools

    1. After getting the exception, press the Enter key to stop the program, and then restart it by running ant again. Keep the program running, and use the jps tool to find the process ID for this Java program. You will get the output like Listing 15:

      ouser@solaris11:~$ jps -v
      1839 App -Djava.security.policy=/usr/share/vpanels/java.policy
      3641 Jps -Dapplication.home=/home/gary/sf_OOW/jdk1.7.0_25 -Xms8m
      3640 OOW -DDescription=OOWDtraceNativeLab_ex3 -Xms128m -Xmx128m -XX:MaxPermSize=128m -XX:PermSize=128m -Xss2m
      3638 Launcher -Dant.home=/home/gary/sf_OOW/apache-ant-1.9.2 -Dant.library.dir=/home/gary/sf_OOW/apache-ant-1.9.2/lib
      

      Listing 15

      By using jps -v, we can find each Java processes with its process ID and startup arguments. From the output in Listing 15, we can see that the process ID of the Java program on my machine is 3640 (it might be different on your machine), because it has a name of OOW with other arguments that can help us to differentiate it from other Java processes.

    2. Use the prstat -p <pid> system command to monitor the memory usage of this process. You will get output similar to Listing 16:

      PID  USERNAME  SIZE   RSS STATE   PRI NICE   TIME    CPU  PROCESS/NLWP      
      3640 ouser    276M   36M sleep    59    0   0:00:39 4.6% java/845
      

      Listing 16

      From the output in Listing 16, we see that the process with PID 3640 is consuming 276 MB virtual memory and this number is stable. Since such small memory usage should not cause OutOfMemoryError, we need to do further tracing.

    3. Use the jmap -heap <pid> command to monitor the Java heap usage of this program. (Use your process ID to replace <pid>).

      Heap Usage:
      PS Young Generation
      Eden Space:
         capacity = 34603008 (33.0MB)
         used     = 4815664 (4.5925750732421875MB)
         free     = 29787344 (28.407424926757812MB)
         13.916894161339963% used
      From Space:
         capacity = 5767168 (5.5MB)
         used     = 303200 (0.289154052734375MB)
         free     = 5463968 (5.210845947265625MB)
         5.2573464133522725% used
      To Space:
         capacity = 5767168 (5.5MB)
         used     = 0 (0.0MB)
         free     = 5767168 (5.5MB)
         0.0% used
      PS Old Generation
         capacity = 90177536 (86.0MB)
         used     = 0 (0.0MB)
         free     = 90177536 (86.0MB)
         0.0% used
      PS Perm Generation
         capacity = 134217728 (128.0MB)
         used     = 1881216 (1.7940673828125MB)
         free     = 132336512 (126.2059326171875MB)
         1.4016151428222656% used
      

      Listing 17

      From the output in Listing 17, we see that process with this PID has consumed about 33 MB (Eden) + 5.5 MB (From) +5.5 MB (To) + 86 MB (PS Old Generation) Java heap memory and 128 MB PermGen space, which is very small. There is no reason the program should be hitting the memory limit and causing OutOfMemoryError. Then, what on earth is causing the memory leaks in this Java process?

    Exercise 3, Step 3: Write a DTrace Program to Identify the Problem

    After getting OutOfMemoryError, from the output of prstat shown in Listing 16, we can see that the values in columnsSIZE and RSS are small, and there are about 845 threads in the process. This cannot explain why we cannot create new native threads.

    The memory usage of a process can be categorized into two categories: virtual memory usage and physical memory usage (RSS). The virtual memory size is the amount of virtual address space allocated to the process. The physical memory is the amount of real memory pages allocated to a process.

    1. Run the ps command to display the process's virtual and physical memory usage (use your own PID number to replace 3640):

      ouser@solaris11:~$ ps -eo pid,vsz,rss | grep 3640
       3640 282624 68608
      

      Listing 18

      The output shown in Listing 18 is the same as with prstat: virtual memory is about 276 MB and RSS is about 67 MB.

    2. We use the following ex3.d script to monitor the behavior of malloc and free for the process at a fixed time interval (refer to <lab_root>/dtrace_javanative/DScripts/ex3.d):
      pid$1::malloc:entry
      {
              @size["malloc size"] = quantize(arg0);
              @heap["total_size"]  = sum(arg0) ;
              count_time++ ;
      }
      
      pid$1::free:entry
      {
              count_time--;
      }
      
      profile:::tick-2sec
      {
              printf("(malloc-free) times is %d
      ", count_time);
              count_time =0 ;
              printa(@size) ;
              printa(@heap) ;
              trunc(@size) ;
              trunc(@heap) ;
      }
      
    3. Switch to the root user before trying DTrace commands:

      ouser@solaris11:~$ su
      Password: 
      root@solaris11:~#
      
    4. Run ex3.d with the PID. You will get output similar to Listing 19:

      root@solaris11:~/dtrace_javanative/DScripts# ./ex3.d 3640
      Tracing... Hit Ctrl-C to end.
      (malloc-free) times is 162
      
        malloc size                                       
                 value  ------------- Distribution ------------- count    
                     8 |                                         0        
                    16 |@@@@@@@@@@@                              50       
                    32 |@@@@                                     20       
                    64 |@@                                       10       
                   128 |@@@@@@@                                  30       
                   256 |@@@@@@@@@                                40       
                   512 |@@@@                                     20       
                  1024 |@@                                       10       
                  2048 |                                         0        
      
      
        total_size                                                    60400
      

      Listing 19

      The output in Listing 19 shows that the amount of memory used by malloc every two seconds (total_size) is usually about 60 K. Therefore, the program will use about 60 K x 300 = 18 MB of memory for malloc every 10 minutes. So we can say that OutOfMemoryError is not caused by malloc.

    5. Another way to check memory allocation is to use mmap (void *mmap(void *addr, size_t len, int prot, int flags, int fildes, off_t off)). The mmap() function establishes a mapping between a process's address space and a file or a shared memory object. The munmap() function does the reverse; it deletes the mapping relationship and releases the memory resources.

      So we next want to check the use of mmap and munmap for the process. The following script (ex3.1.d) can do this for you:

      syscall::mmap:entry
      /pid == $1 /
      {
         printf("calling mmap, with size %d
      ", arg1) ;
      }
      
      syscall::munmap:entry
      /pid == $1 /
      {
         printf("calling munmap(%d)
      ",arg1);
      }
      

      arg1 is the second argument of the mmap system call, which is the memory size.

    6. Run ex3.1.d with <pid>:

      root@solaris11:~/DScripts# ./ex3.1.d 3640
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      calling mmap, with size 2097152
      

      Listing 20

      After monitoring this process for a while, we notice in Listing 20 that there are no munmap calls, but there are continuous mmap calls with a size of 2 MB. This might be the cause of OutOfMemoryError.

    7. To verify this, we use the following ex3.2.d script to check the frequency of mmap calls and the mmap call stack every four seconds.

      syscall::mmap:entry
      /pid == $1/
      {
              @mmap_num["number of mmaps"] = count();
              @mmap_size["size of mmaps"]  = quantize(arg1) ;
              @mmap_path[ustack()] = count() ;
      }
      
      profile:::tick-4sec
      {
              printa(@mmap_num);
              printa(@mmap_size);
              printa(@mmap_path);
      
              clear(@mmap_num);
              clear(@mmap_size);
              clear(@mmap_path);
      }
      

      The quantize aggregate function is used to show the frequency distribution of the memory size.

    8. Run ex3.2.d with <pid>. The output will look like Listing 21:

      root@solaris11:~/DScripts# ./ex3.2.d 3640
      
        number of mmaps                                                  28
      
        size of mmaps                                     
                 value  ------------- Distribution ------------- count    
               1048576 |                                         0        
               2097152 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 28       
               4194304 |                                         0        
      
      
      
                    libc.so.1`mmap+0x15
                    libc.so.1`_thrp_create+0xd2
                    libc.so.1`thr_create+0x32
                    libjvm.so`__1cCosNcreate_thread6FpnGThread_n0AKThreadType_I_b_+0x29d
                    libjvm.so`__1cKJavaThread2t5B6MpFp0pnGThread__vI_v_+0x4c1
                    libjvm.so`JVM_StartThread+0x133
                    0xfb4af9d0
                    0xfb4afcb3
                    0xfb403ab9
                    0xfb40353b
                    0xfb403ab9
                    0xfb4003d1
      

      Listing 21

      There are 28 mmap calls every four seconds. The size of each call is about 2 MB. According to the user stack, the mmapcalls originated from thr_create, which creates a new OS thread. Thus, when the program creates a new thread, at least 2 MB memory will be mapped to the process's address space. Since this is a 32-bit JVM, the maximum address space is 4 GB, including kernel space and other address spaces for code segments, shared libraries, and so on. It seems that this is the cause for OutOfMemoryError.

    9. Next, we will trace the thr_create system call to check how each system call consumes 2 MB of address space.thr_create is defined as follows:

      int thr_create(void *stack_base, 
                          size_t stack_size, 
                          void *(*start_func) (void*), 
                          void *arg, 
                          long flags, 
                          thread_t *new_thread_ID);
      

      The following script (ex3.3.d) shows the behavior of the thr_create call for a process by checking the distribution ofstack_size (arg1) of the newly created threads.

      #!/usr/sbin/dtrace -qs 
      
      pid$1::thr_create:entry
      {
         @thread_num["number of threads"] = count();
         @thread["thread_stack"] = quantize(arg1); 
      }
      profile:::tick-4sec
      {
              printa(@thread_num);
              printa(@thread);
              clear(@thread_num);
              clear(@thread);
      }
      
    10. Run ex3.3.d with <pid>:

      root@solaris11:~/DScripts# ./ex3.3.d 3640
      
        number of threads                                                 184
      
        thread_stack                                      
                 value  ------------- Distribution ------------- count    
               1048576 |                                         0        
               2097152 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 184      
               4194304 |                                         0        
      

      Listing 22

      From the output in Listing 22, we can see that 2 MB is the stack size for every newly created thread. So the program will throw OutOfMemoryError after creating about 900 threads, because the total virtual address space of this Java process will reach the limitation of 4 GB of memory.

    11. So, why didn't the RSS and SIZE fields in the output of prstat (Listing 16) reflect the memory usage? Let's turn to mmapfor an explanation by using the following script (ex3.4.d) to check the argument of mmap:

      syscall::mmap:entry
      /pid == $1 /
      {
         printf("calling mmap, with flag set to %d
      ", arg3) ; 
         /*arg3 is the 4th argument, which is flag*/
          }
      

      Run ex3.4.d with <pid>. The output will be like this:

      root@solaris11:~/DScripts/ex3# ./ex3.4.d 3640
      calling mmap, with flag set to 322
      calling mmap, with flag set to 322
      calling mmap, with flag set to 322
      calling mmap, with flag set to 322
      calling mmap, with flag set to 322
      calling mmap, with flag set to 322
      ^C
      

      By consulting the man page of mmap, we know that the flag value is the bitwise inclusive OR of the options that are defined in </usr/include/sys/mman.h>. That is, the flag value (322) equals MAP_PRIVATE | MAP_ANON | MAP_NORESERVE.

      Since each mmap with MAP_NORESERVE will not allocate physical memory, the memory usage cannot be seen from the output of prstat. However, mmap system calls consume the virtual memory space, which is limited (about 4 GB) in a 32-bit process. And this leads to native memory leaks.

      About Java Thread Stack Size
      In a Java application, each thread has its own stack. The stack is used to hold return addresses, function/method call arguments, and so on. As a result, if a thread tends to process larger structures via recursive algorithms, it will need a larger stack. The -Xss JVM option is used to define the limit of the Java thread stack size, which also indirectly limits how deep a stack can get. 

      When a Java thread is created, the defined size of virtual memory is reserved; that is, the address space is reserved. The actual memory is allocated in pages and they are allocated only when used. 

      If the -Xss option is not defined, the default size will be used. This default size is platform-specific:

      • Oracle Solaris SPARC 32-bit JVM: 512 KB
      • Oracle Solaris SPARC 64-bit JVM: 1024 KB
      • Oracle Solaris x86 32-bit JVM: 320 KB
      • Linux AMD64 64-bit JVM: 1024 KB

      In this exercise, we defined the stack size to be 2 MB. Please refer to the build file<lab_root>/dtrace_javanative/exercises/ ex3/build.xml, where you will find that this runtime parameter is defined as follows:

      <jvmarg value="-Xss2m"/>

      Please feel free to modify it to see the different effects.

    Exercise 3, Step 4: Modify the Program and Run It Again

    1. Stop the program by pressing Enter in the console that was started in Exercise 3, Step 1.
    2. Use an editor to open the file<lab_root>/dtrace_javanative/exercises/ex3/src/nativememory/threadcreation/RequestProcess.java.
    3. Modify the code line highlighted in bold in Listing 23. This code will change the sleep time from 30 seconds to 30 ms, which will release the thread resource more frequently.

      public class RequestProcess {
          static public void process(final BusinessRequest request) {
              new Thread(new Runnable() {
                  public void run() {
                      System.out.println(request.getRequestString());
                      try {
                          Thread.sleep(30);
                      } catch (Exception e) {
                          System.err.println(e);
                      }
                  }
              }).start();
      
          }
      }
      

      Listing 23

    4. Repeat Exercise 3, Step 1 through Exercise 3, Step 3 to see the difference.

    Exercise 4: Identifying and Solving "Direct ByteBuffer Memory Leak" Problems (40 Minutes)

    In Java, almost all the objects exist in the heap. Usually, the Java heap is managed by the garbage collector. Automatic garbage collector makes Java one of the most productive languages. However, automatic garbage collection also causes unpredictable halt time for Java processes; even the latest garbage collection algorithm will pause the JVM for a relatively long time (varying from several seconds to tens of seconds) if more than a 10 GB heap size is used. Therefore, even though many servers are equipped with dozens of gigabytes (even terabytes) of RAM, we still prefer to use only a few gigabytes of them because of the possible garbage collection performance hits.

    With the introduction of the direct ByteBuffer in Java's new I/O (NIO) package, memory can be allocated without being tracked by the garbage collector, because it can be available to native code for tasks such as avoiding the copying of data to and from the kernel for I/O. Java HotSpot VM has a "back door" that provides a number of low-level operations for manipulating memory directly. This back door—sun.misc.Unsafe—is widely used by the JDK itself in packages such asjava.nio and java.util.concurrentUnsafe allows us to allocate and deallocate memory explicitly via theallocateMemory and freeMemory methods. Allocated memory is not under management by GC and is not limited to the maximum JVM heap size.

    Functionalities based on NIO's off-heap buffers have been widely used among popular products such as Oracle Coherence, Terracotta BigMemory, and Apache DirectMemory. Obviously, the drawback of this approach is that you have to take responsibility for memory management yourself, which might lead to native memory leaks.

    In this exercise, we will use a DTrace script to detect the native memory leaks caused by "NIO DirectBuffer" management.

    The D-script files used in this exercise are located in <lab_root>/dtrace_javanative/DScripts. The Java source code files used in this exercise are located in <lab_root>/dtrace_javanative/exercises/ex4.

    Background Information

    The following Java class (QueueWriterWorkingObject) runs a thread that does a loop of work for creating a data structure and putting it into a queue. The queue data is stored in DirectBuffer by writing the data structure into the location pointed to by the memory address. (Refer to<lab_root>/dtrace_javanative/exercises/ex4/src/nativememory/QueueWriterWorkingObject.java.)

    public class QueueWriterWorkingObject extends DefaultWorkingObject{
    
        public static long idIncreatment = 0;
        public static QueueDataAccess qda = new QueueDataAccess();
    
        @Override
        public void loopWork() {
            try {
                qda.setObjectOffset(NativeMemoryManager.createOneMessageSpace());
                qda.setMessageId(idIncreatment);
                qda.setMessageType('B');
                qda.setReceiverId(idIncreatment+100);
                qda.setSenderId(idIncreatment+300);
                qda.setMessage("hello,OOW");
                idIncreatment++;
                Thread.sleep(20);
            } catch (InterruptedException ex) {
              ex.printStackTrace();
            }
        }
    
    }
    

    The following Java class (QueueReaderWorkingObject) runs a thread that does a loop of work for consuming messages from a queue. The queue message is read from DirectBuffer by locating the memory address of the data structure. (Refer to<lab_root>/dtrace_javanative/exercises/ex4/src/nativememory/QueueReaderWorkingObject.java.)

    public class QueueReaderWorkingObject extends DefaultWorkingObject{
     
        public static long idIncreatment = 0;
        public static QueueDataAccess qda = new QueueDataAccess();
    
        @Override
        public void loopWork() {
            try {
                Thread.sleep(20);
                long offset = NativeMemoryManager.getMessageAddress(idIncreatment);
                if (offset >= 0) {
                    qda.setObjectOffset(NativeMemoryManager.getMessageAddress(idIncreatment));
                    long messageid = qda.getMessageId();
                    char messageType = qda.getMessageType();
                    long receiverid = qda.getReceiverId();
                    long senderid = qda.getSenderId();
                    byte[] message = qda.getMessage();
                    idIncreatment++;
                    System.out.println("we got messageid: " + messageid + " with type " + messageType 
                      + " from user "+ senderid + " to user " + receiverid + " with message: " 
                      + new String(message));
                    NativeMemoryManager.destroyMessage(idIncreatment);
                } else {
                    Thread.sleep(10);
                }
                
                
            } catch (InterruptedException ex) {
              ex.printStackTrace();
            }
        }
    
    }
    

    Exercise 4, Step 1: Run a Program that Produces and Consumes Queue Messages and Causes Native Memory Leaks

    1. Go to the directory <lab_root>/dtrace_javanative/exercises/ex4, and execute the ant command, as follows:

      ouser@solaris11:~$ cd dtrace_javanative/exercises/ex4
      ouser@solaris11:~/dtrace_javanative/exercises/ex4$ ant
      
    2. The ant command runs a program that prints the queue message reader's log information. After running for a while, the program might produce output similar to the following. You can stop this program by pressing the Enter key at any time, and you can restart it by running ant again.

      [java] we got messageid: 0 with type B from user 300 to user 100 with message: hello,OOW
      [java] we got messageid: 1 with type B from user 301 to user 101 with message: hello,OOW
      [java] we got messageid: 2 with type B from user 302 to user 102 with message: hello,OOW
      [java] we got messageid: 3 with type B from user 303 to user 103 with message: hello,OOW
      [java] we got messageid: 4 with type B from user 304 to user 104 with message: hello,OOW
      [java] we got messageid: 5 with type B from user 305 to user 105 with message: hello,OOW
      [java] we got messageid: 6 with type B from user 306 to user 106 with message: hello,OOW
      [java] we got messageid: 7 with type B from user 307 to user 107 with message: hello,OOW
      [java] we got messageid: 8 with type B from user 308 to user 108 with message: hello,OOW
      [java] we got messageid: 9 with type B from user 309 to user 109 with message: hello,OOW
      [java] we got messageid: 10 with type B from user 310 to user 110 with message: hello,OOW
      [java] we got messageid: 11 with type B from user 311 to user 111 with message: hello,OOW
      [java] we got messageid: 12 with type B from user 312 to user 112 with message: hello,OOW
      [java] we got messageid: 13 with type B from user 313 to user 113 with message: hello,OOW
      

    Exercise 4, Step 2: Monitor the Program with JDK Tools

    1. While running the program started in Exercise 4, Step 1, use the jps tool to find the PID of the program. You should get output similar to Listing 24:

      ouser@solaris11:~$ jps -v
      1839 App -Djava.security.policy=/usr/share/vpanels/java.policy
      3952 Launcher -Dant.home=/home/gary/sf_OOW/apache-ant-1.9.2 -Dant.library.dir=/home/gary/sf_OOW/apache-ant-1.9.2/lib
      3955 Jps -Dapplication.home=/home/gary/sf_OOW/jdk1.7.0_25 -Xms8m
      3954 OOW -DDescription=OOWDtraceNativeLab_ex4 -Xms128m -Xmx128m -XX:+ExtendedDTraceProbes
      

      Listing 24

      By using jps -v, we can find each Java process with its PID and startup arguments. From the output in Listing 24, we can tell that the PID of the Java program on my machine is 3954 (it might be different on your machine), because it has a name of OOW with other arguments.

    2. Use the prstat -p <pid> system command to monitor the memory usage of this process, as shown in Listing 25.

       PID USERNAME  SIZE    RSS STATE   PRI NICE      TIME  CPU PROCESS/NLWP      
      3954 ouser      201M   29M sleep    59    0   0:00:03 2.5% java/12
      

      Listing 25

      From the output shown in Listing 25, we can see that the process with PID 3954 is consuming about 201 MB of virtual memory and this number is still increasing.

    3. Use the jmap -heap <pid> command to monitor the Java heap usage of this program (you can use your PID to replace <pid>). 

      Heap Usage:
      PS Young Generation
      Eden Space:
         capacity = 34630848 (33.3125MB)
         used     = 4848624 (4.564041137695312MB)
         free     = 29702224 (28.34845886230469MB)
         14.000426185098354% used
      From Space:
         capacity = 5762896 (5.5MB)
         used     = 0 (0.0MB)
         free     = 5762896 (5.5MB)
         0.0% used
      To Space:
         capacity = 5762896 (5.5MB)
         used     = 0 (0.0MB)
         free     = 5762896 (5.5MB)
         0.0% used
      tenured generation:
         capacity = 90178816 (86.0MB)
         used     = 0 (0.0MB)
         free     = 90178816 (86.0MB)
         0.0% used
      Perm Generation:
         capacity = 12582912 (12.0MB)
         used     = 1869608 (1.7829971313476562MB)
         free     = 10713304 (10.217002868652344MB)
         14.858309427897135% used
      
      760 interned Strings occupying 47136 bytes.
      

      Listing 26

      From the output in Listing 26, we see that the process with PID 3954 had consumed about 128 MB of heap memory and 12 MB of PermGen space, which is less than the result of the prstat output shown in Listing 25. So there must be some native memory leaks in this Java process.

    Exercise 4, Step 3: Use DTrace to Trace the Problem

    Now we will explore how to use the Java HotSpot VM provider to analyze a Java problem and quickly locate the root cause in the Java source code.

    1. In the output of prstat shown in Listing 25, the values in both the SIZE and RSS fields are increasing. Therefore, we need to observe the malloc and mmapcalls of this Java process by using ex4.d:

      pid$1::malloc:entry
      {
         printf("calling malloc(%d) %d times
      ",arg0,malloc++);
      }
      
      pid$1::free:entry
      {
         printf("calling free %d times
      ",free++);
      }
      
      syscall::mmap:entry,
      syscall::munmap:entry
      /pid == $1/
      {
         printf("calling %s with size %d
      ",probefunc, arg1) ;
      }
      
    2. Switch to the root user before trying any DTrace commands:

      ouser@solaris11:~$ su
      Password: 
      root@solaris11:~#
      
    3. Run ex4.d with the PID. You will get output similar to Listing 27:

      root@solaris11:~/dtrace_javanative/DScripts# ./ex4.d 3954
      Tracing... Hit Ctrl-C to end.
      calling malloc(2003000) 0 times
      calling malloc(2003000) 1 times
      calling malloc(2003000) 2 times
      calling malloc(2003000) 3 times
      calling malloc(2003000) 4 times
      calling malloc(2003000) 5 times
      calling malloc(2003000) 6 times
      calling malloc(2003000) 7 times
      calling malloc(2003000) 8 times
      calling malloc(2003000) 9 times
      calling malloc(2003000) 10 times
      ^C
      

      Listing 27

      As you can see in Listing 27, there is no mmap call. This process uses only malloc for memory allocation with a fixed size of about 2 MB and without calling free. Therefore, it is the malloc calls that continuously eat up memory.

    4. Now we will use the ustack function to check the user-level stack when malloc is being called. Refer to the following script (ex4.1.d):

      pid$1:libc:malloc:entry
      {
         @stack_count[ustack(),arg0] = count() ;
      }
      
    5. Run ex4.1.d with the PID. You will get the output similar to Listing 28:

      root@solaris11:~/dtrace_javanative/DScripts# ./ex4.1.d 3954
      ^C
      
      
                    libc.so.1`malloc
                    libjvm.so`__1cCosGmalloc6FI_pv_+0x2d
                    libjvm.so`Unsafe_AllocateMemory+0x17e
                    0xfa20ab12
                    0xfa262960
                2003000                12
      

      Listing 28

      As you can see in Listing 28, all of the malloc calls have a size of about 2 MB and occurred 12 times during the running time of this script. And all of these malloc calls come from Unsafe_AllocateMemory. Therefore, we can easily locate the root cause in the Java source code.

      However, the real world is not that simple. Usually there are lots of malloc calls from different segments of source code. As a result, the output of running ex4.1.d will be unreadable and it will be difficult for us to locate the cause.

    6. Actually we have a better choice—to use the DTrace hotspot provider—which provides probes to track the lifespan of the VM, thread start and stop events, GC memory pool statistics, method compilations, and monitor activity. With a startup flag, additional probes—such as method-entry probes and method-return probes—are enabled that can be used to monitor the running Java program and object allocations.

      Now we will use method-entry and method-return probes for tracking this Java process. These probes can provide a fine-grained examination of Java thread execution. They will fire any time a method is entered or returned. The method-entry and method-return probes take the following arguments:

      • args[0]: The Java thread ID of the thread that is entering or leaving the method
      • args[1]: A pointer to UTF-8 string data that contains the name of the class of the method
      • args[2]: The length of the class name data (in bytes)
      • args[3]: A pointer to UTF-8 string data that contains the name of the method
      • args[4]: The length of the method name data (in bytes)
      • args[5]: A pointer to UTF-8 string data that contains the signature of the method
      • args[6]: The length of the signature (in bytes)

      The following script (ex4.2.d) uses the probes mentioned above to find out which classes and methods makemalloc calls.

      hotspot$1:::method-entry
      / stringof(copyin(arg1,6) )== "native"/
      {
              self->class = stringof(copyin(arg1, arg2 + 1));
              self->class[arg2] = '';
              self->method = stringof(copyin(arg3, arg4 + 1));
              self->method[arg4] = '';
      }
      
      hotspot$1:::method-return
      / self->class != NULL /
      {
         self->class = 0 ;
         self->method = 0 ;
      }
      
      pid$1::malloc:entry
      / self->class != NULL /
      {
         printf("calling malloc(%d) in %s.%s
      ",arg0,self->class,self->method);
         @malloc_size[self->class,self->method] = sum(arg0) ;
         @malloc_dist[self->class,self->method] = quantize(arg0) ;
      }
      
      dtrace:::END
      {
              printf("
      Java malloc byte distributions by class and method,
      
      ");
              printa("   %s, %s, bytes total = %@d %@d
      ", @malloc_size,
                  @malloc_dist) ;
      }
      

      We added / stringof(copyin(arg1,6) )== "native"/ in the predicate field in order to focus on the classes of interest without being distracted by other irrelevant pieces of code.

    7. Run ex4.2.d with the PID. You will get the output similar to Listing 29:

      root@solaris11:~/dtrace_javanative/DScripts# ./ex4.2.d 3954
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      calling malloc(2003000) in nativememory/DataAccess/NativeMemoryManager.createOneMessageSpace
      ^C
      
      Java malloc byte distributions by class and method,
      
         nativememory/DataAccess/NativeMemoryManager, createOneMessageSpace, bytes total = 26039000 
                 value  ------------- Distribution ------------- count    
                524288 |                                         0        
               1048576 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 13       
               2097152 |                                         0        
      

      Listing 29

      As you can see in Listing 29, all of the malloc calls are called by the createOneMessageSpace method of theNativeMemoryManager class.

    Exercise 4, Step 4: Modify the Program and Run It Again

    1. Stop the program by pressing the Enter key in the console that was started in Exercise 4, Step 1.
    2. Use an editor to open the file<lab_root>/dtrace_javanative/exercises/ex4/src/nativememory/DataAccess/NativeMemoryManager.java.

      In this file, we can find that NativeMemoryManager allocates one page of memory space for every 100 records.

    3. Add the code lines highlighted in bold in Listing 30. This code will free a native memory page if all of the 100 records in it are consumed. 

      public class NativeMemoryManager {
          
          public static int recordPerPage = 100;
          public static volatile long nextMessageid;
          public static long[] memoryPagesOffset = new long[1000];
          public static final long recordsize = QueueDataAccess.getObjectSize();
          public static final long requiredHeap = recordPerPage * QueueDataAccess.getObjectSize();
          
          
          
          public static long getMessageAddress(final long messageid){
              if (messageid>nextMessageid-2) return -1;
              long reminder = messageid%recordPerPage;
              int index = (int) ((messageid-reminder)/recordPerPage);
              return memoryPagesOffset[index]+reminder*recordsize;
          }
          
          
          synchronized public static long createOneMessageSpace(){
              long reminder = nextMessageid%recordPerPage;
              int index = (int) ((nextMessageid-reminder)/recordPerPage);
              if(reminder ==0) {
                  long address = Util.unsafe.allocateMemory(requiredHeap);
                  memoryPagesOffset[index]=address;
              }
              
              nextMessageid++;
              return memoryPagesOffset[index]+reminder*recordsize;
              
          }
          
          public static void destroyMessage(final long messageid) {
              long reminder = messageid%recordPerPage;
              int index = (int) ((messageid-reminder)/recordPerPage);
              if (reminder == 0 && index >1) Util.unsafe.freeMemory(memoryPagesOffset[index-1]);
              
          }
          
      }
      

      Listing 30

    4. Repeat the process from Exercise 4, Step 1 through Exercise 4, Step 3 to see the difference.

    Summary

    You have successfully completed the "How to Use DTrace to Troubleshoot Java Native Memory Problems on Oracle Solaris 11" hands-on lab! Starting by learning basic concepts, we used a set of powerful DTrace scripts to solve different Java native memory problems. The usage of predicates, aggregations, and the pidprofile, and hotspot providers were also demonstrated.

    See Also

    Here are some additional resources:

    About the Authors

    Wang Yu presently works for Oracle's ISV Engineering group as an Oracle Solaris (Java) technology engineer and technology architecture consultant. His duties include supporting local independent software vendors (ISVs) and evangelizing about and consulting on important Oracle Solaris and Java technologies.

    Wensheng (Vincent) Liu is a technical consultant in Oracle's ISV Engineering group. His responsibility includes providing technical support to local ISVs and evangelizing about and consulting on important Oracle Solaris and system-related technologies.

    Gang (Gary) Wang, manager of Oracle's ISV Engineering group, leads ISV Engineering teams in Beijing helping ISVs in China, Japan, and Korea to make their applications run best on Oracle Solaris, Oracle servers, and Oracle storage.

    Revision 1.0, 02/25/2014

    Follow us:
    Blog | Facebook | Twitter | YouTube

  • 相关阅读:
    ASP.NET MVC中防止跨站请求攻击(CSRF)
    C#操作JSON学习
    C# 产生随机密码
    博客园上好的技术系列收藏
    OWIN学习
    bzoj1068: [SCOI2007]压缩
    bzoj1012: [JSOI2008]最大数maxnumber
    bzoj1055: [HAOI2008]玩具取名
    bzoj1011: [HNOI2008]遥远的行星
    bzoj1008: [HNOI2008]越狱
  • 原文地址:https://www.cnblogs.com/zengkefu/p/5632990.html
Copyright © 2011-2022 走看看