Category: Java

2011-06-15 13:38:07

© 2001 HEWLETT-PACKARD COMPANY
performance tuning for
Java on hp-ux
getting started
Hewlett Packard Company,
September 25th, 2001
table of contents
INTRODUCTION
USING A METHODICAL APPROACH
AVOID MAKING AN UNINFORMED GUESS AT THE PROBLEM CAUSE
TESTING A SMALLER REPRESENTATIVE SAMPLE OF A BIG APPLICATION
ASSESSING THE SYSTEM AS A WHOLE
DATA GATHERING
HP-UX KERNEL PARAMETERS
GARBAGE COLLECTION
SPACES WITHIN HEAP MEMORY
THREADS, LOCKING AND CONTENTION
GETTING A THREAD DATA DUMP USING KILL -3
USING THE HPJMETER TOOL TO DETECT THREAD LOCKING PROBLEMS
A FINAL WORD ON THREADS
EXPENSIVE METHOD CALLS
MEMORY LEAKS
BENCHMARKING SYMPTOMS
CONCLUSIONS
REFERENCES
  TOOLS
    Performance Analysis Tools
    Load Testing Tools
introduction
This guide is intended to give the reader a set of useful beginning steps in troubleshooting
Java-related performance issues and help with tuning Java for performance on HP-UX.
This document is not a full treatment of the subject, but instead gives a starting point for work in the
area. There are several books and references on the subject of Java performance that are mentioned
in the reference section and which should be consulted for more detail. It is written to be compatible
with the Java Software Developer Kit (SDK) 1.3.1 which contains the HotSpot Runtime Compiler
version 1.3.1, and subsequent versions.
Many of the subjects discussed here are treated in greater detail at
(search for “Java Performance Tuning on HP-UX”)
or more directly
http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,,1602!0!,00.html
In many cases where performance problems have been reported, an investigation into the behavior
of the garbage collector has shown a problem with the heap configuration for a particular JVM run.
On other occasions, the application’s thread locking, or a memory leak, has caused the problem. We
would advise the beginner that a combination of the following approaches
• use of the Glance/gpm tool, to highlight bottlenecks;
• analysis of the output of the -Xverbosegc (verbose report from the garbage collector) JVM option;
• analysis, using the HPjmeter tool, of the output from the -Xeprof (extended profiling) JVM option;
• walking through several outputs from the kill -3 command,
will generate useful data to perform a thorough initial analysis of many problems. This paper will go
into these areas further.
The four actions described above, along with a full description of the hardware setup, operating
system and JVM version and options will be the first steps for the performance engineer who is
embarked on resolving a problem.
Glance/gpm is an HP-UX add-on tool that is found in the Applications CD and in the Enterprise
and Mission-critical bundles for HP-UX. Other tools, such as “top”, “vmstat”, “netstat” and “sar”
provide similar performance information. These are documented in the book entitled “HP-UX
Performance Tuning” by Sauers and Weygant [Sauers]. For the purposes of this paper, we will
concentrate on using Glance/gpm for system and process level monitoring.
using a methodical approach
There are many variables at play in Java performance analysis. Everything from the HP-UX kernel
parameters, to the options specified at runtime to the JVM, to the design of the application, has an
effect. For this reason, it is important to analyze a performance problem methodically from the top
down (from the outermost system view) and to consider all possibilities on the first approach. By a
top-down approach we mean consideration of the system as a whole first (as a collection of one or
more computers acting in concert, with networking effects between them and possibly database
access effects at certain points).
Having considered all variables in the entire system, we then drill down to a single computer and
further to a single process within that computer as the culprit for the problem. The reason for doing
this top-down analysis is that the cause of a performance problem can lie anywhere in the collection
of programs, computers, data sources and networks that make up the full system. There are tools to
help us with analysis of each computer in turn – one of the most powerful is the HP-UX tool
Glance/gpm, which will be mentioned again later in this document.
In the past, analysis work done by the R & D lab has found that the cause of a performance
bottleneck can lie in non-Java technology. The performance engineer must consider this at all times.
The interchanges of data between the computers in the system and between the computers and their
external data sources should be examined. This can be done using network and database monitoring
tools such as the “netstat” tool.
The steps in the recommended process are to:
• assess the overall system configuration, throughput and loading;
• measure the performance (using tools described later in this document);
• analyze the data from the performance measurement tools;
• identify one or more possible bottlenecks;
• change or tune only one item at a time;
• measure the performance again to detect changes resulting from that tuning step.
We go into some detail about these steps in further sections of this document. It is important also to
consider the data from more than one tool in the analysis process and to cross-check the output
between them. For example, the thread information produced by Glance/gpm should be consistent
with the thread data seen when a “kill -3” signal is sent to the JVM process.
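As one further source for this kind of cross-check, a JVM can report on its own threads from inside the process. The following is a minimal sketch, assuming Java 8 or later (the call Thread.getAllStackTraces() does not exist in SDK 1.3.1); the class name ThreadSnapshot is purely illustrative. It counts live threads by state, so that the total can be compared with the thread counts reported by the other tools.

```java
import java.util.EnumMap;
import java.util.Map;

/** Illustrative helper: counts the live threads of this JVM, grouped by thread state. */
final class ThreadSnapshot {

    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // getAllStackTraces() returns one map entry per live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<Thread.State, Integer> counts = countByState();
        int total = 0;
        for (Map.Entry<Thread.State, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
            total += e.getValue();
        }
        System.out.println("total live threads: " + total);
    }
}
```

A large number of threads in the BLOCKED state would be one hint worth checking against the lock contention symptoms discussed later in this document.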
It is also very useful to identify a single “unit of work” or “user transaction” that has a completion
time and is easily measurable, in order to gauge progress in performance tuning. This might be the
number of transactions per second put through a collection of software.
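A measurement of this kind can be sketched in a few lines. The example below is only an illustration, assuming Java 8 or later; the class ThroughputMeter, its methods and the placeholder workload are hypothetical names, not part of any HP tool. It times a fixed number of repetitions of one unit of work and reports transactions per second.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for timing a repeatable "unit of work". */
final class ThroughputMeter {

    /** Converts a completed-transaction count and an elapsed time into transactions per second. */
    static double transactionsPerSecond(long transactions, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            throw new IllegalArgumentException("elapsed time must be positive");
        }
        return transactions * (double) TimeUnit.SECONDS.toNanos(1) / elapsedNanos;
    }

    /** Runs the unit of work the given number of times and returns the measured rate. */
    static double measure(Runnable unitOfWork, int repetitions) {
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; i++) {
            unitOfWork.run();
        }
        // Guard against a zero reading on very coarse clocks.
        long elapsed = Math.max(1L, System.nanoTime() - start);
        return transactionsPerSecond(repetitions, elapsed);
    }

    public static void main(String[] args) {
        // Placeholder workload: replace with one real user transaction.
        Runnable placeholder = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1_000; i++) {
                sb.append(i);
            }
        };
        System.out.printf("%.1f transactions/second%n", measure(placeholder, 10_000));
    }
}
```

Recording this one figure before and after each tuning step gives a simple, repeatable gauge of progress.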
High throughput levels are not always the source of a performance issue. Many production systems
run with a high volume of transactions flowing through them constantly. The real question for the
performance engineer is “Where are the bottlenecks in the process and how can they be acted on?”
In order to find these bottlenecks, it is sometimes necessary to place a load on the system which
represents the type of loading at peak user levels, when the system will be very busy. There are tools
for achieving this high loading on web based applications, which are given in the Tools section at the
end.
avoid making an uninformed guess at the problem cause
It is very tempting, especially when a project team is under pressure to solve a problem, to make an
initial stab at the problem cause without analysis and measurement. This is to be avoided. More often
than not, the initial guess we make is misleading.
Instead, it is recommended that performance data be collected from a variety of tools
(Glance/gpm, HPjmeter, sar, the JVM options -Xverbosegc and -Xeprof, and others). Only once the
data from these tools has been analyzed and understood can a reason be given for the symptoms seen.
This analysis takes some time: typically the first hours of the performance engineer’s time are
spent gathering and analyzing this data before making a change to one of the
performance-influencing variables of the system.
It is very important to keep a notebook or journal of some kind, in which every change made is
recorded as it is carried out, along with its resulting effect on performance. The
performance engineer will be tempted to make several changes to the system at once, a course of
action to be avoided. Keeping a written record of the changes applied defends the engineer
against the confusion that can easily arise from multiple changes, multiple results and many outside
influencing factors.
testing a smaller representative sample of a big application
A standard troubleshooting technique in computer programming is to boil the problem down to the
smallest possible piece of code that exhibits the symptom. This is fraught with danger, however, in
performance analysis, as the small sample can frequently behave quite differently to the real
production system. It is recommended that, as far as possible, the performance analysis be done on the
“real” system, not a manufactured subset or representation of the larger system. This may cost time
for testing, but will pay off in the accuracy of the results achieved.
It is a fact that measurement tools, when looking at a system, influence performance themselves.
Glance/gpm, for example, is a process on an HP-UX system that will take up some of the CPU
cycles on that machine. What we look for in such a tool is the lowest possible intrusiveness we can
get. However, this intrusion is always present when we introduce a measurement tool.
assessing the system as a whole
Many web-based architectures today may have one set of computers for the web server tier, another
set of computers for the application server tier (hosting a layer of J2EE middleware, such as HP’s
Total-E-Server) and a database tier with Oracle. All of these are networked together and have
interactions for each customer request (initiated via a browser, for example).
The key tools for doing an overall system assessment are the Glance/gpm (graphical process
monitoring) tool and its equivalents (e.g. HP PerfView) which look at behavior of the operating
system and processes within an individual computer. Glance/gpm will give the engineer a visually
intuitive assessment of the state of various parts of that computer, such as its CPU load, its
input-output loading, its network traffic and its memory consumption rate.
Using the Glance/gpm tool on each HP-UX computer allows the engineer to see which one may be
under intense load and which is not, within the collection of machines that participate in the overall
system. The picture below shows the topmost view of the computer in the Glance/gpm window and
we can see that this particular computer has very high levels of “System time” being consumed and
very low levels of “User” time being used. This indicates a problem with this computer. We really
want higher CPU consumption in “User” time – which is time spent executing the application, rather
than in “System” time, which is CPU time spent executing operating system functions.
The cause of such a high rate of “System” time being used on a machine may be due to the operating
system calls, which deal with locking data structures called “mutexes”, taking a long time to resolve
conflicts – or the threads which are waiting for access to these structures being put to sleep. The
condition shown below could therefore be caused by thread lock contention in Java programs, a
topic we will delve into later in this document.
For now, we know that too much time is being spent in the OS relative to the application code. The
next step is to investigate what each process on the machine is doing in detail.
Figure 1 : The main window of the Glance/gpm tool on HP-UX
Drilling down from the overall system view, to an individual computer and further to a unique
process is an exercise in using the Glance/gpm tool, along with the other HP-UX performance
analysis tools. These are described in detail in reference 1, [Sauers].
data gathering
The first action that we take in performance analysis is to write down the machine setup for each
computer that is in the path of execution of the user transaction (and thus is suspect). The
information we need is
• the number of CPUs in the computer and the speed of those CPUs (the number of CPUs can
be seen in Glance/gpm)
• the amount of main memory (also visible in Glance/gpm)
• the amount of disk space (occupied and available, from the output of the “df” command)
• the values of the operating system tunable parameters (visible using the HPjconfig tool)
• the patches which should be applied to the operating system (HPjconfig also helps here)
• the version of the JVM being used (can be found using the “java -version” command)
• the options to the JVM being used (e.g. -Xms, -Xmx or others).
An example of the Java version information, produced as output by the “java -version” command, is
contained in Listing 1, below.
java version "1.3.1.00-release"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1.00-release-010607-16:53-PA_RISC1.1)
Java HotSpot(TM) Server VM (build 1.3.1 1.3.1.00-release-010607-19:35-PA_RISC2.0 PA2.0, mixed mode)
Listing 1. The version identification of the Java runtime produced by the “java -version” command
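Some of the items in the checklist above can also be read from inside a running JVM. The following is a minimal sketch, assuming Java 8 or later (Runtime.availableProcessors() and the java.lang.management package are not present in SDK 1.3.1); the class name JvmSetupReport is illustrative only. It reports the Java version, the VM name, the operating system, the number of processors visible to the JVM and the options the JVM was started with.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Illustrative report of the JVM and machine facts that are visible from inside the process. */
final class JvmSetupReport {

    static String describe() {
        Runtime rt = Runtime.getRuntime();
        // Options passed to the JVM itself (for example -Xms or -Xmx), not program arguments.
        List<String> jvmOptions = ManagementFactory.getRuntimeMXBean().getInputArguments();

        StringBuilder sb = new StringBuilder();
        sb.append("java.version = ").append(System.getProperty("java.version")).append('\n');
        sb.append("java.vm.name = ").append(System.getProperty("java.vm.name")).append('\n');
        sb.append("os.name = ").append(System.getProperty("os.name")).append('\n');
        sb.append("os.arch = ").append(System.getProperty("os.arch")).append('\n');
        sb.append("processors = ").append(rt.availableProcessors()).append('\n');
        sb.append("jvm options = ").append(jvmOptions).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

CPU speed, disk space, kernel tunable parameters and patch levels are not visible this way and still need the operating system tools named above.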
Each of the above values helps the performance engineer to gauge the performance
characteristics of Java programs on that machine. Several performance issues can be solved by
simply upgrading to the latest version of the Java SDK, updating the operating system to have
the latest patches that apply, or using the correct options to the Java runtime.
The performance engineer is advised to upgrade to the latest version of the Java SDK and the
latest patches, to establish whether or not these improve the situation. The HPjconfig tool,
described in the next section (“hp-ux kernel parameters”), can also check that the correct
HP-UX patches have been applied.
Listing 2 shows a misuse of the Java Virtual Machine. This example uses the older, less
optimized version of the Java runtime (“classic”), which is not recommended for server-side,
long-running applications. The default Java runtime (“HotSpot”) is the recommended one to use in
this case.
$ java -classic ClassName
Listing 2. Using an older, less efficient version of the Java runtime – this is to be avoided.
hp-ux kernel parameters
It is vitally important that the operating system be set up correctly for running your Java programs in
an optimal way. Several of the HP-UX tunable parameters contribute to this correct setup.
HPjconfig, which is itself a Java program, is the tool to use to find the best values for those
tunable parameters. HPjconfig is a free program and is downloadable at
.
HPjconfig is invoked as shown in Listing 3.
$ java -cp HPjconfig.jar:HPjconfig_data.jar HPjconfig
Listing 3. Using the HPjconfig tool
The tool examines the model of computer it is run on, then allows the performance engineer to
decide from a collection of different application types, such as “Application Server”, “Web Server”
and “WAP Server”. It then makes informed recommendations on the values that should be applied
to certain HP-UX kernel tunable parameters. It can also store its output data in a file. These
recommended values are then applied to the operating system using the SAM system administration
tool, by the superuser or system manager. Figure 2 shows the initial screen for HPjconfig. At this
point, the tool checks that the patches applied to the current HP-UX installation are appropriate for
Java.
Figure 2 : The main window of the HPjconfig tool on HP-UX
Figure 3 shows an example of the recommended kernel parameter values for a particular application
type, as seen in the HPjconfig tool.
Figure 3 : The HP-UX Kernel Parameters window of the HPjconfig tool.
Further detail on these parameters is available at the HP developer solution partner portal site,
http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,,1602!0!,00.html
It is important to note that the values shown in the “Recommended Tuned Values” column in the
HPjconfig screen above are just recommendations. When a computer has more than one application
process or database process running on it, the values that are optimal for the HP-UX kernel tunable
parameters will need to be balanced for the Java programs and the other programs that share that
machine.
The particular HP-UX kernel tunable parameters that are of most importance for Java programs are:
• max_thread_proc - determines the maximum number of threads allowed in any one process;
• maxdsiz - determines the size of the data segment for any process;
• maxfiles - specifies a soft limit on the maximum number of open files for any process;
• maxfiles_lim - specifies a hard limit on the maximum number of open files for any process;
• ncallout - determines the maximum number of pending timeouts associated with I/O;
• nfile - determines the maximum number of open files on the computer system;
• nkthread - determines the maximum number of kernel threads supported by the computer system;
• nproc - specifies the maximum number of processes allowed on the computer system.
garbage collection
Programs in Java require objects, which take up memory to do their work. Those objects are created
both at the beginning of the program lifetime and possibly throughout its life. When a new object is
created in the Java language, the memory allocated for it is placed in an area called the heap.
This heap memory is of limited size. Its maximum size is set at 128 megabytes for any JVM run,
unless the user has adjusted the maximum using the JVM options -mx or -Xmx.
The following command will run the JVM with the default value for the heap size (no options
specified).
$ java ClassName
(where ClassName is the beginning class for the application)
Listing 3. Java executing with the default value of 128Mb maximum heap size
The command below limits this second JVM run to consume at most 240 megabytes for the heap.
$ java -Xmx240m ClassName
Listing 4. Java executing with a 240Mb maximum heap size
If this second program were to try to allocate memory for objects (create new objects) which
consumed more than 240Mb, then the JVM would throw an error called “OutOfMemoryError”
(java.lang.OutOfMemoryError) and the program’s execution would normally stop.
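The limit in force for a given run can be checked from inside the program. The sketch below is an illustration only, assuming Java 8 or later (Runtime.maxMemory() does not exist in SDK 1.3.1); the class name HeapReport is hypothetical. The value returned by maxMemory() generally tracks the -Xmx setting but need not equal it exactly, and how it relates to the 128 megabyte default quoted above is not something this sketch can confirm.

```java
/** Illustrative report of the heap limits seen by the running JVM. */
final class HeapReport {

    static final long BYTES_PER_MB = 1024L * 1024L;

    /** Converts a byte count to whole megabytes, rounding down. */
    static long toMegabytes(long bytes) {
        return bytes / BYTES_PER_MB;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();          // most memory the JVM will attempt to use
        long committed = rt.totalMemory();  // memory currently reserved for the heap
        long used = committed - rt.freeMemory();

        // maxMemory() returns Long.MAX_VALUE when there is no inherent limit.
        System.out.println("maximum heap  : " + toMegabytes(max) + " Mb");
        System.out.println("committed heap: " + toMegabytes(committed) + " Mb");
        System.out.println("used heap     : " + toMegabytes(used) + " Mb");
    }
}
```

Running it once with no options and once with -Xmx240m shows the effect of the option on the reported maximum.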
The Java programmer, unlike the C or C++ programmer, does not deallocate or free this heap
memory explicitly (as the C programmer would do using “free()”, or the C++ programmer using the
“delete” operator).
This is done by the JVM itself rather than by the programmer. The JVM detects that the program is
no longer using that object by ensuring that no part of the program has a reference to it. This
cleaning up of objects which are no longer in use is referred to as “garbage collection” and may be
viewed as a separate independent activity that happens at intervals in the life of the JVM. Those
intervals are determined by a need for more heap space, among other factors.
The significance of this memory management strategy is that once it has been decided that the
garbage collector part of the JVM should start, all other threads which are executing Java byte
codes, in interpreted or compiled form, must come to a safe point and stop for the duration of the
garbage collection process. This causes a dramatic effect on the performance of a Java program, if
it lasts for any significant period of time.
For this reason, it is always important to study the behavior of the garbage collector in analyzing a
Java program’s performance.
Figure 4 shows the momentary effect of a garbage collection activity arising in the life of a single
JVM executing on an eight CPU computer. As we see, only one CPU is doing any serious work and
that is the garbage collector itself. The remaining CPUs are virtually unused.
Figure 4: A garbage collection as seen in the CPU screen in Glance/gpm tool on HP-UX on an 8 CPU system.
Excessive garbage collection activity is therefore a big cause of slowdowns in Java programs and
should be acted on by assessing and changing the options to the JVM that affect the heap, namely
the -Xms, -Xmx, -Xmn and -XX:SurvivorRatio options. The first step, however, is to detect
overactivity on the part of the garbage collector within the JVM.
We have a good clue that garbage collection activity may be frequent in the Glance/gpm main
window. A spiky CPU behavior pattern, as shown in Figure 5 below, can indicate that we have
unwanted activity in the garbage collector. Figure 5 indicates that the user program time goes
through large drops when the garbage collector takes over the CPUs from it.
Figure 5 : Spiky CPU behavior in the Glance/gpm tool on HP-UX
This detection work can be easily carried out with Java on HP-UX, since we have the “-Xverbosegc”
option to the JVM, which is used as follows.
$ java -Xverbosegc:file=myfile.out ClassName
Listing 5. running Java with -Xverbosegc
This causes the JVM to produce detailed garbage collection data in the file “myfile.out”, which is just
an example file name in this case.
NOTE : This option has low performance intrusion on the JVM at runtime (apart from that caused
by the writing of the output data to the disk) and so can be used on a production application, if
necessary.
The output data (in the file “myfile.out” in the above example) is tough to read in its original form as
it looks like the following:
1206120 50331648 4793888 4793888 4980736 0.124737 >
1206120 5058264 50331648 5032840 5032840 5242880 1.896397 >
6780520 50331648 5242688 5242688 5242880 0.870284 >
6780520 50331648 8720800 8720800 8916992 0.055405 >
Listing 6. Output from the “-Xverbosegc” option to the JVM
There are nineteen columns of data in this output file, which describe various timings and regions
within the garbage collector’s heap consumption. The recommended action to take on this output is
to download the “processVerboseGC.awk” file from the performance tuning website:
http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,1602,00.html
and apply that script as follows:
$ cat myfile.out | awk -f processVerboseGC.awk > output.txt
Listing 7. Running the processVerboseGC.awk script on a file containing -Xverbosegc output
This command produces a far more user friendly output file, an example of which is given in listing 8
below:
GC: Full GC required - reason: Old generation expanded on last scavenge
GC: Full 1.985317 s since last: 1.985317 s gc time: 96 ms
eden: 1834928->0/3670016 survivor: 120576->0/262144
tenure: 2 old: 3204992->2356088/3928064
GC: Scav 2.190710 s since last: 0.205393 s gc time: 4 ms
eden: 3669968->0/3670016 survivor: 0->120720/262144
tenure: 32 old: 2356088->2356088/3928064
Listing 8. Output from running the processVerboseGC122.awk script on a file of -Xverbosegc output
The data above describes consecutive garbage collection events in the life of this program. The
time at which each one occurred, measured from program start, is given in seconds after the word
“Full” or “Scav”. Even before we understand the meaning of the terms “eden”, “survivor”,
“tenure” and “old” (which will be explained below) we can extract a lot of information from the
above. Firstly, a “Full” garbage collection (GC) event is always more expensive in time consumed
than a “Scavenge” or “Scav” garbage collection. If we see a lot of “Full GC” entries, for example
one or more every minute, then we will need to take some action.
Should we see an entry in this data that has the content:
“Full GC required – reason : call to System.gc() or Runtime.gc()”
then we know that either the user’s Java code or some library or infrastructure code being used in the
application is making an explicit call to the garbage collector. This should never be done. We can
cure this problem by removing the call to System.gc() or Runtime.gc() from the code, or by
overriding it with the JVM option -XX:+DisableExplicitGC, as follows:
$ java -XX:+DisableExplicitGC ClassName
Listing 9. Running Java with an option to disable explicit calls to the garbage collector.
Secondly, we can see the time since the program started immediately after the “Full” or “Scav”
strings. This time (in seconds) accumulates over the life of the program and gives us a view of how
far into its execution this event occurred.
The “since last” string is followed by the amount of time since the last GC activity. The “gc time”
string indicates how long this particular GC event took to complete. We pay particular attention to
these last figures. In general terms, a healthy JVM will do fewer Full GCs than Scavenge GCs. The
Scavenge GCs will be separated from each other by about 0.5 seconds or more and will take no
longer than 300-400 milliseconds to complete.
If Scavenge GCs are taking a minute or more to complete, then we know that the program is
suspending the execution of those threads which are executing Java byte codes for at least a minute
and we have a performance problem. We should consider adjusting the JVM options that control the
heap memory usage to cure this problem. This is done by changing the values associated with the
JVM options -Xms, -Xmx and -Xmn described below. These values should be changed once we
understand what area of the heap memory is being overused or underused. This will be described in
the next section.
Full GCs can take a minute or more depending on the size of the heap space they have to perform
their collection over. In general, it is a good practice to configure the JVM to run such that Full GCs
can complete in a minute or less and are a relatively rare occurrence compared to Scavenge GCs. A
Full GC occurring every five to ten minutes would seem to indicate a healthy JVM run, along with
smaller, Scavenge GCs every 5-10 seconds, each having a short duration (300-400 milliseconds).
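The rules of thumb above lend themselves to an automated first pass over the script output. What follows is a minimal sketch, assuming Java 8 or later and assuming that each event summary line has exactly the layout shown in Listing 8; the class GcLineCheck, its methods and the warning texts are illustrative and not part of any HP tool. The thresholds (400 milliseconds and 0.5 seconds for a Scavenge GC, one minute for a Full GC) are taken from the guidance in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative checker for the "GC:" summary lines shown in Listing 8. */
final class GcLineCheck {

    // Matches e.g. "GC: Scav 2.190710 s since last: 0.205393 s gc time: 4 ms"
    private static final Pattern EVENT = Pattern.compile(
            "GC:\\s+(Full|Scav)\\s+(\\d+(?:\\.\\d+)?)\\s+s\\s+since last:\\s+"
            + "(\\d+(?:\\.\\d+)?)\\s+s\\s+gc time:\\s+(\\d+)\\s+ms");

    static final class GcEvent {
        final boolean full;
        final double atSeconds;
        final double sinceLastSeconds;
        final long gcMillis;

        GcEvent(boolean full, double atSeconds, double sinceLastSeconds, long gcMillis) {
            this.full = full;
            this.atSeconds = atSeconds;
            this.sinceLastSeconds = sinceLastSeconds;
            this.gcMillis = gcMillis;
        }
    }

    /** Returns the parsed event, or empty if the line is not an event summary. */
    static Optional<GcEvent> parse(String line) {
        Matcher m = EVENT.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new GcEvent(
                "Full".equals(m.group(1)),
                Double.parseDouble(m.group(2)),
                Double.parseDouble(m.group(3)),
                Long.parseLong(m.group(4))));
    }

    /** Applies the rules of thumb from this section to one event. */
    static List<String> warnings(GcEvent e) {
        List<String> out = new ArrayList<>();
        if (e.full) {
            if (e.gcMillis > 60_000) {
                out.add("Full GC took longer than a minute");
            }
        } else {
            if (e.gcMillis > 400) {
                out.add("Scavenge took longer than 400 ms");
            }
            if (e.sinceLastSeconds < 0.5) {
                out.add("Scavenge began less than 0.5 s after the previous GC");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] sample = {
            "GC: Full 1.985317 s since last: 1.985317 s gc time: 96 ms",
            "GC: Scav 2.190710 s since last: 0.205393 s gc time: 4 ms"
        };
        for (String line : sample) {
            parse(line).ifPresent(e -> System.out.println(line + " -> " + warnings(e)));
        }
    }
}
```

A single warning is rarely significant on its own, especially early in a run; it is a sustained pattern of warnings across the whole output file that deserves attention.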
spaces within heap memory
This section goes into some depth on the subject of heap management. By understanding the
placement of new and old objects within the JVM’s overall heap memory, we can see how well or
badly that space is being used. We can then take action by means of the JVM options -Xms, -Xmx
and -Xmn to change the behavior.
Figure 6, below, is a picture of the overall heap memory layout within the JVM.
Figure 6: The structure of the heap memory used for objects in Java
In Figure 6 we see that there are four “spaces”, sometimes called “arenas”, within the overall heap
memory. The design intention of this “generational” type of garbage collection is that shorter lived
objects should spend all of their lifetime in the new arena and be garbage collected using a fast
algorithm. Long-lived objects should find their way into the “old” arena and remain there until a
slower garbage collection algorithm finds them and possibly cleans them out, if they are no longer in
use.
We do not show the remaining “Permanent” space within the heap space in Figure 6 above, as we
will not discuss it in this paper. We mention it here for completeness. The Permanent space in the
heap is used for storage of reflection objects, among others. Its default size is 64 megabytes in the
Java SDK 1.3.1. This default maximum size, added to the default New and Old area maximum sizes
(16 and 48 Mb respectively), gives the full heap default maximum size of 128 megabytes, as
mentioned previously.
Scavenge GCs are done in the “new” arena and are intended to be fast. Full GCs cover both “new”
and “old” arenas and thus are slower than scavenges.
The default start size for the “new” arena is 2 Mb. The default maximum size for the “new” arena is
16 Mb.
The default start size for the “old” arena is 4 Mb. The default maximum size for the “old” arena is 48
Mb, as indicated in the shaded areas above. These sizes can be changed using the
• -Xms (heap start size),
• -Xmx (maximum heap size),
• -Xmn (size of the new arena) and
• -XX:SurvivorRatio= (size of the Eden area divided by the size of the “from” or “to” area;
the two are equal in size)
JVM options. Each of the first three options above may be followed by a figure and the suffix “m”,
signifying the number of megabytes to assign. For example, -Xmn256m signifies allocation of a new
arena of size 256 megabytes.
One recommendation is that the -Xmn value should be one-third of the -Xmx value. For
aggressive tuning situations, it is possible to set the value given with -Xmn to one-half of the
-Xmx value.
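The arithmetic behind these options can be sketched as follows. This is an illustration only, assuming that the new arena consists of exactly Eden plus the two equally sized survivor spaces, so that new = Eden + 2 x survivor and Eden = ratio x survivor. The class NewArenaSizing is a hypothetical name, and the 768 Mb heap and the survivor ratio of 6 are example figures chosen to give round numbers, not recommended or default values.

```java
/** Illustrative arithmetic for the -Xmn and -XX:SurvivorRatio options. */
final class NewArenaSizing {

    /** Suggested -Xmn value (in Mb) under the one-third rule. */
    static long oneThird(long maxHeapMb) {
        return maxHeapMb / 3;
    }

    /** Suggested -Xmn value (in Mb) for aggressive tuning. */
    static long oneHalf(long maxHeapMb) {
        return maxHeapMb / 2;
    }

    /** Size of each survivor space: new = eden + 2 * survivor and eden = ratio * survivor. */
    static long survivorMb(long newArenaMb, int survivorRatio) {
        return newArenaMb / (survivorRatio + 2);
    }

    /** Size of Eden: whatever the two survivor spaces leave over. */
    static long edenMb(long newArenaMb, int survivorRatio) {
        return newArenaMb - 2 * survivorMb(newArenaMb, survivorRatio);
    }

    public static void main(String[] args) {
        long maxHeapMb = 768;   // example -Xmx value, chosen for illustration
        int ratio = 6;          // example -XX:SurvivorRatio value, chosen for illustration

        long newMb = oneThird(maxHeapMb);
        System.out.println("-Xmx" + maxHeapMb + "m -Xmn" + newMb + "m");
        System.out.println("aggressive: -Xmn" + oneHalf(maxHeapMb) + "m");
        System.out.println("eden: " + edenMb(newMb, ratio) + " Mb, each survivor space: "
                + survivorMb(newMb, ratio) + " Mb");
    }
}
```

With these example figures the one-third rule yields the -Xmn256m setting used as the example above, split into a 192 Mb Eden and two 32 Mb survivor spaces.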
The “Eden” subspace within the “new” arena is that area of memory into which new objects are
placed when they are first created. This is a very important space, because all user defined objects are
born at some time in the life of the program – and may live for a long time or not. Here is an
example of a Java program creating a new object of the class Panel – a common user interface class.
This new object will live for some time in the “Eden” subspace, until it is copied elsewhere, perhaps
into the “Old” space, if it is used for a long time in the program.
Panel p = new Panel();
Listing 10. Creating a new object of the class Panel
Naturally, the Eden space can fill up over time, as more and more new objects are placed into it.
When Eden becomes full, then a Scavenge GC starts and causes those objects, which have existing
references to them, to be copied out of Eden into the “to” space.
The “to” space is one of the two “survivor” spaces, which are there to hold in-use objects for some
period, delaying their entry into the “old” arena, until they have “survived” a number of GC events.
The other “survivor” space is called the “from” space.
Objects which have no references to them are removed from Eden at this point, causing the space
they formerly occupied to be made available. These are the objects which have been garbage
collected.
Any objects that may have been waiting in the “from” subspace from previous GC events are also
copied into the “to” subspace, and then the two spaces swap their names: “from” becomes “to” and
vice versa.
Objects are copied back and forth between “from” and “to” up to a certain copying limit, set by the
“MaxTenuringThreshold” variable, seen above at its default value of 32. This means that objects may
“survive” in the new space for up to 32 Scavenge GCs and 32 from-to renaming activities, before
they are finally “tenured” and placed into the old arena. This old arena is intended for long lived
objects during a program’s lifetime.
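The copying scheme described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class ScavengeModel, its fields and its methods are hypothetical names invented for illustration, objects are represented by a simple lifetime counter, and nothing here reflects the actual internals of the HP JVM. It also assumes a modern JDK rather than the SDK 1.3.1 discussed in this paper.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the new arena; hypothetical names, not JVM internals. */
class ScavengeModel {
    /** A simulated object: how long it stays referenced and how often it was copied. */
    static final class Obj {
        final int lastLiveGc; // last scavenge number at which the object is still referenced
        int age = 0;          // number of scavenges survived so far
        Obj(int lastLiveGc) { this.lastLiveGc = lastLiveGc; }
    }

    final int maxTenuringThreshold;
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    int gcCount = 0;

    ScavengeModel(int maxTenuringThreshold) {
        this.maxTenuringThreshold = maxTenuringThreshold;
    }

    /** New objects always start in Eden and stay referenced for the given number of scavenges. */
    void allocate(int scavengesReferenced) {
        eden.add(new Obj(gcCount + scavengesReferenced));
    }

    /** One Scavenge GC: copy referenced objects to "to" (or tenure them), then swap names. */
    void scavenge() {
        gcCount++;
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        eden.clear();
        from.clear();
        for (Obj o : candidates) {
            if (o.lastLiveGc < gcCount) {
                continue; // no references left: the space is simply reclaimed
            }
            o.age++;
            if (o.age > maxTenuringThreshold) {
                old.add(o); // copied often enough: tenured into the old arena
            } else {
                to.add(o);  // still young: held in the survivor space
            }
        }
        List<Obj> swap = from; // "from" becomes "to" and vice versa
        from = to;
        to = swap;
    }

    public static void main(String[] args) {
        ScavengeModel heap = new ScavengeModel(2);
        heap.allocate(0);  // temporary object
        heap.allocate(1);  // medium-lived object
        heap.allocate(10); // long-lived object
        for (int i = 1; i <= 3; i++) {
            heap.scavenge();
            System.out.println("GC " + i + ": survivors=" + heap.from.size()
                    + " old=" + heap.old.size());
        }
    }
}
```

With a low threshold such as 2, the long-lived object reaches the old arena after only three scavenges, which is the behavior that a high MaxTenuringThreshold is meant to delay. The model never collects the old arena, since full GCs are outside its scope.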
Now we can understand the output from the –Xverbosegc a little more easily. The entry in Listing 8
above that reads:
eden: 1834928->0/3670016
Listing 11. Eden size figures from the processVerboseGC122.awk output
indicates the size of the Eden subspace which was occupied before the current GC event (the
number before the arrow), the size consumed in Eden after the event (the number after the arrow)
and the size of Eden as a whole (the figure after the “/” sign). This entry reads as follows:
• New objects taking up 1834928 bytes occupied the Eden space before this GC activity.
• Those objects were completely removed from Eden (and placed elsewhere), as part of this
GC action, since the occupancy of Eden after the GC is zero (the figure after the arrow).
• The total size of the Eden space is 3670016 bytes after the GC has finished.
The same reading technique can be applied to the survivor and old spaces. In this case, the term
“survivor” refers to the sum of the subspaces named “from” and “to”.
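The reading technique above can also be applied mechanically. The following is a minimal sketch that parses one entry of the form shown in Listing 11; the class GcSpaceEntry and its members are hypothetical helpers written for this illustration (they are not part of any HP tool), and the sketch assumes that every entry follows the "name: before->after/capacity" layout.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Parses one "name: before->after/capacity" entry (hypothetical helper, not an HP tool). */
class GcSpaceEntry {
    private static final Pattern ENTRY =
            Pattern.compile("\\s*(\\w+):\\s*(\\d+)->(\\d+)/(\\d+)\\s*");

    final String space;   // e.g. "eden"
    final long before;    // bytes occupied before the GC event
    final long after;     // bytes occupied after the GC event
    final long capacity;  // total size of the space after the GC event

    GcSpaceEntry(String space, long before, long after, long capacity) {
        this.space = space;
        this.before = before;
        this.after = after;
        this.capacity = capacity;
    }

    static GcSpaceEntry parse(String entry) {
        Matcher m = ENTRY.matcher(entry);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized entry: " + entry);
        }
        return new GcSpaceEntry(m.group(1), Long.parseLong(m.group(2)),
                Long.parseLong(m.group(3)), Long.parseLong(m.group(4)));
    }

    /** Bytes that left the space during this GC (either collected or copied elsewhere). */
    long released() {
        return before - after;
    }

    public static void main(String[] args) {
        GcSpaceEntry e = parse("eden: 1834928->0/3670016");
        System.out.println(e.space + " released " + e.released()
                + " of " + e.capacity + " bytes");
    }
}
```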
The clearing out of the Eden subspace on each Scavenge GC is a normal behavior for the garbage
collector. However, if we were to see that the Survivor figure (“from” and “to”) goes to zero after
each Scavenge GC, we would know that something is wrong. The objects are not being copied back
and forth between “from” and “to” subspaces often enough. Instead, they are being placed into the
Old space faster than they should be. This action will cause the Old space to eventually fill up with
non-long-lived objects. This symptom is called “overflow” and can be detected by finding a sequence
of GC occurrences with the MaxTenuringThreshold at a low value (2-10).
Should we see a situation where there is not enough space in the new arena to keep objects in it for
the correct length of time, then we can firstly adjust the size of the new area using the -Xmn option.
This will expand the size of all the new subspaces: eden, from and to. We do this as follows.
$ java -Xmn160m -Xmx480m -Xms480m ClassName
Listing 12. Running Java with -Xmn, -Xmx and -Xms to specify the New arena size, max total heap size and
start heap size respectively.
Recommendations on the size of the new arena relative to the overall maximum heap size (specified
by the -Xmx value) have ranged from one third to one half, where the latter is regarded as very
aggressive tuning.
We can also adjust the sizes of the “from” and “to” subspaces in relation to the size of Eden by
using the JVM option -XX:SurvivorRatio=, as follows.
$ java -Xmn120m -Xmx480m -Xms480m -XX:SurvivorRatio=8 ClassName
Listing 13. Running Java with -XX:SurvivorRatio
This causes the Eden subspace to be eight times the size of the “from” or “to” spaces (which are
equal in size). The New arena in the above example would be split into an Eden subspace of 96 MB
(i.e. 8 times 12 megabytes) with a “from” and “to” subspace size of 12 MB each, giving a total of
120 MB.
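The arithmetic behind this split can be written down directly. The sketch below assumes only the relationship stated above (new arena = Eden + “from” + “to”, with Eden equal to the ratio times one survivor space); the class and method names are hypothetical, and a real JVM may round or align the resulting sizes.

```java
/** Arithmetic sketch of the new arena split; hypothetical helper, sizes in megabytes. */
class NewArenaLayout {
    /**
     * Returns {edenMb, survivorMb} for a new arena of newMb megabytes.
     * new = eden + from + to = (survivorRatio + 2) * survivor
     */
    static long[] split(long newMb, int survivorRatio) {
        long survivor = newMb / (survivorRatio + 2);
        long eden = newMb - 2 * survivor;
        return new long[] { eden, survivor };
    }

    public static void main(String[] args) {
        long[] sizes = split(120, 8); // values taken from Listing 13
        System.out.println("eden=" + sizes[0] + " MB, from=to=" + sizes[1] + " MB");
    }
}
```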
It is recommended that the –XX:SurvivorRatio value be left at the default value, which at JDK 1.3.1
is 8. This means that Eden is eight times the size of either of the “from” or “to” subspaces.
It is also recommended that the above actions be taken with a view to reducing the number of full
GCs and maintaining the MaxTenuringThreshold as close to 32 as possible. This
MaxTenuringThreshold value adjusts as the garbage collector progresses and so can vary over the life
of the program. The lower the value becomes, the easier it is for objects to be copied out of the
“from” or “to” spaces and into the Old arena. This is an effect that we would wish to avoid.
Because the JVM can produce a lot of data when it runs with the -Xverbosegc option over long
periods, another technique is to import the base -Xverbosegc output data into a spreadsheet
such as MS Excel and plot a graph of program elapsed time versus rate of growth of the new or old
arenas. This technique is documented in detail at the Developer Solution Partner Portal (DSPP) web
site.

threads, locking and contention
Java is a programming language that allows for the easy creation of threads or asynchronous
procedures within the program. This area can cause problems for programmers who are not
experienced with multi-threaded programs.
The unlimited creation of threads is certainly a bad practice and should be avoided. Many application
server middleware systems depend on the creation of several thousands of threads. HP-UX supports
many threads within one process. The number of such threads is constrained by the value of the
HP-UX kernel parameter “max_thread_proc”. This value can be seen by issuing the “kmtune”
command, as shown in Listing 14 below.
$ kmtune |grep max_thread_proc
Listing 14. Using the HP-UX kmtune command to view the value of a kernel tunable parameter
Fortunately, we have tools that can show us the thread behavior of a program both during its lifetime
and after it has completed. These are Glance/gpm, HPjmeter and the technique known as “kill –3”.
These will be described further below.
Figure 7 shows the data that Glance/gpm gives a user who is concentrating on a
specific Java process (perhaps within a collection of Java processes) and examining all of the threads
that the Java program has created while it is running. The leftmost column of the screen
below shows the TID, or unique thread ID, for each thread.
It should be noted that the JVM itself needs eleven threads to begin with, which are separate from
any threads that are required by the user program code.
Figure 7: Using the process screen in Glance/gpm to see all threads in a running Java program
The dangers for the Java programmer in using threads are
• Too many threads are created, beyond a limit imposed by the operating system. We can get
a “java.lang.OutOfMemoryError” in our Java program, or a message indicating too many
threads. This is an easy problem to solve. The main technique here is to use the HPjconfig
tool on the system to detect the value of the kernel parameter “max_thread_proc” and
change it using the SAM (system administration) tool.
HPjconfig is a free downloadable tool from the HP Java website

• One thread will lock some or all of the other threads out of the CPU by holding a lock for a
long time on a common resource. Figure 8 shows a picture of the issue with one thread
holding a resource which other threads need to make progress.
Lock contention is a measure of how many threads are trying to acquire the lock, and how often they
attempt it. When the contention is high, threads spend time waiting to enter the object monitor
instead of doing useful work.
The -Xeprof option for the HP-UX version of the Java SDK 1.3.1 (or later) gives the user accurate
data on lock contention, which can be interpreted by HPjmeter.
Figure 8: The operating system monitor structure for controlling access to a resource by multiple threads
One thread T1 holds a lock on a resource A which is needed by another thread T2. Thread T2 has a
lock on a resource B which is needed by T1. These threads are in a deadly embrace and will not make
any progress until one of them backs off.
These issues are all related to the application software design. Therefore, they can be solved largely
by changing the structure of the software (involving a recompilation). However, the tools to discover
these issues are at the engineer’s disposal and can find the cause of the problem quickly.
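One common structural change of this kind is to make every thread acquire shared locks in the same order, so that the deadly embrace between T1 and T2 cannot form. The sketch below is illustrative only: LOCK_A and LOCK_B stand in for resources A and B, the class and method names are hypothetical, and the code assumes a modern JDK (it uses a lambda) rather than the SDK 1.3.1 discussed in this paper.

```java
/** Sketch of consistent lock ordering; LOCK_A and LOCK_B stand in for resources A and B. */
class OrderedLocking {
    private static final Object LOCK_A = new Object();
    private static final Object LOCK_B = new Object();
    private static int sharedCounter = 0;

    // Deadlock-prone pattern (not used here): T1 locks A then B while T2 locks B then A.
    // Here every thread takes LOCK_A first and LOCK_B second, so no cycle can form.
    static void updateBothResources() {
        synchronized (LOCK_A) {
            synchronized (LOCK_B) {
                sharedCounter++;
            }
        }
    }

    /** Runs two threads that each update both resources; returns the final counter. */
    static int runTwoThreads(final int iterations) {
        sharedCounter = 0;
        Runnable task = () -> {
            for (int i = 0; i < iterations; i++) {
                updateBothResources();
            }
        };
        Thread t1 = new Thread(task, "T1");
        Thread t2 = new Thread(task, "T2");
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return sharedCounter;
    }

    public static void main(String[] args) {
        System.out.println("final counter = " + runTwoThreads(100000));
    }
}
```

Both threads still contend for the same locks, so this removes the deadlock but not the contention itself; keeping the synchronized region short addresses the latter.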
The first clue that a thread and lock contention issue exists may come from the Glance/gpm
output. Figure 9 shows an image of a system where far too much of the CPU time is being consumed
in “System” time (darker color) rather than “User” time (the lighter colored area). The reverse
should be the case in a healthy system: “User” time should be the bulk of the time occupied by the
CPU, as this is time spent executing application code.
Figure 9: Evidence of higher than normal system time usage as seen in Glance/gpm
This is one indicator that a thread or locking issue may be present. We then use Glance/gpm again to
look at the individual System calls being used by each separate JVM process on the system.
The picture we get of a single process which is undergoing lock contention problems may look
similar to the following one. Here, the numbers of invocations of particular HP-UX operating system
calls, such as “sched_yield”, “ksleep” and “kwakeup”, are particularly high and exceed those of all
other calls, as measured in Glance/gpm by the “Cumulative system call count” column. These calls may
differ from one version of the JVM to another, but the performance engineer should look for all of
them in the list of most frequently made calls. The arrows in Figure 10 which point to the system call
rates for the “send” and “recv” calls indicate that those calls are being made far less frequently than
the “sched_yield” and “ksleep” calls. The “send” and “recv” calls should be much higher, since they
make up the principal work in this network-oriented program.
If we were to solve the evident thread contention problem here (by re-structuring the thread design
around a polling thread and thread pool model, for example) we should see the “send” and “recv”
calls at much higher rates.
Figure 10: High System call rates for certain operating system calls made by a single process – possible thread contention
Such problem situations have arisen in projects where one thread is used for each socket that is open
for network traffic. Where many hundreds or thousands of network connections are being
established concurrently, this can cause very high overhead in thread contention. This problem has
been corrected in past projects by implementation of the HP Poll API, which is a pattern of designing
an application to use fewer threads. This topic is beyond the scope of this paper. In the
Java SDK 1.4 there is also a subsystem which allows for more control of threads in asynchronous
I/O environments.
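The general idea of serving many connections with a small, fixed number of threads can be sketched with a thread pool. This is not the HP Poll API, which is not shown in this paper. It is a minimal illustration using java.util.concurrent, which only became available in releases later than the SDK versions discussed here; the class PooledConnections and the handle() method are hypothetical placeholders for real per-connection work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch: a bounded pool of workers instead of one thread per connection. */
class PooledConnections {
    /** Placeholder for real per-connection work (hypothetical). */
    static int handle(int connectionId) {
        return connectionId * 2;
    }

    /** Processes the given number of simulated connections using poolSize threads. */
    static int handleAll(int connections, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < connections; i++) {
                final int id = i;
                Callable<Integer> task = () -> handle(id);
                pending.add(pool.submit(task)); // queued until a worker is free
            }
            int completed = 0;
            for (Future<Integer> result : pending) {
                result.get(); // wait for this unit of work to finish
                completed++;
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // 1000 simulated connections are served by only 8 threads.
        System.out.println("handled " + handleAll(1000, 8) + " connections");
    }
}
```

The number of threads stays at the pool size however many connections arrive, which keeps the process well below the max_thread_proc limit and reduces the scheduling overhead described above.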
getting a thread data dump using kill -3
This technique for analyzing the threads within a JVM is very simple, but very powerful. Using the
“kill” command with parameters “-3” or “-s SIGQUIT” on a running JVM process will not damage
that running JVM, but will instead cause the JVM to produce a fully detailed dump of all the
information it holds on the threads that are present within it at that moment. Use Glance/gpm, or a simple
“ps -ef | grep java”
command to get the process identity of the JVM we are interested in examining. Then the command
shown in Listing 15 below will produce a full dump of the thread data on the standard output
channel from this program.
# kill -3 3493 (if 3493 were the PID of the running JVM)
Listing 15. Using the kill -3 command on a JVM to cause it to do a thread dump
We may wish to redirect the standard output to a file when we invoke the JVM to start with.
The kill command above may require that the user have superuser privileges. The data produced on
the standard output of the running JVM (which would be required to be redirected to a file, if it were
of any large size) would look somewhat similar to the following sample.
Figure 11: Partial output from a “kill -3” command on a running JVM process
Figure 11 shows that a particular thread (whose lightweight process identity, lwp_id above, is
14165) is in the suspended state, as it has been waiting for some time to lock a particular object
(identified by its Hexadecimal address). This is just a smaller snapshot of a larger dump of data, but
gives the reader the flavor of this output. What we are particularly looking for in this output is one
thread holding a lock on an object which is required to be locked by other threads, for a protracted
period of time. This causes the entire JVM to slow down and is a problem.
We can repeat the “kill -3” action several times throughout the lifetime
of the problem program, to determine whether a continual locking issue is the problem. However, when
dealing with a program which has a large number of threads in it, this can produce a huge amount of
data, and it is very difficult to comb through all of it searching for the problem. Fortunately,
we can use the HPjmeter tool to do some of this work for us.
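Where sending a signal is inconvenient, a similar (though less detailed) snapshot can be taken from inside the program. The sketch below uses Thread.getAllStackTraces(), which was added to Java in a release later than the SDK 1.3.1 covered here, so it is offered only as an assumption about a newer environment. The class name ThreadDumpSketch and the output layout are hypothetical and do not match the format of the JVM's own dump.

```java
import java.util.Map;

/** Sketch: an in-process thread snapshot, loosely similar to a "kill -3" dump. */
class ThreadDumpSketch {
    /** Returns the name, state and stack frames of every live thread as text. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Threads reported in the BLOCKED state are the ones waiting to enter a monitor, which is the condition we are looking for when hunting a long-held lock.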
using the hpjmeter tool to detect thread locking problems
A simpler approach to analyzing the thread behavior of the JVM is to run the program to completion
with the HP special option –Xeprof and then analyze the output of this using the HPjmeter tool.
If the program which is executing within the JVM can be terminated in a clean way (i.e. using a
System.exit() or some graceful shutdown procedure) then the JVM should be run with the extended
profiling option (-Xeprof) as follows.
$ java -Xeprof:file=myfile.eprof ClassName
Listing 16. Running a Java program with the -Xeprof option to perform monitoring on the run.
The tool we will use to analyze the output produced into the file named here (myfile.eprof, for
example) is the HPjmeter profiling tool, which can be downloaded free of charge.

This tool is written in Java itself and therefore can be run on any platform that has a JVM supported.
The tool will read output files generated by the standard JVM option –Xprof. However, the tool
yields most benefit and detail from the output produced by the –Xeprof option, specific to HP’s
JVM.
The command to run the HPjmeter tool (on HP-UX or anywhere that Java is supported) is shown in
Listing 17 below.
$ java -cp /opt/hpjmeter/HPjmeter.jar HPjmeter
Listing 17. Running the HPjmeter tool.
This free tool has many useful metrics, not only those associated with thread lifetimes. These metrics
are described in detail in a tutorial at the download site. We focus in this section of the document on
its use for thread analysis. The HPjmeter tool produces a graphical image of the thread states and
their lifetime within the JVM run. We load the file produced by the –Xeprof output into the tool and
then select the menu item named “Metric” for threads histogram to see the following screen.
Figure 12. Screen showing thread lifetimes and condition, in the HPjmeter tool.
This allows us to see all of the thread lifetimes at once and to focus in on a single problem thread or
set of threads for further analysis, by double-clicking on its representative colored bar. Thread 0 in
the picture above is spending 77.1% of its lifetime executing in the CPU; this is a healthy sign. Some
21.4% of this thread’s reporting time is spent in profiler overhead, which gives us very little other
time spent waiting to execute or in lock contention.
Using HP’s Java SDK 1.3.1+, and HPjmeter 1.2+ we can get a very accurate picture of the thread
contention issues that may be slowing down the behavior of a Java program.
HPjmeter is useful for analysis in many other situations besides thread studies. Methods which
consume large amounts of CPU and wall clock time can also be tracked down using certain of the
metrics given by this tool. This is detailed in the “Expensive Method Calls” section below.
a final word on threads
Java program threads are mapped to operating system threads, also called HP-UX lightweight
processes (LWPs), by the Java virtual machine. The unit of scheduling in HP-UX is the individual
thread. Each thread is visible in the Glance tool, for example, with its unique LWP identity (LWP
ID), using the “Thread List” feature.
Initially, each thread’s priority starts at a value which is common to all user processes across the
operating system. As the thread executes in a CPU, under normal conditions, its priority is decreased
as it gets more CPU time. This causes that thread to be diminished in priority on leaving the CPU. It
is therefore possibly at a disadvantage, compared to other threads, the next time a priority-based
scheduling decision is taken on it by the operating system.
Should a particular Java-based process be the single most important user process on a computer,
then that process may be given higher scheduling priority either at startup time, or during its lifetime,
by intervention of the superuser. This is done using the “nice” or “renice” commands, respectively,
or by using the “renice” capability of Glance/gpm.
The nice command is shown in Listing 18 below.
# nice --20 java ClassName (notice there are 2 minus signs)
Listing 18. Running a Java program with the “nice” command.
The renice command is used based on the process ID of a running Java program, which is found
using the “ps –ef” command or in Glance/gpm, using the Process List screen. The example Java
process ID is 1234 as used in Listing 19 below.
# renice -20 1234 (notice there is 1 minus sign)
Listing 19. Running a “renice” command on an executing Java program.
Both the nice and renice commands are to be treated with care. They are executed only while logged
into the computer as superuser. These commands change the priority of every thread in the reniced
process.
The result of executing a command such as those above will be that other processes on the same
computer will be disadvantaged with respect to the “reniced” one, and may get lower CPU time than
they would if this command were not used. However, this may be a trade-off that is worthwhile in
certain circumstances.
expensive method calls
It is sometimes the case in Java programs that a small number of the program’s methods occupy
most of the time or resources consumed by that program. These are clearly the main targets for
tuning in the application design.
Certain methods may lend themselves to being tuned without changing the application design or
changing the source code. The first step is to discover the identity and performance characteristics of
these methods. The HPjmeter tool is ideal for this analysis.
HPjmeter has the distinct advantage of being freely available for all environments that run Java.
Secondly, the –Xeprof option to the JVM, which produces the data for HPjmeter analysis, is
designed specifically for low intrusion on the program it is profiling.
The style in which we invoke HPjmeter is shown in Listing 17, above. Figure 13 shows the output
within HPjmeter for the methods with the highest call counts.
Figure 13. Highest Java program method call counts, as seen in the HPjmeter tool.
It is clear from Figure 13 that the Mark.SimSearch.bel() method has the highest call count in the list
of methods being called in this particular program. The “bel()” method call count outstrips the
number of invocations to all other methods by a large number. We look closely at this method,
perhaps using the other HPjmeter metrics, such as “Exclusive CPU Method Time”, in order to
establish whether this method is actually using a large amount of the program’s resources, such as
CPU cycles or otherwise. We then perform some changes to the design of this method and its use in
the program in order to remove this bottleneck.
In certain cases, particularly related to objects of the formatted Date class within Java programs, we
may be able to effect a change to a high-running method without changing the source code itself.
When formatted Date objects are used very heavily in Java programs, the methods
“currentTimeMillis” or “getTimeOfDay” may appear at the top of the method call count metric
output in HPjmeter.
Date objects can be used heavily when records are being written to, or read from, a database using
JDBC.
Provided we have established that this is having a severe detrimental effect on the performance of
the program, we may use an option, shown in Listing 20, to resolve this problem. This option has the
effect of diminishing the impact of date/time functions (which are atomic operations in the
operating system) used by the JVM executing this program.
$ java -XX:-UseHighResolutionTimer ClassName
Listing 20. Running a Java program with an option to reduce timer/date method impact.
memory leaks
One of the most concerning aspects of the Java development world is the possibility of a program
failing due to its having run out of memory. This may exhibit itself as a
“java.lang.OutOfMemoryError”, but it may also appear in a form which is not so friendly, such
as a halt in the program’s execution.
Programmers using Java allocate memory for new objects as described in the section on garbage
collection above. They should not, in theory, have to worry about clearing up that memory once they
have finished with those objects in the program.
However, unless all references to an object have been removed (set to “null” or out of scope), then
the JVM expects that such objects may be used again in the life of the program. This is referred to as
the “memory retention” problem. It differs quite a lot from the notion of a memory leak in other
languages, such as in C++.
In C++, an occurrence of a memory leak can be due to code that follows the pattern given below (a
small snippet from a program).
ptr = new LargeObjectType(); // where ptr is a pointer of a correct type
// The object is used for some purpose, then it is finished with.
ptr = NULL;
// The programmer has not used "delete ptr" to free the space before setting ptr to NULL
Listing 21. A Memory Leak example in C++.
The problem is not the same in Java, as setting a reference to an object to “null” is a good practice,
indicating that the program has done with the object. Furthermore, the Java programmer has no
opportunity to explicitly free the memory assigned to the object, as there is no “free” or “delete”
operator in the language. In Java, the “memory retention” problem arises from the programmer
OMITTING to set the reference to “null” once he/she is done with the object.
If this is a large object, with possibly other objects that are referenced by it, in a chain, then the entire
segment of memory occupied by these objects will be unavailable for the time they are in scope.
Repeated allocation of such misbehaved objects can cause the memory for a program to run out,
when executing over longer periods of time. This will cause the JVM to halt in an ungraceful manner
at unexpected times.
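A minimal sketch of the retention pattern follows. It assumes a long-lived collection (here a static list) that keeps a reference to every buffer added to it; the class RetentionSketch and its methods are hypothetical and exist only to contrast retained objects with objects that become unreachable as soon as they go out of scope.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of memory retention: a long-lived collection keeps objects reachable. */
class RetentionSketch {
    // Everything added here stays referenced for the life of the program,
    // so the garbage collector can never reclaim it.
    private static final List<byte[]> RETAINED = new ArrayList<>();

    /** Retention problem: each buffer is still referenced after we are done with it. */
    static void processRetaining(int count) {
        for (int i = 0; i < count; i++) {
            byte[] buffer = new byte[1024];
            buffer[0] = 1;        // use the object for some purpose
            RETAINED.add(buffer); // the reference is never removed
        }
    }

    /** No retention: each buffer goes out of scope and becomes collectable. */
    static void processReleasing(int count) {
        for (int i = 0; i < count; i++) {
            byte[] buffer = new byte[1024];
            buffer[0] = 1;
        }
    }

    static int retainedCount() {
        return RETAINED.size();
    }

    /** The fix: drop the references once the objects are finished with. */
    static void release() {
        RETAINED.clear();
    }

    public static void main(String[] args) {
        processRetaining(1000);
        System.out.println("still referenced: " + retainedCount());
        release();
        System.out.println("after release: " + retainedCount());
    }
}
```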
Java memory retention problems cannot generally be solved other than by reworking segments of the
offending code. The correction of these issues will not be dealt with here. Programmers in Java are
rapidly adopting tools such as the Jprobe Profiler tool from Sitraka Corporation, for earlier,
development time detection of these issues. The reader is referred to the tutorials and documentation
on this tool for further detail, [Jprobe].
Detection of such memory leaks or retention problems is an issue which the beginner can tackle,
however. For this purpose, the “Memory Regions” screen within the Glance/gpm tool is an
invaluable asset. The performance engineer chooses the process (the JVM usually appears with the
name “java” on the process list screen in Glance/gpm) under investigation and then uses the Glance
tool to show the report of all of the memory regions being used by this process.
This produces a window similar to Figure 14.
Figure 14. Memory Regions used by a process shown in Glance/gpm for that particular Java process.
Figure 14 shows us that the Data RSS and Data VSS (Resident set size and Virtual set size) values
give the memory allocation for the C program heap. The Other RSS, Other VSS and Private RSS
sizes tell us about the size of memory allocated to the Java heap for this program. Continual growth
in these spaces, or in any one of them, leads us to believe that there is a memory retention condition
in this Java program.
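The same "continual growth" test can be applied to samples taken from inside the JVM. The sketch below is an assumption-laden complement to the Glance/gpm screen, not a replacement for it: it reads only the Java heap through the standard Runtime class (so it sees nothing of the C program heap), and the class HeapGrowthCheck and its methods are hypothetical names.

```java
/** Sketch: sample Java heap usage and test a series of samples for continual growth. */
class HeapGrowthCheck {
    /** Bytes of the Java heap currently in use, as reported by the JVM itself. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** True when every sample is larger than the one before it. */
    static boolean growsContinually(long[] samples) {
        if (samples.length < 2) {
            return false; // not enough data to claim a trend
        }
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] samples = new long[5];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = usedHeapBytes();
            Thread.sleep(1000); // sampling interval, chosen arbitrarily
        }
        System.out.println("continual growth: " + growsContinually(samples));
    }
}
```

Heap usage normally rises and falls between collections, so samples are best compared over a long run, ideally taken just after full GCs; a short series like the one in main() is only a demonstration.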
Many Java programs call out to C/C++ programs behind the scenes. Frequently, analysis shows that
the memory leak or retention issue has been in the C/C++ program underlying some Java program.
We should therefore analyze the behavior of the Data RSS and VSS sizes when we are looking for an
issue of this nature.
Should the program or set of programs we are monitoring take a long time to exhibit the behavior,
then the programmer will not be able to watch the screen above for that protracted period. For this
purpose, the “adviser mode” feature within the Glance/gpm toolkit can be very useful. We are then
using a batch mode form of the tool, which gathers data into a file for later analysis.
As an example, the command shown in Listing 22 below causes Glance/gpm to write a sample every
5 seconds (where you can specify the interval) and uses the commands specified in the
adviser_commands file.
$ /opt/perf/bin/glance -adviser_only -syntax adviser_commands -j 5
Listing 22. Using Glance in Adviser Mode.
An example of the type of syntax that can be specified in the “adviser_commands” file is given
below.
PROCESS LOOP {
  if proc_proc_name == "java" then {
    PRINT proc_proc_name|8|0,
          proc_proc_id|8|0,
          proc_cpu_total_util|8|2,
          proc_cpu_nnice_time|8|2,
          proc_mem_virt,
          proc_mem_res,
          proc_thread_count,
          proc_io_byte_rate
  }
}
Listing 23. Example of Glance Adviser Mode Command File.
Figure 15, below, shows a sample output from a Glance run in adviser mode, with a highlighted
problem.
Figure 15: Sample output from the Glance tool in adviser mode, with a highlighted problem
benchmarking symptoms
The Java Software Developer Kit (SDK) 1.3.1 contains the HotSpot Runtime Compiler version 1.3.1
(referred to as “HotSpot”) at the time of writing of this document.
This HotSpot technology is the default runtime used whenever the user invokes the JVM (unless the
user specifies the –classic option, which is generally to be avoided). It analyses the behavior of the
Java code at runtime and compiles certain sections of the byte codes to an intermediate
representation and further to native binary code as it optimizes the run. Execution of Java byte codes
can switch, under HotSpot’s control, from interpreted mode to compiled native code under certain
conditions during the run of a Java program. One important condition for this “compilation”
process is the repeated execution of a method which has many back-branches or loops in it.
In analyzing the performance of a variety of benchmarks, we have found that some of those
benchmarks appear to behave more slowly under the HotSpot default JVM than when run with
others. This is a symptom of the “warm-up” period required for the HotSpot runtime to notice that a
method is frequently executed and has the appropriate type of logic for HotSpot compilation to
native code.
This phenomenon can generally be worked around by placing a call to the method which should be
optimized inside a loop. This causes the HotSpot runtime compiler to actually compile it. This
technique has shown orders-of-magnitude reductions in the execution time of certain benchmarks.
A typical example of this type of occurrence is a program in which the “main” method does all of the
processing and never calls another method. An example of this is shown in Listing 24.
public class SimpleBenchmark {
    public static void main(String[] argv) {
        int value = 0;
        // Record the start time.
        long start = System.currentTimeMillis();
        // Repeatedly executes feature to measure performance.
        for (int i = 0; i < 10000000; i++) {
            // Replace line with your favorite computation.
            value += i;
        }
        // Record the finish time.
        long finish = System.currentTimeMillis();
        // Now report how long test ran.
        System.out.print("Time spent = " +
            Long.toString(finish - start) + " ms\n");
    }
}
Listing 24. Example program which does NOT benefit from HotSpot runtime compilation
Notice in Listing 24, that the “main” method does all the work. No opportunity is available for the
HotSpot runtime compiler to optimize the code to allow compiled code to be executed rather than
byte codes. This example program will operate more slowly than its equivalent shown in Listing 25,
which does the same work.
public class HotSpotBenchmark {
    public static void runTest() {
        int value = 0;
        // Repeatedly executes feature to measure performance.
        for (int i = 0; i < 10000000; i++) {
            // Replace line with your favorite computation.
            value += i;
        }
    }

    public static void main(String[] argv) {
        // Run benchmark multiple times. This will allow us to
        // see when HotSpot begins executing compiled code.
        for (int i = 0; i < 8; i++) {
            // Record the start time.
            long start = System.currentTimeMillis();
            // Run benchmark test.
            runTest();
            // Record the finish time.
            long finish = System.currentTimeMillis();
            // Now report how long test ran.
            System.out.print("Time spent = " +
                Long.toString(finish - start) + " ms\n");
        }
    }
}
Listing 25. Example program which DOES benefit from HotSpot runtime compilation
It is seen here that the “big” loop, with 10000000 iterations, now is contained in a method, and is
called within a loop itself, from the “main” method of this class. The “runTest()” method contains a
large number of back branches, from the 10000000 iteration loop, and is called repeatedly itself from
“main”, so it qualifies for runtime compilation by the HotSpot runtime compiler inherent in the
JVM. It therefore operates much faster than the equivalent program in Listing 24.
This phenomenon is described in more detail in a separate article contained at the HP Java
Performance Tuning website,
http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,701,2022,00.html
The reader is referred to that site for more information on this topic.
conclusions
The tools and techniques described in this paper can help the engineer to tune the behavior of a Java
program to achieve better performance. The performance engineer should not be discouraged by bad
results obtained from initial testing. The problems underlying these initial results can frequently be
resolved and better results can be achieved.
Diagnosis of the issues that may be influencing the behavior of the program, using tools such as
Glance/gpm, HPjmeter and the garbage collection techniques, should be done first, before any
action is taken to change the runtime performance of the program.
An orderly cycle of architecture assessment, measurement, analysis of data, bottleneck identification,
single-step tuning and re-testing is recommended here. The paper describes the beginner’s approach
to HP’s tools for the examination of
• Memory management
• Thread behavior
• System and JVM configuration
• Program resource consumption
in particular, since these issues can yield fruitful results in tuning the performance of a Java
application.
references
[Sauers] Sauers, R. and Weygant, P., HP-UX Performance Tuning, HP Press
[Shirazi] Shirazi, J., Java Performance Tuning, O’Reilly Press
[Halter] Halter, S. and Munroe, S., Enterprise Java Performance, Sun Microsystems Press
[Wilson] Wilson, S. and Kesselman, J., Java Platform Performance, Addison Wesley
[Austin] Austin, C. and Pawlan, M., Advanced Programming for the Java 2 Platform, Addison Wesley

TOOLS
Performance Analysis Tools
HPjmeter - free download
Glance/gpm Available from the HP-UX Application Tools CD
HPjconfig - free download
Jprobe
Introscope
OptimizeIt
Load Testing Tools
LoadRunner http://www.mercuryinteractive.com
SilkPerformer
e-Load