2008-06-06 16:36:37
Contents:
Overview
Introduction
Debugging Infrastructure
Debugging Methodology
Examples
References
This article introduces the user space slab allocator, libumem, shipped in the Solaris 9 Operating System (Solaris 9 OS), Update 3. Of particular interest is the debugging infrastructure provided by the libumem library. The article focuses on the application developer's use of these debugging features to find and fix memory management bugs efficiently within existing code.
First, we introduce the libumem library and briefly describe some of the advantages of using the slab allocator for application memory management. Next, we describe the details of the debugging infrastructure provided by the libumem library, and the tools that take advantage of that infrastructure. Finally, we walk through a few examples using the libumem library and the Solaris OS Modular Debugger (MDB) to illustrate the ease of finding memory management bugs (that is, memory corruption and leaks).
This article was originally published on Sun's Access1 site and is reprinted with permission.
The creation of the user space slab allocator was inspired by the kernel space slab allocator introduced in SunOS 5.4 [1]. The kernel slab allocator was created by engineers investigating system memory management in an effort to make the virtual memory (VM) subsystem faster, more efficient, and more scalable. The slab allocator provides faster and more efficient memory allocation by using an object caching strategy: memory that is frequently allocated and freed is cached, so that the overhead of repeatedly creating the same data structure is reduced. This strategy has proven to be very efficient because most code allocates and frees the same kinds of objects over and over. The scalability of the slab allocator has been improved over the last few years by using a per-CPU set of caches. This addition allowed for a far less contentious locking scheme when requesting memory from the system, and thus created a more scalable memory allocator. Once the slab allocator was found to work well within kernel space, the next step was to port those wins to user space. Hence, the user space slab allocation library, libumem, was created. Beginning with the Solaris 9 OS, Update 3, the libumem library is a standard part of the Solaris OS.
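The object caching idea described above can be sketched as a free list of fixed-size objects. This is a minimal illustration only: the names obj_cache_t, obj_cache_init, obj_cache_alloc, and obj_cache_free are hypothetical and are not part of the libumem interface, and the sketch ignores slabs, per-CPU caches, and locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout: this is not the libumem interface. */
typedef struct free_node {
    struct free_node *next;
} free_node_t;

typedef struct obj_cache {
    size_t       obj_size;   /* size of every object in this cache */
    free_node_t *free_list;  /* freed objects waiting to be reused */
} obj_cache_t;

/* Prepare a cache that hands out objects of one fixed size. */
void obj_cache_init(obj_cache_t *cache, size_t obj_size)
{
    assert(cache != NULL);
    /* A freed object must be large enough to hold the free-list link. */
    cache->obj_size = obj_size < sizeof(free_node_t) ? sizeof(free_node_t)
                                                     : obj_size;
    cache->free_list = NULL;
}

/* Reuse a cached object if one exists; otherwise ask the system. */
void *obj_cache_alloc(obj_cache_t *cache)
{
    assert(cache != NULL);
    if (cache->free_list != NULL) {
        free_node_t *node = cache->free_list;
        cache->free_list = node->next;
        return node;
    }
    return malloc(cache->obj_size);
}

/* Keep the object on the free list instead of returning it to the system. */
void obj_cache_free(obj_cache_t *cache, void *obj)
{
    assert(cache != NULL && obj != NULL);
    free_node_t *node = obj;
    node->next = cache->free_list;
    cache->free_list = node;
}
```

In this sketch, an object that is freed and then requested again comes straight off the free list, which is the saving the object caching strategy aims for.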
The user space slab allocator is based upon a set of umem caches whose sizes are determined before the first allocation. The umem caches are built using slabs of memory from the system. A slab is one or more contiguous virtual memory (VM) pages that are split into equal-size chunks called buffers. A buffer contains the user's data and, depending on the environment settings, can also contain the debug information that helps the application developer find and repair memory management bugs.
For a detailed discussion of the structure and theory behind the slab allocator, please refer to the papers by Jeff Bonwick, and by Jeff Bonwick and Jonathan Adams, listed in the References.
Here we will describe the sections of the buffer created when an application requests memory resources. In addition, the meanings of the boundary values seen within this buffer will be explained. This is intended to provide the developer with an understanding of the way in which the libumem library sets up the infrastructure by which the application's memory transactions can be scrutinized for validity.
The buffer is divided into four sections, as seen in Figure 1.
Metadata section |
User data section |
Redzone section |
Debug metadata section |
Figure 1: The structure of a buffer created by the libumem library
The first section is devoted to storing 8 bytes of metadata with which we will not concern ourselves in this article. The second section contains the memory that the application will use to store its data. The third section is called the redzone, and purposely separates the user data and debug metadata sections. In addition, the redzone section contains a value that can be used to determine the size of the application's memory request. The fourth, and final, section is used to store the debug metadata which the developer can use to determine the state and history of the buffer.
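The four sections can be pictured as a C structure. The following is a sketch only, not the library's definition: the type and field names are hypothetical, and the sizes assume a 32-bit process and the 24-byte cache buffer that appears in the MDB dumps later in this article (8 bytes of metadata, 16 bytes of user data, an 8-byte redzone, and 8 bytes of debug metadata).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative layout only; names are hypothetical, sizes are assumptions. */
typedef struct sketch_buffer {
    uint32_t metadata[2];    /* 8 bytes of allocator metadata */
    uint32_t user_data[4];   /* 16 bytes available to the application */
    uint32_t redzone[2];     /* boundary tag, then the encoded request size */
    uint32_t debug_meta[2];  /* bufctl audit pointer, then bxstat checksum */
} sketch_buffer_t;

/* Ten 4-byte words in total, matching a ten-word MDB dump of one buffer. */
static_assert(sizeof(sketch_buffer_t) == 40, "expected ten 4-byte words");

/* Byte offset of the redzone from the start of the buffer. */
size_t sketch_redzone_offset(void)
{
    return offsetof(sketch_buffer_t, redzone);
}

/* Byte offset of the debug metadata from the start of the buffer. */
size_t sketch_debug_meta_offset(void)
{
    return offsetof(sketch_buffer_t, debug_meta);
}
```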
The user data section is the portion of the buffer which is reserved for the application's data. To understand the functionality of this section of the buffer we must understand the basic building blocks of the slab allocator, the umem caches. The slab allocator is based upon umem caches that consist of buffers with predetermined sizes. Thus, when an application requests memory from the system, the system will allocate memory from the umem cache that has a user data section of equal or greater size than the request. The size of the user data section will typically be larger than the amount of memory requested by the application, as seen in Figure 2.
Memory available to the application | 0xbb | Memory not available to the application |
Figure 2: The structure of the user data section
Each umem cache consists of a set of buffers of one predetermined size in order to facilitate object reuse and to minimize memory fragmentation within the system. Therefore, most of the memory allocation requests by an application do not require the full amount of space provided by the buffer's user data section.
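The cache selection described above can be sketched as a lookup over a list of cache sizes. This is an assumption-laden illustration rather than the library's algorithm: the function name is hypothetical, the size list contains only the four umem_alloc_* caches that appear in the ::umem_cache output later in this article, and the 8-byte overhead is inferred from this article's dumps, where a 10-byte request is served from the 24-byte cache.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_OVERHEAD 8  /* bytes added to each request, per this article */

/*
 * Returns the buffer size of the smallest listed cache that can hold the
 * request plus overhead, or 0 if none of the listed caches is large enough.
 */
size_t sketch_pick_cache(size_t request)
{
    /* Only the cache sizes visible in this article; the real list is longer. */
    static const size_t cache_sizes[] = { 8, 16, 24, 32 };
    const size_t ncaches = sizeof(cache_sizes) / sizeof(cache_sizes[0]);

    assert(request <= (size_t)-1 - SKETCH_OVERHEAD);  /* no overflow */
    size_t needed = request + SKETCH_OVERHEAD;

    for (size_t i = 0; i < ncaches; i++) {
        if (cache_sizes[i] >= needed)
            return cache_sizes[i];
    }
    return 0;
}
```

Under these assumptions, both a 10-byte and a 16-byte request land in the 24-byte cache, which matches the two examples at the end of this article.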
The memory requested by the application begins at the start of the user data section and ends at the boundary value of 0xbb. The 0xbb boundary value is placed just after the last byte of memory requested by the application. The memory between the 0xbb value and the start of the redzone section, the next section within the buffer, is not to be used by the application. In the following output from MDB, the 0xbb value is written just after the tenth byte in the user data section, as is appropriate for a 10-byte application allocation request.
> 0x49fc0/10X
0x49fc0: 12 3a10bfee baddcafe baddcafe baddbbfe baddcafe feedface 11a7 50000 a115c8ed
Note: The previous hexadecimal dump is the output of an MDB command. This particular command displays ten 4-byte hexadecimal values starting at address 0x49fc0. This output represents an entire libumem buffer starting from the address 0x49fc0. Please refer to the MDB documentation for more details about MDB.
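To see why the dump above corresponds to a 10-byte request, the four user data words can be expanded into bytes and scanned for the 0xbb marker. The sketch below assumes the words were printed on a big-endian machine (as on SPARC), so the most significant byte of each word comes first in memory; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MARKER 0xbbu  /* boundary byte described in this article */

/* Expand 32-bit words into bytes in big-endian order (an assumption). */
void sketch_words_to_bytes(const uint32_t *words, size_t nwords, uint8_t *out)
{
    assert(words != NULL && out != NULL);
    for (size_t i = 0; i < nwords; i++) {
        out[4 * i + 0] = (uint8_t)(words[i] >> 24);
        out[4 * i + 1] = (uint8_t)(words[i] >> 16);
        out[4 * i + 2] = (uint8_t)(words[i] >> 8);
        out[4 * i + 3] = (uint8_t)(words[i]);
    }
}

/*
 * Returns the offset of the first 0xbb byte, or -1 if there is none.
 * Only meaningful on untouched memory: application data may itself
 * contain 0xbb bytes.
 */
long sketch_find_marker(const uint8_t *bytes, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] == SKETCH_MARKER)
            return (long)i;
    }
    return -1;
}
```

Applied to the words baddcafe baddcafe baddbbfe baddcafe, the marker sits at offset 10, that is, immediately after the tenth byte of the request.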
If the application requests an amount of memory which happens to be exactly the same size as the user data section predetermined by the size of the umem cache, the 0xbb value will occupy the first byte of the redzone section.
Please note that the value 0xbaddcafe is written to all of the uninitialized memory segments within the buffer's user data section. This is a feature of the debugging infrastructure provided by the libumem library in order to determine when an application is accessing data that has not been previously initialized.
The redzone section of the buffer is 8 bytes in size and separates the user data section from the debug metadata section within the buffer. The boundary value 0xfeedface indicates the beginning of the redzone section, as can be seen below.
> 0x49fc0/10X
0x49fc0: 12 3a10bfee baddcafe baddcafe baddbbfe baddcafe feedface 11a7 50000 a115c8ed
As was noted previously, if the application requests an amount of memory which happens to be exactly the same size as the entire user data section predetermined by the size of the umem cache, the 0xbb value will occupy the first byte of the redzone section. Thus, the redzone will not start with 0xfeedface but with 0xbbedface.
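The two possible redzone tags can be captured in a small check. This is a sketch with a hypothetical function name; it only compares the first redzone word against the tag this article says to expect.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_REDZONE_TAG      0xfeedfaceu  /* normal boundary value */
#define SKETCH_REDZONE_TAG_FULL 0xbbedfaceu  /* request fills the user data */

/*
 * Returns 1 if the first redzone word holds the expected tag, 0 otherwise.
 * request_fills_section must be 1 when the request is exactly as large as
 * the user data section, and 0 otherwise.
 */
int sketch_redzone_intact(uint32_t first_word, int request_fills_section)
{
    assert(request_fills_section == 0 || request_fills_section == 1);
    uint32_t expected = request_fills_section ? SKETCH_REDZONE_TAG_FULL
                                              : SKETCH_REDZONE_TAG;
    return first_word == expected;
}
```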
The redzone boundary value can be verified to determine whether a buffer overflow has taken place. In addition, the last 4 bytes of the redzone section, 0x11a7 in the previous dump, can be used to verify the amount of memory requested by the application. As can be seen in the /usr/include/umem_impl.h header file, this value has been encoded by the following macro:
#define UMEM_SIZE_ENCODING(x) ( 251 * (x) + 1 )
where the value x is the size of the application's memory request, plus 8 bytes. We can use the previous dump to verify this behavior.
> 0x11a7=D
4519
Subtracting 1 from the decimal value 4519, dividing by 251, and then subtracting 8 shows that the application requested 10 bytes of memory from the system: (4519 - 1) / 251 = 18, and 18 - 8 = 10.
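The decoding arithmetic can be written as a small helper. The macro is the one quoted above; the decoding function and its name are a sketch, not part of the library, and the 8-byte overhead is the value stated in this article.

```c
#include <assert.h>
#include <stdint.h>

#define UMEM_SIZE_ENCODING(x) (251 * (x) + 1)  /* as quoted in this article */
#define SKETCH_OVERHEAD 8                      /* bytes added to each request */

/* The 10-byte request from the dump: 10 + 8 = 18 encodes to 0x11a7. */
static_assert(UMEM_SIZE_ENCODING(18) == 0x11a7, "worked example mismatch");

/*
 * Inverts the encoding. Returns the application's request size in bytes,
 * or -1 if the value could not have been produced by the macro.
 */
long sketch_decode_request_size(uint32_t encoded)
{
    if (encoded < 1 || (encoded - 1) % 251 != 0)
        return -1;
    long total = (long)((encoded - 1) / 251);
    if (total < SKETCH_OVERHEAD)
        return -1;
    return total - SKETCH_OVERHEAD;
}
```

A value that fails to decode is itself a hint that the redzone has been overwritten.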
The debug metadata section of the buffer contains 8 bytes that consist of a 4-byte pointer to a umem_bufctl_audit structure and a 4-byte checksum. The umem_bufctl_audit structure, as seen within the /usr/include/umem_impl.h header file, contains the following:
typedef struct umem_bufctl_audit {
        struct umem_bufctl *bc_next;     /* next bufctl struct */
        void *bc_addr;                   /* address of buffer */
        struct umem_slab *bc_slab;       /* controlling slab */
        umem_cache_t *bc_cache;          /* controlling cache */
        hrtime_t bc_timestamp;           /* transaction time */
        thread_t bc_thread;              /* thread doing transaction */
        struct umem_bufctl *bc_lastlog;  /* last log entry */
        void *bc_contents;               /* contents at last free */
        int bc_depth;                    /* stack depth */
        uintptr_t bc_stack[1];           /* pc stack */
} umem_bufctl_audit_t;
Of particular interest is the pointer to the stack trace for the last thread that allocated or freed the buffer. The second 4-byte value within the debug metadata section, called the bxstat value, is a checksum that can be used to verify that the buffer is in a known state. The value of the pointer to the umem_bufctl_audit structure XORed with the value of the bxstat checksum should result in 0xa110c8ed for an allocated buffer (as seen below) or 0xf4eef4ee for a freed buffer. If this is not the case, the buffer has become corrupt.
> 0x49fc0/10X
0x49fc0: 12 3a10bfee baddcafe baddcafe baddbbfe baddcafe feedface 11a7 50000 a115c8ed
> 50000^a115c8ed=K
a110c8ed
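The checksum test lends itself to a short function. The two tag values come from the text above; the enum, the constant names, and the function name are hypothetical, and the sketch assumes 32-bit pointers as in the dumps.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_TAG_ALLOCATED 0xa110c8edu  /* expected for an allocated buffer */
#define SKETCH_TAG_FREED     0xf4eef4eeu  /* expected for a freed buffer */

/* The worked example from the dump: 0x50000 XOR 0xa115c8ed. */
static_assert((0x50000u ^ 0xa115c8edu) == SKETCH_TAG_ALLOCATED,
              "worked example mismatch");

typedef enum {
    SKETCH_BUF_ALLOCATED,
    SKETCH_BUF_FREED,
    SKETCH_BUF_CORRUPT
} sketch_buf_state_t;

/* Classify a buffer from the two words of its debug metadata section. */
sketch_buf_state_t sketch_buffer_state(uint32_t bufctl_ptr, uint32_t bxstat)
{
    uint32_t tag = bufctl_ptr ^ bxstat;

    if (tag == SKETCH_TAG_ALLOCATED)
        return SKETCH_BUF_ALLOCATED;
    if (tag == SKETCH_TAG_FREED)
        return SKETCH_BUF_FREED;
    return SKETCH_BUF_CORRUPT;
}
```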
The malloc() and free() memory management functions are used by many application developers. An application can be written without a dependence on any particular memory management programming interface by using the standard malloc() and free() functions. This section outlines the steps needed to take advantage of the libumem library to debug an application's memory transactions.
libumem Flags
If the libumem library is interposed (by setting the LD_PRELOAD environment variable) when executing an application, the malloc() and free() functions defined within the libumem library will be used whenever the application calls malloc() or free(). To take advantage of the debugging infrastructure of the libumem library, set the UMEM_DEBUG and UMEM_LOGGING flags in the environment where the application is executed. The most common values for these flags are UMEM_DEBUG=default and UMEM_LOGGING=transaction. With these settings, a thread ID, high-resolution time stamp, and stack trace are recorded for each memory transaction initiated by the application. In addition, the libumem library will:
Fill uninitialized memory (0xbaddcafe) and previously freed buffers (0xdeadbeef) with known patterns.
Append to each buffer the debug metadata consisting of a pointer to a umem_bufctl_audit structure and a bxstat checksum.
The following are examples of the commands used to set the appropriate debug flags and interpose the libumem library when executing an application.
(csh)
%(setenv UMEM_DEBUG default; setenv UMEM_LOGGING transaction;
setenv LD_PRELOAD |
or
(bash)
bash-2.04$UMEM_DEBUG=default UMEM_LOGGING=transaction
LD_PRELOAD= |
More details about the debug flags (UMEM_DEBUG and UMEM_LOGGING) can be found in the umem_debug(3MALLOC) man page.
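The same two flags can also be placed in the environment from a C launcher before it executes the target application. The following is a sketch under stated assumptions: the function name is hypothetical, putenv() is used because it is widely available, and LD_PRELOAD is deliberately left out because the library path is not given in the commands above.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600  /* asks some C libraries to declare putenv() */
#endif
#include <assert.h>
#include <stdlib.h>

/*
 * Sets the two debug flags described in this article in the current
 * process environment. Returns 0 on success and -1 on failure.
 */
int sketch_set_umem_debug_env(void)
{
    /* putenv() keeps the pointer it is given, so the strings are static. */
    static char debug_setting[]   = "UMEM_DEBUG=default";
    static char logging_setting[] = "UMEM_LOGGING=transaction";

    if (putenv(debug_setting) != 0)
        return -1;
    if (putenv(logging_setting) != 0)
        return -1;

    /* Sanity check: both variables are now visible to child processes. */
    assert(getenv("UMEM_DEBUG") != NULL);
    assert(getenv("UMEM_LOGGING") != NULL);
    return 0;
}
```

A launcher would call this function, set LD_PRELOAD to the library's location on the system at hand, and then start the application with one of the exec functions so that the settings are inherited.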
The developer can view the debug information pertaining to an application's memory management transactions by using MDB. The following commands within MDB can be used to provide a great deal of information about the memory transactions that took place during the execution of the application.
::umem_status
Displays the status of umem, indicating whether the logging features have been turned on or off.
> ::umem_status
Status: ready and active
Concurrency: 1
Logs: transaction=64k
Message buffer:
::findleaks
Searches the application's memory for leaked buffers.
> ::findleaks
CACHE     LEAKED   BUFCTL CALLER
::umalog
Displays the memory transaction log together with the stack trace of each transaction.
> ::umalog
T-0.000000000 addr=55fb8 umem_alloc_32
        libumem.so.1`umem_cache_alloc+0x13c
        libumem.so.1`umem_alloc+0x44
        libumem.so.1`malloc+0x2c
        main+0x18
        _start+0x108
T-0.000457800 addr=49fc0 umem_alloc_24
        libumem.so.1`umem_cache_alloc+0x13c
        libumem.so.1`umem_alloc+0x44
        libumem.so.1`malloc+0x2c
        main+0xc
        _start+0x108
::umem_cache
Lists the umem caches.
> ::umem_cache
ADDR     NAME                     FLAG CFLAG    BUFSIZE BUFTOTL
0003c008 umem_magazine_1          000e 80080000       8       0
0003c1c8 umem_magazine_3          000e 80080000      16       0
0003c388 umem_magazine_7          000e 80080000      32       0
0003c548 umem_magazine_15         000e 80080000      64       0
0003c708 umem_magazine_31         000e 80080000     128       0
0003c8c8 umem_magazine_47         000e 80080000     192       0
0003ca88 umem_magazine_63         000e 80080000     256       0
0003cc48 umem_magazine_95         000e 80080000     384       0
0003ce08 umem_magazine_143        000e 80080000     576       0
0003cfc8 umem_slab_cache          000e 80080000      28     170
0003d188 umem_bufctl_cache        000e 80080000      12       0
0003d348 umem_bufctl_audit_cache  000e 80080000     100     408
0003d508 umem_alloc_8             020f 80000000       8       0
0003d6c8 umem_alloc_16            020f 80000000      16       0
0003d888 umem_alloc_24            020f 80000000      24     204
0003da48 umem_alloc_32            020f 80000000      32     170
...snip...
[address]::umem_log
Displays the umem transaction log for the application.
> ::umem_log
CPU ADDR     BUFADDR  TIMESTAMP     THREAD
  0 0002e064 00055fb8 10475e3dd1c98 00000001
  0 0002e000 00049fc0 10475e3d62050 00000001
    0003483c 00000000 0             00000000
    000348a0 00000000 0             00000000
    00034904 00000000 0             00000000
... snip ...
[address]::umem_verify
Verifies the integrity of the umem caches, which is useful in determining if a buffer has been corrupted.
> ::umem_verify
Cache Name               Addr   Cache Integrity
umem_magazine_1          3c008  clean
umem_magazine_3          3c1c8  clean
umem_magazine_7          3c388  clean
umem_magazine_15         3c548  clean
umem_magazine_31         3c708  clean
umem_magazine_47         3c8c8  clean
umem_magazine_63         3ca88  clean
umem_magazine_95         3cc48  clean
umem_magazine_143        3ce08  clean
umem_slab_cache          3cfc8  clean
umem_bufctl_cache        3d188  clean
umem_bufctl_audit_cache  3d348  clean
umem_alloc_8             3d508  clean
umem_alloc_16            3d6c8  clean
umem_alloc_24            3d888  clean
umem_alloc_32            3da48  clean
... snip ...
address$
Displays the umem_bufctl_audit structure as defined in the /usr/include/umem_impl.h header file.
> 50000$
The following basic examples will show how to use MDB in conjunction with the libumem library to examine the history of an application's memory transactions.
In order to examine if an application has a memory leak, one can execute the following steps to narrow down the section of the code which is causing the leak.
1. The libumem library is only available on systems which are running the Solaris 9 OS, Update 3 and above.
%uname -a
SunOS fountainhead 5.9 Generic_112233-05
2. Execute the application with the libumem library interposed and the appropriate debug flags set.
%(setenv UMEM_DEBUG default; setenv UMEM_LOGGING transaction; \
setenv LD_PRELOAD |
3. Use the gcore(1) command to get an application core to analyze the application's memory transactions.
%ps -ef | grep a.out
user1 970 714 0 10:42:42 pts/4 0:00 ./a.out
%gcore 970
gcore: core.970 dumped
4. Use MDB to analyze the core for memory leaks using the commands described in the previous section.
%mdb core.970
Loading modules: [ |
> ::umem_log
CPU ADDR     BUFADDR  TIMESTAMP     THREAD
  0 0002e0c8 00055fb8 159d27e121a0  00000001
  0 0002e064 00055fb8 159d27e0fce8  00000001
  0 0002e000 00049fc0 159d27da1748  00000001
    00034904 00000000 0             00000000
    00034968 00000000 0             00000000
... snip ...
Here we can see that there have been three transactions by thread #1 on CPU #0.
> ::umalog
T-0.000000000 addr=55fb8 umem_alloc_32
The three transactions consist of one allocation from the 24 byte umem cache, and one memory allocation and release from the 32 byte umem cache. Note that the high resolution timestamp in the upper left-hand corner of each entry is relative to the last memory transaction initiated by the application.
> ::findleaks
CACHE     LEAKED   BUFCTL CALLER
0003d888       1 00050000 libumem.so.1`malloc+0x0
----------------------------------------------------------------------
   Total       1 buffer, 24 bytes
This shows that there is one 24 byte buffer which has been leaked.
> 00050000$ |
We can find the stack trace for the allocation which resulted in the memory leak by dumping the bufctl structure. The address of this structure can be gathered from the previous ::findleaks output.
> 49fc0/10X
0x49fc0: 12 3a10bfee baddcafe baddcafe baddbbfe baddcafe feedface 11a7 50000 a115c8ed
Looking at the values within the buffer, we see that the size of the allocation was 10 bytes. This can be calculated by subtracting 1 from the redzone value of 0x11a7 (4519 decimal), dividing by 251, and then subtracting 8 bytes.
%cat test.c #include |
Once we look at the code of the executable, we can use the function in the stack trace and the size of the allocation to determine the piece of memory which has leaked.
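For readers who want to reproduce a similar transaction log, the following hypothetical function performs the same pattern of transactions: a 10-byte allocation that is never freed, and a second allocation that is freed. It is not the test.c used above. The 20-byte size is an assumption chosen so that, with the 8-byte overhead, the request would be served from the 32-byte cache, and the fix_leak parameter exists only so the corrected behavior can be shown next to the bug.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical reproduction of the leak pattern described in this article.
 * Returns the number of buffers leaked (0 or 1), or -1 if malloc() fails.
 */
int sketch_run_transactions(int fix_leak)
{
    char *first  = malloc(10);  /* 10 + 8 bytes: the 24-byte cache */
    char *second = malloc(20);  /* assumed size: the 32-byte cache */

    if (first == NULL || second == NULL) {
        free(first);
        free(second);
        return -1;
    }
    assert(first != second);    /* two live buffers never share an address */
    first[0]  = 'a';
    second[0] = 'b';

    free(second);               /* allocation and release, 32-byte cache */

    if (fix_leak) {
        free(first);            /* corrected version: nothing is leaked */
        return 0;
    }

    first = NULL;               /* bug: last reference dropped, no free() */
    return 1;
}
```

Calling the function with fix_leak set to 0 while the library is interposed should leave one unreferenced buffer for ::findleaks to report, though the exact caches and addresses will depend on the system.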
The following example will list the steps used to examine an application core for a memory corruption bug.
1. Follow the first two steps listed above in the memory leak example.
2. Either analyze the core dump created by the application if it aborted, or use gcore as seen above.
3. Use MDB to analyze the application core for the memory corruption using the MDB commands listed in a previous section.
%mdb core.1095
Loading modules: [ |
Using the ::umem_verify command we can see that one of the umem caches has a corrupted buffer.
> 3d888::umem_verify
Summary for cache 'umem_alloc_24'
  buffer 49fc0 (allocated) has a corrupt redzone size encoding
This provides more detail about the type of corruption that has taken place within the 24 byte umem cache.
> 49fc0/10X
0x49fc0: 18 3a10bfe8 0 1 2 3 4 1789 50000 a115c8ed
When we dump out the buffer we can see that the size of the original allocation was 16 bytes. This can be calculated by decoding the redzone value of 0x1789 (6025 decimal): subtract 1, divide by 251, and then subtract 8 bytes. Once we know the size of the allocation we can look for the redzone boundary tag 16 bytes from where the user data section starts. Because a 16-byte request fills the entire user data section of this cache, the tag should read 0xbbedface rather than 0xfeedface, as described earlier. Scrutinizing the buffer above reveals that the user data section is filled with the values 0 through 3, and there is no redzone boundary tag. We find the value 4 where the redzone value should be!
>50000$ |
Getting the stack trace for the last memory transaction will allow the developer to narrow down where in the code the memory corruption is taking place.
%cat test.c #include |
for loop.
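The kind of loop that produces the dump above can be modeled safely on an ordinary array. The sketch below is not the test.c used in this example: it is a hypothetical model in which an array of five words stands in for the four user data words plus the first redzone word, so the overrun can be observed without actually corrupting the heap. The tag value and all names are illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_USER_WORDS 4           /* 16-byte request: four 4-byte values */
#define SKETCH_TAG        0xfeedfaceu /* stand-in for the redzone tag */

/*
 * buf must have SKETCH_USER_WORDS + 1 elements: the user data words
 * followed by one word that models the start of the redzone.
 * Writes 0 .. count-1 and returns 1 if the modeled redzone survived.
 */
int sketch_fill(uint32_t *buf, size_t count)
{
    assert(buf != NULL);
    assert(count <= SKETCH_USER_WORDS + 1);  /* stay inside the model */

    buf[SKETCH_USER_WORDS] = SKETCH_TAG;     /* plant the boundary tag */

    /* A correct caller passes SKETCH_USER_WORDS; an off-by-one caller
     * (for example, a loop written with <= instead of <) passes one more. */
    for (size_t i = 0; i < count; i++)
        buf[i] = (uint32_t)i;

    return buf[SKETCH_USER_WORDS] == SKETCH_TAG;
}
```

With a count of four the tag survives; with a count of five the fifth write lands on the modeled redzone and leaves the value 4 there, which is the pattern seen in the corrupted buffer.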
, by Jeff Bonwick
, by Jeff Bonwick and Jonathan Adams