2016-07-31 21:37:46

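In this part I'm going to explain how the debugger figures out where to find C functions and variables in the machine code it wades through, and which data it uses to map between C source code lines and machine language words.

Debugging information

Modern compilers do a pretty good job of converting your high-level code, with its nicely indented and nested control structures and arbitrarily typed variables, into a big pile of bits called machine code, whose main purpose is to run as fast as possible on the target CPU. Most lines of C get converted into several machine code instructions. Variables are shoved all over the place: onto the stack, into registers, or optimized away completely. Structures and objects don't even exist in the resulting code; they are merely an abstraction that gets translated into hard-coded offsets into memory buffers.

So how does a debugger know where to stop when you ask it to break at the entry to some function? How does it manage to find what to show you when you ask it for the value of a variable? The answer is debugging information.

Debugging information is generated by the compiler together with the machine code. It represents the relationship between the executable program and the original source code. This information is encoded in a pre-defined format and stored alongside the machine code. Many such formats have been invented over the years for different platforms and executable file formats. Since the aim of this article isn't to survey the history of these formats but rather to show how they work, we'll have to settle on one. That one is going to be DWARF, which is almost ubiquitous today as the debugging information format for ELF executables on Linux and other Unix-like platforms.

The DWARF in the ELF

DWARF was reportedly designed alongside ELF, although in theory it can be embedded in other object file formats as well. DWARF is a complex format, building on many years of experience with earlier formats for various architectures and operating systems. It has to be complex, since it solves a very tricky problem: presenting debugging information from any high-level language to debuggers, while supporting arbitrary platforms and ABIs. It would take much more than this humble article to explain it fully, and to be honest I don't understand all of its dark corners well enough to attempt that anyway. In this article I will take a more hands-on approach, showing just enough of DWARF to explain how debugging information works in practice.

Debug sections in ELF files

First let's take a look at where the DWARF information lives inside ELF files. ELF allows arbitrary sections to be defined in each object file. A section header table defines which sections exist and what they are called. Different tools treat particular sections in special ways; for example, the linker looks for some sections and the debugger for others.

We'll use an executable built from the following C source as the running example in this article. It is compiled into tracedprog2: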

#include <stdio.h>


void do_stuff(int my_arg)
{
    int my_local = my_arg + 2;
    int i;

    for (i = 0; i < my_local; ++i)
        printf("i = %d\n", i);
}


int main()
{
    do_stuff(2);
    return 0;
}

Dumping the section headers of the ELF executable with objdump -h, we notice several sections whose names begin with .debug_. These are the DWARF debugging sections:

26 .debug_aranges 00000020  00000000  00000000  00001037
                 CONTENTS, READONLY, DEBUGGING
27 .debug_pubnames 00000028  00000000  00000000  00001057
                 CONTENTS, READONLY, DEBUGGING
28 .debug_info   000000cc  00000000  00000000  0000107f
                 CONTENTS, READONLY, DEBUGGING
29 .debug_abbrev 0000008a  00000000  00000000  0000114b
                 CONTENTS, READONLY, DEBUGGING
30 .debug_line   0000006b  00000000  00000000  000011d5
                 CONTENTS, READONLY, DEBUGGING
31 .debug_frame  00000044  00000000  00000000  00001240
                 CONTENTS, READONLY, DEBUGGING
32 .debug_str    000000ae  00000000  00000000  00001284
                 CONTENTS, READONLY, DEBUGGING
33 .debug_loc    00000058  00000000  00000000  00001332
                 CONTENTS, READONLY, DEBUGGING

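The first number shown for each section is its size, and the last is the offset at which it begins in the ELF file. The debugger uses this information to read the section from the executable.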
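As a small illustration of how a tool might use these two numbers, here is a minimal C sketch. It is not taken from any real debugger: the struct section type and the helper names are hypothetical, and the two rows are simply copied from the listing above.

```c
#include <stdint.h>
#include <string.h>

/* One row of the section header table, reduced to the fields used here.
 * The struct and function names are hypothetical. */
struct section {
    const char *name;
    uint32_t size;         /* first number in the objdump -h row */
    uint32_t file_offset;  /* last number in the objdump -h row */
};

/* DWARF sections are recognized by their ".debug_" name prefix. */
static int is_dwarf_section(const struct section *s)
{
    return strncmp(s->name, ".debug_", 7) == 0;
}

/* File offset of the first byte past the section's contents. */
static uint32_t section_end(const struct section *s)
{
    return s->file_offset + s->size;
}

/* Two rows copied from the listing above. */
static const struct section debug_info   = { ".debug_info",   0xcc, 0x107f };
static const struct section debug_abbrev = { ".debug_abbrev", 0x8a, 0x114b };
```

Calling section_end(&debug_info) gives 0x107f + 0xcc = 0x114b, which is exactly the offset where .debug_abbrev begins, so in this file the two sections sit back to back.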

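Now let's see a few practical examples of finding useful debug information in DWARF.

Finding functions

One of the most basic things we want to do when debugging is to place a breakpoint at some function and have the debugger break right at its entry. To be able to perform this feat, the debugger must have some mapping between a function name in the high-level code and the address in the machine code where the instructions of that function begin.

This information can be obtained from DWARF by looking at the .debug_info section. Before we go further, a bit of background. The basic descriptive entity in DWARF is called the Debugging Information Entry (DIE). Each DIE has a tag, which is its type, and a set of attributes. DIEs are interlinked via sibling and child links, and attribute values can point at other DIEs.

Let's run:

objdump --dwarf=info tracedprog2

The output is quite long, and for this example we'll focus on just these lines:

<1><71>: Abbrev Number: 5 (DW_TAG_subprogram)
    <72>   DW_AT_external    : 1
    <73>   DW_AT_name        : (...): do_stuff
    <77>   DW_AT_decl_file   : 1
    <78>   DW_AT_decl_line   : 4
    <79>   DW_AT_prototyped  : 1
    <7a>   DW_AT_low_pc      : 0x8048604
    <7e>   DW_AT_high_pc     : 0x804863e
    <82>   DW_AT_frame_base  : 0x0      (location list)
    <86>   DW_AT_sibling     : <0xb3>

<1><b3>: Abbrev Number: 9 (DW_TAG_subprogram)
    <b4>   DW_AT_external    : 1
    <b5>   DW_AT_name        : (...): main
    <b9>   DW_AT_decl_file   : 1
    <ba>   DW_AT_decl_line   : 14
    <bb>   DW_AT_type        : <0x4b>
    <bf>   DW_AT_low_pc      : 0x804863e
    <c3>   DW_AT_high_pc     : 0x804865a
    <c7>   DW_AT_frame_base  : 0x2c     (location list)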

There are two entries (DIEs) tagged DW_TAG_subprogram, which is DWARF's term for a function. Note that there is one entry for do_stuff and one for main. There are several interesting attributes, but the one that interests us here is DW_AT_low_pc. This is the program counter (EIP on x86) value for the beginning of the function. For do_stuff it is 0x8048604. Now let's run objdump -d and see what is at this address in the disassembly of the executable:

08048604 <do_stuff>:
 8048604:       55           push   ebp
 8048605:       89 e5        mov    ebp,esp
 8048607:       83 ec 28     sub    esp,0x28
 804860a:       8b 45 08     mov    eax,DWORD PTR [ebp+0x8]
 804860d:       83 c0 02     add    eax,0x2
 8048610:       89 45 f4     mov    DWORD PTR [ebp-0xc],eax
 8048613:       c7 45 (...)  mov    DWORD PTR [ebp-0x10],0x0
 804861a:       eb 18        jmp    8048634
 804861c:       b8 20 (...)  mov    eax,0x8048720
 8048621:       8b 55 f0     mov    edx,DWORD PTR [ebp-0x10]
 8048624:       89 54 24 04  mov    DWORD PTR [esp+0x4],edx
 8048628:       89 04 24     mov    DWORD PTR [esp],eax
 804862b:       e8 04 (...)  call   8048534
 8048630:       83 45 f0 01  add    DWORD PTR [ebp-0x10],0x1
 8048634:       8b 45 f0     mov    eax,DWORD PTR [ebp-0x10]
 8048637:       3b 45 f4     cmp    eax,DWORD PTR [ebp-0xc]
 804863a:       7c e0        jl     804861c
 804863c:       c9           leave
 804863d:       c3           ret

Indeed, 0x8048604 is the beginning of do_stuff, so the debugger has a mapping between functions and their locations in the executable.
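To make the mapping concrete, here is a minimal C sketch of the kind of table a debugger could build from the DW_TAG_subprogram entries. The struct func_range type and both lookup functions are hypothetical names of mine; the addresses come from the dump above. I treat DW_AT_high_pc as the first address past the function, which agrees with this dump: do_stuff's high_pc (0x804863e) is one past its ret at 0x804863d and equals main's low_pc.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* What a debugger might keep per DW_TAG_subprogram DIE.
 * Names are hypothetical; the addresses come from the dump above. */
struct func_range {
    const char *name;
    uint32_t low_pc;   /* address of the first instruction */
    uint32_t high_pc;  /* first address past the function */
};

static const struct func_range funcs[] = {
    { "do_stuff", 0x8048604, 0x804863e },
    { "main",     0x804863e, 0x804865a },
};
static const size_t nfuncs = sizeof funcs / sizeof funcs[0];

/* Name -> entry: where a breakpoint on the function entry would go. */
static const struct func_range *func_by_name(const char *name)
{
    for (size_t i = 0; i < nfuncs; i++)
        if (strcmp(funcs[i].name, name) == 0)
            return &funcs[i];
    return NULL;
}

/* Address -> entry: which function a stopped program counter is in. */
static const struct func_range *func_by_pc(uint32_t pc)
{
    for (size_t i = 0; i < nfuncs; i++)
        if (pc >= funcs[i].low_pc && pc < funcs[i].high_pc)
            return &funcs[i];
    return NULL;
}
```

With this table, func_by_name("do_stuff")->low_pc yields 0x8048604, and func_by_pc(0x8048610) finds do_stuff again. A linear scan is enough for two functions; a real tool would presumably want something faster.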

Finding variables

Suppose that we've indeed stopped at a breakpoint inside do_stuff. We want the debugger to show us the value of the my_local variable. How does it know where to find it? It turns out this is much trickier than finding functions. Variables can be located in global storage, on the stack, and even in registers. Additionally, variables with the same name can have different values in different lexical scopes. The debugging information has to be able to reflect all these variations, and indeed DWARF does.

I won't cover all the possibilities, but as an example I'll demonstrate how the debugger can find my_local in do_stuff. Let's start at .debug_info and look at the entry for do_stuff again, this time also looking at a few of its sub-entries:

<1><71>: Abbrev Number: 5 (DW_TAG_subprogram)
    <72>   DW_AT_external    : 1
    <73>   DW_AT_name        : (...): do_stuff
    <77>   DW_AT_decl_file   : 1
    <78>   DW_AT_decl_line   : 4
    <79>   DW_AT_prototyped  : 1
    <7a>   DW_AT_low_pc      : 0x8048604
    <7e>   DW_AT_high_pc     : 0x804863e
    <82>   DW_AT_frame_base  : 0x0      (location list)
    <86>   DW_AT_sibling     : <0xb3>
 <2><8a>: Abbrev Number: 6 (DW_TAG_formal_parameter)
    <8b>   DW_AT_name        : (...): my_arg
    <8f>   DW_AT_decl_file   : 1
    <90>   DW_AT_decl_line   : 4
    <91>   DW_AT_type        : <0x4b>
    <95>   DW_AT_location    : (...)       (DW_OP_fbreg: 0)
 <2><98>: Abbrev Number: 7 (DW_TAG_variable)
    <99>   DW_AT_name        : (...): my_local
    <9d>   DW_AT_decl_file   : 1
    <9e>   DW_AT_decl_line   : 6
    <9f>   DW_AT_type        : <0x4b>
    <a3>   DW_AT_location    : (...)      (DW_OP_fbreg: -20)
 <2><a6>: Abbrev Number: 8 (DW_TAG_variable)
    <a7>   DW_AT_name        : i
    <a9>   DW_AT_decl_file   : 1
    <aa>   DW_AT_decl_line   : 7
    <ab>   DW_AT_type        : <0x4b>
    <af>   DW_AT_location    : (...)      (DW_OP_fbreg: -24)

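Note the first number inside the angle brackets in each entry. This is the nesting level: in this example, entries with <2> are children of the entry with <1>. So we know that the variable my_local (marked by the DW_TAG_variable tag) is a child of the do_stuff function. The debugger is also interested in a variable's type, so that it can display the value correctly. In the case of my_local the type points to another DIE, <0x4b>. If we look it up in the objdump output, we'll see that it is a signed 4-byte integer.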
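The parent/child relationship can be pictured as a small tree. Below is a rough C sketch, assuming a heavily simplified DIE that keeps only a tag and a name. The struct die layout and find_child are hypothetical; the tag values are the ones assigned by the DWARF specification. The sibling link from do_stuff to main is left out to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* Tag values as assigned by the DWARF specification. */
enum {
    DW_TAG_formal_parameter = 0x05,
    DW_TAG_subprogram       = 0x2e,
    DW_TAG_variable         = 0x34
};

/* A heavily simplified DIE: real DIEs carry a list of attributes,
 * here only the name is kept. The struct layout is hypothetical. */
struct die {
    int tag;
    const char *name;
    const struct die *first_child;   /* one nesting level deeper */
    const struct die *next_sibling;  /* same nesting level */
};

/* The level <2> children of do_stuff, linked in dump order. */
static const struct die die_i        = { DW_TAG_variable, "i", NULL, NULL };
static const struct die die_my_local = { DW_TAG_variable, "my_local", NULL, &die_i };
static const struct die die_my_arg   = { DW_TAG_formal_parameter, "my_arg", NULL, &die_my_local };
/* Level <1>; its sibling link to main is omitted in this sketch. */
static const struct die die_do_stuff = { DW_TAG_subprogram, "do_stuff", &die_my_arg, NULL };

/* Search only the direct children of parent for a given name. */
static const struct die *find_child(const struct die *parent, const char *name)
{
    for (const struct die *d = parent->first_child; d != NULL; d = d->next_sibling)
        if (strcmp(d->name, name) == 0)
            return d;
    return NULL;
}
```

Here find_child(&die_do_stuff, "my_local") returns the DW_TAG_variable entry. Searching only inside the function the debugger is stopped in is what keeps same-named variables in other functions from getting mixed up.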

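To actually locate the variable in the memory image of the executing process, the debugger looks at the DW_AT_location attribute. For my_local it says DW_OP_fbreg: -20. This means that the variable is stored at offset -20 from the value of the DW_AT_frame_base attribute of its containing function, which is the base of that function's stack frame.

The DW_AT_frame_base attribute of do_stuff has the value 0x0 (location list), which means that the value actually has to be looked up in the location list section. Let's look at it:

$ objdump --dwarf=loc tracedprog2

tracedprog2:     file format elf32-i386

Contents of the .debug_loc section:

    Offset   Begin    End      Expression
    00000000 08048604 08048605 (DW_OP_breg4: 4 )
    00000000 08048605 08048607 (DW_OP_breg4: 8 )
    00000000 08048607 0804863e (DW_OP_breg5: 8 )
    00000000 <End of list>
    0000002c 0804863e 0804863f (DW_OP_breg4: 4 )
    0000002c 0804863f 08048641 (DW_OP_breg4: 8 )
    0000002c 08048641 0804865a (DW_OP_breg5: 8 )
    0000002c <End of list>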

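The location information we're interested in is the first list, the one at offset 0x0. For each address the debugger may be at, it specifies the current frame base, from which offsets to variables are computed, as an offset from a register. On x86, DW_OP_breg4 refers to esp and DW_OP_breg5 refers to ebp.

It's instructive to look at the first few instructions of do_stuff again:

08048604 <do_stuff>:
 8048604:       55          push   ebp
 8048605:       89 e5       mov    ebp,esp
 8048607:       83 ec 28    sub    esp,0x28
 804860a:       8b 45 08    mov    eax,DWORD PTR [ebp+0x8]
 804860d:       83 c0 02    add    eax,0x2
 8048610:       89 45 f4    mov    DWORD PTR [ebp-0xc],eax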

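Note that ebp becomes meaningful only after the second instruction has executed, and indeed for the first two addresses the base is computed from esp in the location information listed above. Once ebp is valid, it's convenient to compute offsets relative to it, because it stays constant while esp keeps moving as data is pushed onto and popped off the stack.

So where does that leave us with my_local? We only really care about its value after the instruction at 0x8048610 (where the value is stored to memory after being computed in eax), so the debugger will use the DW_OP_breg5: 8 frame base to find it. Now it's time to rewind a little and recall that the DW_AT_location attribute of my_local says DW_OP_fbreg: -20. Let's do the math: the frame base is ebp + 8, and -20 from that gives ebp - 12. Now look at the disassembly again and note where the data is moved from eax: indeed, ebp - 12 is where my_local is stored.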
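The same arithmetic can be written down as a short C sketch. This is only an illustration under simplifying assumptions: it handles just the single-register DW_OP_breg4 and DW_OP_breg5 expressions that appear in this dump, takes the register values as plain inputs instead of reading them from a traced process, and uses hypothetical names throughout (struct regs, struct loc_entry, frame_base, fbreg_address).

```c
#include <stddef.h>
#include <stdint.h>

/* Register values of the stopped process; only the two used here. */
struct regs {
    uint32_t esp;  /* DWARF register 4 on x86 */
    uint32_t ebp;  /* DWARF register 5 on x86 */
};

/* One location list entry: for begin <= pc < end, the frame base is
 * the value of register reg plus offset (DW_OP_breg<reg>: offset). */
struct loc_entry {
    uint32_t begin, end;
    int reg;
    int32_t offset;
};

/* The location list at offset 0x0 of .debug_loc, i.e. do_stuff's. */
static const struct loc_entry do_stuff_frame_base[] = {
    { 0x08048604, 0x08048605, 4, 4 },
    { 0x08048605, 0x08048607, 4, 8 },
    { 0x08048607, 0x0804863e, 5, 8 },
};

/* Returns 0 and stores the frame base on success, -1 if pc is not covered. */
static int frame_base(uint32_t pc, const struct regs *r, uint32_t *out)
{
    size_t n = sizeof do_stuff_frame_base / sizeof do_stuff_frame_base[0];
    for (size_t i = 0; i < n; i++) {
        const struct loc_entry *e = &do_stuff_frame_base[i];
        if (pc >= e->begin && pc < e->end) {
            uint32_t base = (e->reg == 4) ? r->esp : r->ebp;
            *out = base + (uint32_t)e->offset;
            return 0;
        }
    }
    return -1;
}

/* DW_OP_fbreg: N means "frame base plus N". */
static int fbreg_address(uint32_t pc, const struct regs *r, int32_t fbreg, uint32_t *out)
{
    uint32_t fb;
    if (frame_base(pc, r, &fb) != 0)
        return -1;
    *out = fb + (uint32_t)fbreg;
    return 0;
}
```

For any pc from 0x8048607 up to the end of do_stuff, fbreg_address(pc, &r, -20, &addr) stores r.ebp + 8 - 20, that is ebp - 12, in addr. This matches the [ebp-0xc] operand in the disassembly.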

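Looking up line numbers

When we talked about finding functions in the debugging information, I was cheating a little. When we debug C source code and put a breakpoint in a function, we're usually not interested in the first machine code instruction. What we're really interested in is the first line of C code in the function.

This is why DWARF encodes a full mapping between lines in the C source code and machine code addresses in the executable. This information is contained in the .debug_line section and can be extracted in a readable form as follows:

$ objdump --dwarf=decodedline tracedprog2

tracedprog2:     file format elf32-i386

Decoded dump of debug contents of section .debug_line:

CU: /home/eliben/eli/eliben-code/debugger/tracedprog2.c:
File name           Line number    Starting address
tracedprog2.c               5           0x8048604
tracedprog2.c               6           0x804860a
tracedprog2.c               9           0x8048613
tracedprog2.c              10           0x804861c
tracedprog2.c               9           0x8048630
tracedprog2.c              11           0x804863c
tracedprog2.c              15           0x804863e
tracedprog2.c              16           0x8048647
tracedprog2.c              17           0x8048653
tracedprog2.c              18           0x8048658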

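It shouldn't be hard to see the correspondence between this information, the C source code and the disassembly dump. Line number 5 points at the entry point of do_stuff, 0x8048604. The next line, 6, is where the debugger should really stop when asked to break in do_stuff, and it points at 0x804860a, which is just past the prologue of the function. This line information easily allows bi-directional mapping between lines and addresses:

·        When asked to place a breakpoint at a certain line, the debugger uses it to find which address it should put its trap on (remember our friend int 3 from the previous article?)

·        When an instruction causes a segmentation fault, the debugger uses it to find the source code line on which it happened.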
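Both directions of this mapping fit in a few lines of C. The following is a minimal sketch, not the way objdump or any debugger actually stores the table: struct line_row, addr_for_line and line_for_addr are hypothetical names, and the rows are copied from the decoded dump above. The sketch has no notion of where the code ends, so any address past the last row is simply attributed to line 18.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of the decoded line table. Names are hypothetical. */
struct line_row {
    unsigned line;
    uint32_t addr;
};

/* Rows copied from the dump above, already sorted by address. */
static const struct line_row rows[] = {
    {  5, 0x8048604 }, {  6, 0x804860a }, {  9, 0x8048613 },
    { 10, 0x804861c }, {  9, 0x8048630 }, { 11, 0x804863c },
    { 15, 0x804863e }, { 16, 0x8048647 }, { 17, 0x8048653 },
    { 18, 0x8048658 },
};
static const size_t nrows = sizeof rows / sizeof rows[0];

/* Line -> address: lowest address attributed to the line, 0 if none. */
static uint32_t addr_for_line(unsigned line)
{
    for (size_t i = 0; i < nrows; i++)
        if (rows[i].line == line)
            return rows[i].addr;
    return 0;
}

/* Address -> line: the last row starting at or before addr, 0 if none. */
static unsigned line_for_addr(uint32_t addr)
{
    unsigned line = 0;
    for (size_t i = 0; i < nrows && rows[i].addr <= addr; i++)
        line = rows[i].line;
    return line;
}
```

A breakpoint on line 6 lands on addr_for_line(6), which is 0x804860a. Going the other way, a fault at 0x8048634 maps back to line 9 through line_for_addr, because line 9 (the for statement) appears twice in the table: once for the initialization and once for the increment and test.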

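libdwarf: working with DWARF programmatically

Using command-line tools to access DWARF information, while useful, isn't fully satisfying. As programmers, we'd like to know how to write actual code that can read the format and extract what we need from it.

Naturally, one approach is to grab the DWARF specification and start hacking away. Now, remember how everyone keeps saying that you should never parse HTML by hand but use a library instead? Well, with DWARF it's even worse. DWARF is much more complex than HTML. What I've shown here is just the tip of the iceberg, and to make things even harder, most of this information is encoded in a very compact and compressed way in the actual object file. So we'll take another road and use a library to work with DWARF. There are two major libraries I'm aware of (plus a few less complete ones):

1.      BFD (libbfd), which is used by the GNU binutils, including objdump, which played a star role in this article, ld (the GNU linker) and as (the GNU assembler).

2.      libdwarf, which together with its big brother libelf is used by the tools on the Solaris and FreeBSD operating systems.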
