2008-08-05 07:48:26
The Art of Software Debugging
-- Master the changes of the six energies, ride the true course of heaven and earth, and wander in the boundless
I. Introduction
In this article we discuss some details of software debugging. Everyone knows
that a professional programmer must master debugging. Many people who know one
or more programming languages think of debugging as just a trick they use now
and then while programming. However, things are not so simple.
Today, new high-level programming languages spring up like bamboo shoots after
a spring rain. This is good news for software engineering; nevertheless, as a
computer science student, I suffer greatly from how much these layers of
encapsulation hide. For example, when I want to output some variables in a C
program, I call the C library function “printf”. But what happens after I call
“printf”? Are the characters sent immediately to the terminal I am watching? Or
are they stored in a buffer and printed on the terminal later?
There are other questions, too. When we call “fprintf”, we must name the output
stream, such as stdout or stderr. So what happens if we call “fprintf” as in
the following code?
……
fprintf(stdout, "Hello, STDOUT\n");
fprintf(stderr, "Hello, STDERR\n");
……
Which statement is printed on the terminal first? Think about it carefully. If
we call “fprintf” on stderr to report that our program has reached a bad state,
we (or at least I) want that message to take priority over anything printed on
stdout, because the code that writes to stderr runs when something has gone
wrong. So what happens when the code above is executed on a platform such as
Windows or Linux? Try it!
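The buffering question raised above can be probed directly. The following is a minimal sketch, assuming a POSIX system; the helper names `bytes_on_disk` and `probe_buffering` are made up for this illustration. It writes a message to a temporary file under a chosen stdio buffering mode and reports how many bytes have really left the process before and after fflush(). The C standard requires that stderr is never fully buffered, while stdout may be, and that difference is what can reorder the two messages.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Bytes that have actually reached the file behind fp, as opposed to
 * bytes still sitting in the stdio buffer. Returns -1 on error. */
static long bytes_on_disk(FILE *fp)
{
    struct stat st;
    if (fstat(fileno(fp), &st) != 0)
        return -1;
    return (long)st.st_size;
}

/* Writes msg to a temporary file using the given buffering mode
 * (_IOFBF or _IONBF) and reports how many bytes were visible before
 * and after an explicit fflush(). Returns 0 on success, -1 on error. */
int probe_buffering(int mode, const char *msg, long *before, long *after)
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;
    if (setvbuf(fp, NULL, mode, BUFSIZ) != 0) {
        fclose(fp);
        return -1;
    }
    fputs(msg, fp);
    *before = bytes_on_disk(fp);   /* what the kernel has seen so far */
    fflush(fp);
    *after = bytes_on_disk(fp);    /* everything has been handed over */
    fclose(fp);
    return 0;
}
```

Calling `probe_buffering(_IOFBF, ...)` with a short message should report nothing written before the flush, whereas `_IONBF` hands every byte over at once, which is how stderr behaves by default.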
So, if we want to know the details of the “printf” function family, there are
usually two ways. One is to find the source code of the C library; we are lucky
today, because the GNU project makes the source of its C library freely
available. The other, which works even without any source code, is to debug at
the assembly-language level. We can debug any function we are interested in.
Please remember that the tools on the two popular platforms, Windows and Linux,
conventionally use two different x86 assembly syntaxes: Windows tools use Intel
syntax, while AT&T syntax is the norm on Linux. There is another important
point. As we step deeper and deeper into a C library function such as “printf”,
we will find that at some point the debugger can no longer follow the code.
This is very common: the library has made a system call and execution has
entered the kernel. From there on we need kernel debugging tools, such as
SoftICE on Windows or KDB on Linux. Kernel debugging is an advanced topic that
we will not discuss in this section, and I am also a stranger in this field.
Even if you are just an application programmer, debugging at the source-code
level is sometimes not enough. For example, the order in which arguments are
passed differs between the C and Pascal calling conventions. We can also see
that many functions in the Linux kernel carry the prefix “__fastcall”. The
prefix means that the first arguments of the function (two, in the Microsoft
x86 __fastcall convention) are passed in registers rather than on the stack. As
we all know, registers are faster to access than memory, so “__fastcall” is
worthy of its name; in effect it is a directive to the compiler. However, if we
want to check what such a directive really does, there is no other way than to
read the generated assembly code very carefully.
II. Available debugging tricks
Anyone who has been programming for even a short time has probably picked up a
few tricks. The most popular ones, I think, are the following: (1) inserting
printf statements; (2) breakpoints; (3) watching memory and registers; (4)
inspecting the call stack; (5) process control, such as single-stepping.
Whether you know it or not, there are other, more advanced debugging tricks. If
you have experience with GDB, the standard user-level debugger on Linux, you
have probably benefited from the “attach” command.
The tricks mentioned above are only guides that help locate the exact
instruction at which the problem shows up. In fact, the information we gather
from the debugger's output or from log files does not lead directly back to the
origin of the bug. So the essence of debugging is one question: how do we
locate the origin of the bug exactly? We regret to say that we cannot answer
this question. That is to say, debugging goes beyond technique. We therefore
consider debugging a kind of art, something that lives in the unconscious mind
and that we cannot put into words.
III. Hardware-level support
Since the i386 processor, the Intel IA-32 (Intel Architecture, 32-bit) CPU
family has had eight debug registers, Dr0~Dr7. Dr4 and Dr5 are reserved; the
others are: (1) four 32-bit address registers, Dr0~Dr3; (2) one 32-bit control
register, Dr7; (3) one 32-bit status register, Dr6, of which only 16 bits are
used. Readers who want to know the detailed use of the debug registers should
read the Intel manual; we do not spend time on it here.
The most important thing to keep in mind is that we can define HARDWARE
breakpoints with the debug registers. What do we gain from this? The trick we
use most often when debugging is to insert breakpoints wherever we are
suspicious. These breakpoints, which we call SOFTWARE breakpoints here, tell
the debugger to replace the original code at that address with an “INT 3”
instruction.
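The byte-patching idea behind a SOFTWARE breakpoint can be sketched on an ordinary byte buffer. This is only an illustration, not how any particular debugger is written: the struct `soft_breakpoint` and both function names are hypothetical, and a real debugger would patch the memory of the traced process instead of a local array. The one hard fact used is that 0xCC is the one-byte x86 encoding of INT 3.

```c
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC  /* one-byte x86 encoding of INT 3 */

/* Hypothetical bookkeeping record for one software breakpoint. */
struct soft_breakpoint {
    size_t  offset;      /* where in the code buffer the patch was made */
    uint8_t saved_byte;  /* original byte, needed to restore the code   */
    int     armed;       /* non-zero while the INT 3 byte is in place   */
};

/* Save the byte at offset and overwrite it with the INT 3 opcode. */
void breakpoint_arm(struct soft_breakpoint *bp, uint8_t *code, size_t offset)
{
    bp->offset = offset;
    bp->saved_byte = code[offset];
    code[offset] = INT3_OPCODE;
    bp->armed = 1;
}

/* Put the original byte back so the instruction can run normally. */
void breakpoint_disarm(struct soft_breakpoint *bp, uint8_t *code)
{
    if (bp->armed) {
        code[bp->offset] = bp->saved_byte;
        bp->armed = 0;
    }
}
```

Only one byte is touched, so the instructions that follow the patched address stay intact, and restoring the saved byte is all that is needed before execution resumes.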
By contrast, with HARDWARE breakpoints the only thing to do is to store the
address of a variable or an instruction in one of the address registers Dr0~Dr3
and set the corresponding attribute bits in Dr7 (Dr6 then reports which
breakpoint fired). Everything else is done by the hardware; the debugger does
not need to replace any original instruction with “INT 3”.
Now let us look at the Data Access Breakpoint feature, which Microsoft VC++
supports. Follow these steps: (1) open the “EDIT” menu; (2) select the
“Breakpoint” option, which opens a dialog; (3) select the “DATA” tab of the
dialog, where we can enter an expression like “ *((int *)0x
The feature described above is an automated tool built into VC++. We can also
manipulate the debug registers by hand. The following code shows this trick:
#define _WIN32_WINNT 0x1000
#include <windows.h>
#include <stdio.h>

int main(int argc, char * argv[])
{
    CONTEXT cxt;
    HANDLE hThread = GetCurrentThread();
    DWORD dwTestVar = 0;

    /* Without a debugger nobody would handle the debug exception. */
    if(!IsDebuggerPresent())
    {
        printf("The sample can only run within a debugger!\n");
        return E_FAIL;
    }
    cxt.ContextFlags = CONTEXT_DEBUG_REGISTERS|CONTEXT_FULL;
    if(!GetThreadContext(hThread, &cxt))
    {
        printf("Failed to get thread context!\n");
        return E_FAIL;
    }
    cxt.Dr0 = (DWORD_PTR)&dwTestVar; /* address to watch */
    cxt.Dr7 = 0xF0001;               /* enable Dr0: 4-byte read/write access */
    if(!SetThreadContext(hThread, &cxt))
    {
        printf("Failed to set thread context!\n");
        return E_FAIL;
    }
    dwTestVar = 1;                   /* this write hits the breakpoint */
    GetThreadContext(hThread, &cxt);
    printf("Break into Debugger with DR6 = %x!\n", (unsigned int)cxt.Dr6);
    return S_OK;
}
Note: The original version of the above program was written by 张银奎 (Zhang
Yinkui), who works for Intel in Shanghai; the code shown here has been slightly
modified. The program runs on the Windows platform.
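The constant 0xF0001 written to Dr7 in the program above is easier to read once it is split into its fields. The sketch below assumes the Dr7 layout given in the Intel manual: the local enable bit for breakpoint n at bit 2n, the access-type field R/Wn at bit 16+4n, and the length field LENn at bit 18+4n. The enum and function names are invented for this example.

```c
#include <stdint.h>

/* Access types for the R/Wn field of Dr7. */
enum hwbp_type { HWBP_EXECUTE = 0, HWBP_WRITE = 1, HWBP_READWRITE = 3 };

/* Encodings for the LENn field of Dr7: 1, 2 or 4 bytes. */
enum hwbp_len { HWBP_LEN1 = 0, HWBP_LEN2 = 1, HWBP_LEN4 = 3 };

/* Returns dr7 with breakpoint slot (0..3) locally enabled for the
 * given access type and length; other slots are left untouched. */
uint32_t dr7_enable_local(uint32_t dr7, int slot,
                          enum hwbp_type type, enum hwbp_len len)
{
    uint32_t field_shift = 16u + 4u * (uint32_t)slot;

    dr7 &= ~(0xFu << field_shift);               /* clear old R/Wn and LENn */
    dr7 |= (uint32_t)type << field_shift;        /* R/Wn: kind of access    */
    dr7 |= (uint32_t)len << (field_shift + 2u);  /* LENn: watched length    */
    dr7 |= 1u << (2u * (uint32_t)slot);          /* Ln: local enable bit    */
    return dr7;
}
```

With slot 0, read/write access and a 4-byte length, the function yields 0xF0001, the value used in the sample.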
IV. Kernel-level support
The previous section described the SOFTWARE breakpoint: when we insert one into
our program, the debugger replaces the original instruction with an “INT 3”
instruction.
First, let us look at the data structure task_struct; each instance represents
one process in a Linux system. You can find its full definition in the file
include/linux/sched.h of the Linux source tree; we do not reproduce it here. We
concentrate on just one field, PTRACE, whose type is unsigned long.
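Recall that issuing a SOFTWARE breakpoint simply forces the CPU to execute an
“INT 3” instruction. The resulting trap reaches the traced process as a SIGTRAP
signal, and the kernel's signal-delivery code then runs the following fragment: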
if ((current->ptrace & PT_PTRACED) && signr != SIGKILL) {
    /* Let the debugger run. */
    current->exit_code = signr;
    current->state = TASK_STOPPED;
    notify_parent(current, SIGCHLD);
    schedule();

    /* We're back. Did the debugger cancel the sig? */
    if (!(signr = current->exit_code))
        continue;
    current->exit_code = 0;

    /* The debugger continued. Ignore SIGSTOP. */
    if (signr == SIGSTOP)
        continue;

    /* Update the siginfo structure. Is this good? */
    if (signr != info.si_signo) {
        info.si_signo = signr;
        info.si_errno = 0;
        info.si_code = SI_USER;
        info.si_pid = current->p_pptr->pid;
        info.si_uid = current->p_pptr->uid;
    }

    /* If the (new) signal is now blocked, requeue it. */
    if (sigismember(&current->blocked, signr)) {
        send_sig_info(signr, &info, current);
        continue;
    }
}
We can see that the traced process notifies its parent with the signal SIGCHLD.
The function “notify_parent” then calls “do_notify_parent”, which does most of
the work. Let's look at the last two lines of “do_notify_parent”:
send_sig_info(sig, &info, tsk->p_pptr);
wake_up_parent(tsk->p_pptr);
The parameter tsk->p_pptr is a pointer to the parent process of the “tsk”
process. The function “wake_up_parent” wakes up the parent, which is typically
sleeping in a wait call, so that the scheduler can run it. The parent, which is
the debugger process, can then obtain detailed information about the traced
process, such as its current context. This is easy to do, because the kernel
keeps the parent and child process descriptors (“task_struct”, as we know)
linked to each other, and everything we want to know about the child can be
found through its descriptor. Furthermore, the debugger process can choose to
send a signal back to the traced process; as a result, the traced process
terminates or continues according to the signal it receives.
Many programmers misunderstand how debugging works. I used to believe that when
a debugger starts debugging another process, all the information displayed in
front of me, such as the call stack and register values, is forcibly extracted
from the address space of the traced process by the debugger. Now we know that
in the Linux kernel the debugger process does not learn of a debugging event
until the traced process notifies it with a SIGCHLD signal. So, in the kernel,
the debugging facility is a cooperation between the debugger process and the
traced process, not just the debugger forcing the traced process to do
something.
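The cooperation described above can be observed from user space with a few lines of ptrace code. This is a minimal sketch, assuming a Linux system where ptrace is permitted; `struct trace_result` and `trace_once` are names invented here. The child volunteers to be traced and stops itself; the parent learns about the stop only through waitpid(), then lets the child continue.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Result of one tiny tracing session (hypothetical helper type). */
struct trace_result {
    int stop_signal;  /* signal reported while the child was stopped   */
    int exit_code;    /* exit status of the child after it was resumed */
};

/* Forks a child that asks to be traced and stops itself, waits for the
 * stop notification, resumes the child and collects its exit status.
 * Returns 0 on success, -1 on any failure. */
int trace_once(struct trace_result *out)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {                       /* child: the traced process */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);                   /* stop and notify the parent */
        _exit(7);                         /* runs only after PTRACE_CONT */
    }

    /* parent: the debugger; it sleeps here until the child stops */
    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;
    out->stop_signal = WSTOPSIG(status);

    /* Resume the child; a data argument of 0 suppresses the signal. */
    if (ptrace(PTRACE_CONT, pid, NULL, NULL) < 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    out->exit_code = WEXITSTATUS(status);
    return 0;
}
```

The parent never polls the child: it is woken by the kernel when the child stops, which is the user-space face of the notify_parent path discussed in this section.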
We do not explain the Linux kernel source in more detail here, because our
topic is software debugging, not the execution flow of any particular operating
system's kernel. Linux is only the platform we happen to work on; the same jobs
can be done on Windows. However, the Linux source code is available to anyone
with an Internet connection, who can download it from the official website for
free.
Do not forget the PTRACE field of “task_struct” mentioned earlier. Its type is
unsigned long, yet in our discussion we have only seen it used to indicate one
of two states of a process, TRACED or NON-TRACED. For that purpose a single bit
would be enough; C has no standalone one-bit type, though, so the flag occupies
one bit of a full word and is tested with a bit mask.
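Keeping a yes/no state as one bit of a word looks like this in plain C. The sketch is generic: `struct proc`, the `FLAG_` constants and the three helpers are hypothetical names chosen for the example and are not the kernel's definitions.

```c
/* Hypothetical flag bits packed into one unsigned long word. */
#define FLAG_TRACED   0x1UL   /* the process is being traced */
#define FLAG_TRACESYS 0x2UL   /* room for further, unrelated flags */

struct proc {
    unsigned long flags;
};

/* Set the TRACED bit without disturbing the other bits. */
void proc_set_traced(struct proc *p)
{
    p->flags |= FLAG_TRACED;
}

/* Clear the TRACED bit without disturbing the other bits. */
void proc_clear_traced(struct proc *p)
{
    p->flags &= ~FLAG_TRACED;
}

/* Returns 1 if the TRACED bit is set, 0 otherwise. */
int proc_is_traced(const struct proc *p)
{
    return (p->flags & FLAG_TRACED) != 0;
}
```

A whole word costs a few extra bytes per process, but it leaves space for more flags later and can be tested with a single AND instruction.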
To support debugging, the Linux kernel has to handle a good deal of extra
logic, which adds a considerable amount of code to the kernel, along with the
PTRACE field in the process descriptor that indicates whether the process is
being traced or debugged. You can see that this design in Linux is quite
ingenious. Do you appreciate it?
V. Two advanced topics: Parallel-program debug and Kernel-image debug
If you have written parallel programs, you may have found that they are very
difficult to debug. Why? Because there is considerable potential for
introducing subtle timing faults, and once we introduce one, it takes a long
time to locate. For example, look at the following code (assume the C file is
named threadExample.c):
 1 #include <stdio.h>
 2 #include <stdlib.h>
 3 #include <unistd.h>
 4 #include <pthread.h>
 5 #define NR_THREAD 5
 6 void * start_routine(void *);
 7 int main(int argc, char ** argv)
 8 {
 9     pthread_t tid[NR_THREAD];
10     int i = 0, res = 0;
11     for(i = 0; i < NR_THREAD; i++)
12     {
13         res = pthread_create(&tid[i], NULL, start_routine, (void *)&i);
14         //res = pthread_create(&tid[i], NULL, start_routine, (void *)(long)i);
15         if(res != 0)
16         {
17             perror("pthread_create error");
18             exit(-1);
19         }
20     }
21     for(i = 0; i < NR_THREAD; i++)
22     {
23         res = pthread_join(tid[i], NULL);
24         if(res != 0)
25         {
26             perror("pthread_join error");
27             exit(-1);
28         }
29     }
30     return 0;
31 }
32 void * start_routine(void * args)
33 {
34     int id = *((int *)args);
35     //int id = (int)(long)args;
36     printf("Thread id is %d\n", id);
37     return NULL;
38 }
Now you type “./threadExample” on the command line. What will happen? I think
you expect all the threads created by the main thread to print “id” in
ascending order. However, the results almost always differ from what you
expect. Why does this happen? We seem to have done every step properly. Where
is the offending line? Maybe this reminds you of GDB, but GDB is not much help
here, and you must examine the source code again and again to spot the
offending code. What a painful journey!
At last, if you are lucky enough, you locate the bug on Line 13. There, every
thread is started with the same argument: a pointer to the single variable “i”
on the main thread's stack. Why does this introduce a subtle bug? Imagine that
the main thread runs very fast, or is scheduled ahead of some of the
sub-threads. The main thread may then change the local variable “i” before a
sub-thread has read it, and when that sub-thread dereferences the pointer it
simply obtains the modified value. So, to give each thread its own id, we
should replace the original lines with the commented-out ones (Lines 14 and
35). This time each sub-thread receives a copy of the value instead of a
pointer to shared storage, so no thread's argument conflicts with another's.
(The order in which the threads print still depends on the scheduler.)
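Another common way to avoid the shared-pointer bug, besides passing the value itself, is to give every thread its own argument slot that nobody else writes while the thread is running. The following is a minimal sketch of that variant; `struct worker_arg`, `worker` and `run_workers` are illustrative names, not part of the program above.

```c
#include <pthread.h>

#define NR_THREAD 5

/* One private argument slot per thread. */
struct worker_arg {
    int id;    /* value handed to the thread     */
    int seen;  /* value the thread actually read */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    arg->seen = arg->id;   /* reads only its own slot */
    return NULL;
}

/* Starts NR_THREAD workers, joins them and copies the id each one saw
 * into seen_ids. Returns 0 on success, -1 if a pthread call failed. */
int run_workers(int seen_ids[NR_THREAD])
{
    pthread_t tid[NR_THREAD];
    struct worker_arg args[NR_THREAD];
    int i, started = 0, rc = 0;

    for (i = 0; i < NR_THREAD; i++) {
        args[i].id = i;
        args[i].seen = -1;
        seen_ids[i] = -1;
    }
    for (i = 0; i < NR_THREAD; i++) {
        /* &args[i] is different for every thread, unlike &i */
        if (pthread_create(&tid[i], NULL, worker, &args[i]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }
    for (i = 0; i < started; i++) {
        if (pthread_join(tid[i], NULL) != 0)
            rc = -1;
        seen_ids[i] = args[i].seen;
    }
    return rc;
}
```

The array must stay alive until every thread has been joined, which is why it lives in the same function that performs the joins.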
Besides timing faults, there is another issue we often neglect when writing
parallel programs: the memory model of modern processors. We do not discuss the
memory model in detail; here it is enough to keep in mind that memory in a
modern machine is organized as a hierarchy. The closer a level of memory is to
the processor, the faster it is to access and the more it costs, and vice
versa.
If you write your program without careful attention to the memory model, you
will run into something subtle that resembles a timing fault. To understand
this, look at the following code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#define NR_THREAD 1
void * start_routine(void *);
int globalvar = 0;
int main(int argc, char ** argv)
{
    pthread_t tid[NR_THREAD];
    int i = 0, res = 0;
    for(i = 0; i < NR_THREAD; i++)
    {
        res = pthread_create(&tid[i], NULL, start_routine, NULL);
        if(res != 0)
        {
            perror("pthread_create error");
            exit(-1);
        }
    }
    printf("Hello GlobalVar!\n");
    while(globalvar == 0)
        continue;
    printf("Goodbye GlobalVar!\n");
    for(i = 0; i < NR_THREAD; i++)
    {
        res = pthread_join(tid[i], NULL);
        if(res != 0)
        {
            perror("pthread_join error");
            exit(-1);
        }
    }
    return 0;
}
void * start_routine(void * args)
{
    sleep(3);
    globalvar = 1;
    return NULL;
}
/* compiling command: gcc -g -O2 threadExample.c -o threadExample */
What you expect is that the main thread exits as usual, but in fact the program
stays in the system forever unless it receives SIGKILL, SIGTERM or the like. To
find out what is really going on, we must look at its assembly code:
0x08048584
0x08048589
0x08048590
0x08048592
I extract only the most relevant lines of code here; 0x8049834 is the address
of the global variable “globalvar”. We can see that “globalvar” is loaded into
the eax register once, and afterwards the main thread only tests the cached
value in eax, not the value in memory. The sub-thread does modify “globalvar”
in memory, but that change is never seen by the main thread. So what is the
problem? Why does the main thread test only the cached value and not the
original in memory? Notice that we compiled with “gcc -g -O2 threadExample.c -o
threadExample”. Here we told GCC to optimize our program. GCC therefore keeps
frequently used variables in registers, and since nothing in the loop tells it
that another thread may change “globalvar”, it assumes the value stays the
same, so the main thread can never leave the loop. How can we tell GCC that a
variable must be reloaded every time it is accessed? It is very simple: add the
keyword “volatile” in front of the variable's definition.
Is the game over now? No. There is another funny thing that will attract every
curious programmer's attention. If we type “gcc -g threadExample.c -o
threadExample” on the command line, then even though we have not forced GCC to
reload “globalvar” on every access, the main thread exits normally. Again,
let's look at its assembly code:
0x08048584
0x08048589
0x0804858b
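Here we see that “globalvar” is also loaded into eax, but on every iteration
the main thread rereads its value from memory rather than reusing eax, so it
exits without anything abnormal. Even so, if a variable is accessed by two or
more threads, do not rely on the optimization level: it is a good convention to
define it with the keyword “volatile”. Note that “volatile” only forces the
reload; it gives neither atomicity nor ordering guarantees, so a C11 atomic or
a mutex is the more robust choice.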
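For comparison, here is the same wait-for-a-flag pattern written with C11 atomics instead of a plain or volatile int. This is a sketch under the assumption that the compiler supports `<stdatomic.h>`; the names `ready_flag`, `setter` and `wait_for_flag` are invented for the example. An atomic load cannot be hoisted out of the loop, at any optimization level, and the store in the other thread is guaranteed to become visible.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Flag shared between the two threads. */
static atomic_int ready_flag;

static void *setter(void *unused)
{
    (void)unused;
    atomic_store(&ready_flag, 1);   /* publish the new value */
    return NULL;
}

/* Starts a thread that sets the flag, spins until the flag is seen,
 * then joins the thread. Returns the flag value (1) or -1 on error. */
int wait_for_flag(void)
{
    pthread_t tid;

    atomic_store(&ready_flag, 0);
    if (pthread_create(&tid, NULL, setter, NULL) != 0)
        return -1;
    while (atomic_load(&ready_flag) == 0)
        ;   /* each iteration performs a real load from memory */
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return atomic_load(&ready_flag);
}
```

A busy loop like this is fine for a demonstration; in real code a condition variable would avoid burning CPU time while waiting.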
We move on to say something about kernel-image debugging. What is “Kernel-image
debug”? Whenever we boot our operating system from a disk, the loader
eventually copies all the code and data needed to run the system from disk into
memory, so the running system in memory, kernel included, is just an image of
code and data stored on disk. “Kernel-image debug” means debugging the running
kernel in memory. There are tools to help us do this, such as KDB on Linux.
When doing this job we must be very careful to keep everything the running
kernel relies on consistent; if not, the whole system will crash before long. I
am sorry to say that I am not familiar with KDB, so we will come back to this
topic after I have mastered the tool.
VI. A sample debugger
On Linux, our own programs can use an interface function named “ptrace”, which
is declared in /usr/include/sys/ptrace.h. From the manual pages we learn that
whenever we want our program to trace another program (another process,
strictly speaking), all we have to do is issue a request through the ptrace
function and then check whether the request was fulfilled. Of course, we need
some other system calls to make our program efficient and robust. A sample
debugger implemented with the Linux ptrace interface accompanies this article;
you can find the source code in the Appendix section.
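Before reading the full debugger, it may help to see a single ptrace request in isolation. The sketch below, which assumes Linux and uses the invented names `shared_word` and `peek_child_word`, reads one word from the memory of a stopped child with PTRACE_PEEKDATA. Because the child is created with fork(), the variable has the same virtual address in both processes, but the parent's own copy stays 0, so the value returned can only have come from the child.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* After fork() this variable lives at the same address in both
 * processes, but each process has its own private copy. */
static volatile long shared_word;

/* The child stores value into its copy of shared_word and stops; the
 * parent reads that word from the child's memory through ptrace.
 * Returns 0 and fills *out on success, -1 on failure. */
int peek_child_word(long value, long *out)
{
    int status;
    long word;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {                          /* child */
        shared_word = value;
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);                      /* wait to be inspected */
        _exit(0);
    }

    if (waitpid(pid, &status, 0) < 0 || !WIFSTOPPED(status))
        return -1;

    errno = 0;                               /* -1 can be a valid word */
    word = ptrace(PTRACE_PEEKDATA, pid, (void *)&shared_word, NULL);
    if (word == -1 && errno != 0)
        word = 0, status = -1;
    else
        status = 0;

    kill(pid, SIGKILL);                      /* the child is no longer needed */
    waitpid(pid, NULL, 0);
    if (status == 0)
        *out = word;
    return status;
}
```

Clearing errno before the call matters: PTRACE_PEEKDATA returns the word itself, so a return value of -1 is an error only when errno has been set.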
From this code you can see that it is not difficult to design and implement a
useful debugger. Here we must digress for a moment. On my Linux system (SUSE
10.1), there is a strange inconsistency between the manual pages and the glibc
header files regarding the ptrace interface. Some requests explicitly
documented in the manual pages, such as PTRACE_SETOPTIONS and
PTRACE_GETEVENTMSG, cannot be found in /usr/include/sys/ptrace.h, the header
that defines the ptrace interface, which means we cannot name these requests
when calling the ptrace function. However, if we dig into the Linux kernel, we
find these requests defined in /usr/src/linux-x.x.x/include/linux/ptrace.h,
along with a function prototype like this:
extern int ptrace_request(struct task_struct *child, long request, long addr, long data);
It seems that when doing kernel development we can call this function in place
of the ptrace function that is exported to user space. So why do the manual
pages differ from the glibc header files? I don't know. If any reader knows,
please send me an e-mail. Thank you!
VII. Conclusion
Nowadays we can get all kinds of development tools from the Internet, including
debugging tools. However, even the most convenient tool is only an aid that
helps us deal with problems more efficiently. In essence, the answers to the
questions of how to track down the cause of a problem and how to master any
tool available to us are still left for us, human beings, to seek. On the way
of pursuing those answers without rest, we improve ourselves again and again,
and this is what makes us human beings, neither animals nor other things.
VIII. Appendix
/**
 * file: trace.c
 * author: XXX
 * date:
 *
 * note: Although we define a function named "getMainEntryPoint", the
 *       address it gets is not the entry point of the main() function but
 *       a virtual address that is executed before main(). Furthermore,
 *       there are still some bugs to be fixed. However, my free time is
 *       limited, and I would really like some reader to fix the bugs for
 *       me. You are free to modify any line of the code below. If you do,
 *       please send an e-mail to notify me. My e-mail address is
 *       zhucj041070075@gmail.com; I'm looking forward to your letters.
 *
 *       The code targets 32-bit x86 Linux (it uses Elf32_Ehdr and eip).
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>
#include <elf.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/user.h>

#define INPUTLINE 64

void usage();
void command();
void ptraceErrCheck(int res, enum __ptrace_request req);
int getUserRegs(pid_t pid, struct user_regs_struct * regs, int verbose);
void setUserRegs(pid_t pid, struct user_regs_struct * regs);
char ** CreateExecArgv(int argc, char ** argv);
void FreeExecArgv(int argc, char ** argv);
void calAddress(char * comm, unsigned long * bpAddr);
int getMainEntryPoint(FILE * Elf_fp, unsigned long * bpAddr);
int main(int argc, char ** argv)
{
    if(argc < 2) {
        usage();
        return 0;
    }
    FILE * fp;
    int stat_loc; long res; char comm[INPUTLINE];
    pid_t pid;
    long oldInstruct = 0, newInstruct;
    int IsInterrupted = 0;
    unsigned long bpAddr; //main entrypoint
    struct user_regs_struct regs;
    memset(&regs, 0, sizeof(struct user_regs_struct));

    if((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "File not exist!\n");
        return -1;
    }
    else {
        int ret = getMainEntryPoint(fp, &bpAddr); //get Main EntryPoint
        switch(ret) {
        case 0:
            fclose(fp);
            break;
        case -1: //file read error
            fprintf(stdout, "file read error!\n");
            fclose(fp);
            return -1;
        case -2: //data format error
            fprintf(stdout, "data format error!\n");
            fclose(fp);
            return -1;
        }
    }
    char ** ExecArgv = CreateExecArgv(argc, argv);
    if(!ExecArgv) {
        fprintf(stderr, "Failed to allocate memory!\n");
        return -1;
    }
TRACEHERE:
    pid = fork();
    if(pid < 0) {
        fprintf(stderr, "Fork error!\n");
        return -1;
    }
    else if(pid == 0) { //child
        int ret = ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        ptraceErrCheck(ret, PTRACE_TRACEME);
        execvp(ExecArgv[0], ExecArgv);
        perror("execvp"); //only reached if execvp fails
        _exit(127);
    }
    else { //parent
        res = waitpid(pid, &stat_loc, 0);
        fprintf(stdout, "Begin to trace program %s!\n", ExecArgv[0]);
        while(1) {
            fprintf(stdout, "Command: ");
            if(fgets(comm, INPUTLINE, stdin) == NULL) //end of input
                break;
            if(comm[0] == 'h') { //help
                command();
            }
            else if(comm[0] == 'b') { //break
                IsInterrupted = 1;
                calAddress(comm, &bpAddr);
                errno = 0; //PEEKTEXT may legitimately return -1
                oldInstruct = ptrace(PTRACE_PEEKTEXT, pid, bpAddr, NULL);
                ptraceErrCheck(oldInstruct, PTRACE_PEEKTEXT);
                /**
                 * Replace only the lowest byte of the word (the byte at
                 * bpAddr on little-endian x86) with 0xCC (INT 3). Writing
                 * 0xcccccccc would also destroy the following code bytes.
                 */
                newInstruct = (oldInstruct & ~0xffL) | 0xcc;
                res = ptrace(PTRACE_POKETEXT, pid, bpAddr, newInstruct);
                ptraceErrCheck(res, PTRACE_POKETEXT);
                res = ptrace(PTRACE_CONT, pid, NULL, NULL);
                ptraceErrCheck(res, PTRACE_CONT);
                waitpid(pid, &stat_loc, 0);
                if(WIFSTOPPED(stat_loc)) {
                    int signal = WSTOPSIG(stat_loc);
                    if(signal == SIGTRAP)
                        fprintf(stdout, "breakpoint at 0x%lx \n", bpAddr);
                    else
                        fprintf(stdout, "Program %s interrupted by signal %d\n",
                                ExecArgv[0], signal);
                    getUserRegs(pid, &regs, 1);
                }
            }
            else if(comm[0] == 'r') { //run
                res = ptrace(PTRACE_CONT, pid, NULL, NULL);
                ptraceErrCheck(res, PTRACE_CONT);
                res = waitpid(pid, &stat_loc, 0);
                fprintf(stdout, "Program %s exit with code %d\n",
                        ExecArgv[0], WEXITSTATUS(stat_loc));
                goto TRACEHERE;
            }
            else if(comm[0] == 'c') { //continue
                if(!IsInterrupted) {
                    fprintf(stdout, "program %s is not being running!\n", ExecArgv[0]);
                    continue;
                }
                IsInterrupted = 0;
                res = getUserRegs(pid, &regs, 0); //get the context of the traced process
                /**
                 * The x86 instruction CC (INT 3) causes a trap, after which
                 * eip points to the next instruction. So here we must
                 * decrease eip to make it point to the trapping instruction.
                 */
                if(res) {
                    continue;
                }
                regs.eip--;
                setUserRegs(pid, &regs); //write the context back to the traced process
                res = ptrace(PTRACE_POKETEXT, pid, bpAddr, oldInstruct);
                ptraceErrCheck(res, PTRACE_POKETEXT);
                res = ptrace(PTRACE_CONT, pid, NULL, NULL);
                ptraceErrCheck(res, PTRACE_CONT);
                waitpid(pid, &stat_loc, 0);
                if(WIFEXITED(stat_loc)) {
                    fprintf(stdout, "Program %s exit with code %d\n",
                            ExecArgv[0], WEXITSTATUS(stat_loc));
                }
                else if(WIFSTOPPED(stat_loc)) {
                    fprintf(stdout, "Program %s interrupted by signal %d\n",
                            ExecArgv[0], WSTOPSIG(stat_loc));
                }
                goto TRACEHERE;
            }
            else if(comm[0] == 'k') { //kill
                res = ptrace(PTRACE_KILL, pid, NULL, NULL);
                ptraceErrCheck(res, PTRACE_KILL);
                waitpid(pid, &stat_loc, 0); //reap the killed child
                fprintf(stdout, "program %s terminated!\n", ExecArgv[0]);
                goto TRACEHERE;
            }
            else if(comm[0] == 'q') { //quit
                res = ptrace(PTRACE_KILL, pid, NULL, NULL);
                ptraceErrCheck(res, PTRACE_KILL);
                fprintf(stdout, "Tracer quit!\n");
                break;
            }
            else {
                fprintf(stderr, "Unknown Command!\n");
            }
        }
    }
    FreeExecArgv(argc, ExecArgv);
    return 0;
}
void usage()
{
    fprintf(stdout, " usage: trace [filename] [parameters]\n");
}
void command()
{
    fprintf(stdout, " command usage: b(break) *addr\n"
                    " : c(continue) \n"
                    " : r(run) \n"
                    " : h(help) \n"
                    " : k(kill) \n"
                    " : q(quit) \n");
}
void ptraceErrCheck(int res, enum __ptrace_request req)
{
    if(res < 0 && errno != 0) {
        perror("PTRACE error");
        switch(req) {
        case PTRACE_KILL: //a failed kill is not fatal for the tracer
            return;
        default:
            exit(-1);
        }
    }
}
int getUserRegs(pid_t pid, struct user_regs_struct * regs, int verbose)
{
    long res = ptrace(PTRACE_GETREGS, pid, NULL, (void *)regs);
    if(res < 0 && errno != 0) {
        perror("PTRACE error");
        return -1;
    }
    if(verbose) {
        /* every field is cast so that one format fits all field types */
        fprintf(stdout, "registers information:\n");
        fprintf(stdout, " eax: 0x%lx\n", (unsigned long)regs->eax);
        fprintf(stdout, " ecx: 0x%lx\n", (unsigned long)regs->ecx);
        fprintf(stdout, " edx: 0x%lx\n", (unsigned long)regs->edx);
        fprintf(stdout, " ebx: 0x%lx\n", (unsigned long)regs->ebx);
        fprintf(stdout, " esp: 0x%lx\n", (unsigned long)regs->esp);
        fprintf(stdout, " ebp: 0x%lx\n", (unsigned long)regs->ebp);
        fprintf(stdout, " esi: 0x%lx\n", (unsigned long)regs->esi);
        fprintf(stdout, " edi: 0x%lx\n", (unsigned long)regs->edi);
        fprintf(stdout, " eip: 0x%lx"
                        "<--Here eip points to the next instruction\n",
                (unsigned long)regs->eip);
        fprintf(stdout, " eflags: 0x%lx\n", (unsigned long)regs->eflags);
        fprintf(stdout, " cs: 0x%lx\n", (unsigned long)regs->cs);
        fprintf(stdout, " ss: 0x%lx\n", (unsigned long)regs->ss);
        fprintf(stdout, " ds: 0x%lx\n", (unsigned long)regs->ds);
        fprintf(stdout, " es: 0x%lx\n", (unsigned long)regs->es);
        fprintf(stdout, " fs: 0x%lx\n", (unsigned long)regs->fs);
        fprintf(stdout, " gs: 0x%lx\n", (unsigned long)regs->gs);
    }
    return 0;
}
void setUserRegs(pid_t pid, struct user_regs_struct * regs)
{
    long res = ptrace(PTRACE_SETREGS, pid, NULL, (void *)regs);
    if(res < 0 && errno != 0) {
        perror("PTRACE error");
        exit(-1);
    }
}
void calAddress(char * comm, unsigned long * bpAddr)
{
    //example: comm = b 0xffffffff
    //we must strip everything in front of the "0x" prefix
    //here we do not check the validity of the address
    char * index = comm;
    while(*index != '\0') {
        if(!strncmp(index, "0x", 2)) {
            //strtoul, not strtol: 0xffffffff does not fit in a 32-bit long
            *bpAddr = strtoul(index + 2, NULL, 16);
            printf("%lx\n", *bpAddr);
            return;
        }
        index++;
    }
}
int getMainEntryPoint(FILE * Elf_fp, unsigned long * bpAddr)
{
    Elf32_Ehdr elf_header;
    if(fread(&elf_header, sizeof(Elf32_Ehdr), 1, Elf_fp) != 1)
        return -1; //file read error
    unsigned char * field = (unsigned char *)&elf_header.e_entry;
    switch(elf_header.e_ident[EI_DATA]) {
    case ELFDATA2LSB: //little-endian file
        *bpAddr = ((unsigned long)(field[0]))
                | (((unsigned long)(field[1])) << 8)
                | (((unsigned long)(field[2])) << 16)
                | (((unsigned long)(field[3])) << 24);
        return 0; //success
    case ELFDATA2MSB: //big-endian file
        *bpAddr = ((unsigned long)(field[3]))
                | (((unsigned long)(field[2])) << 8)
                | (((unsigned long)(field[1])) << 16)
                | (((unsigned long)(field[0])) << 24);
        return 0; //success
    default:
        fprintf(stderr, "Unknown data format!\n");
        return -2; //data format error
    }
}
char ** CreateExecArgv(int argc, char ** argv)
{
    //argc - 1 strings plus the terminating NULL pointer
    char ** execArgv = (char **)malloc(sizeof(char *) * argc);
    if(!execArgv) {
        return NULL;
    }
    int i; size_t size; char * index;
    for(i = 1; i < argc; ++i) {
        size = strlen(argv[i]) + 1; //strlen, not sizeof: we need the string length
        if(i == 1)
            size += 2; //room for a possible "./" prefix
        execArgv[i - 1] = (char *)malloc(sizeof(char) * size);
        if(!execArgv[i - 1]) {
            return NULL;
        }
        index = execArgv[i - 1];
        if(i == 1) {
            if(argv[i][0] != '/' && argv[i][0] != '.') {
                strcpy(execArgv[i - 1], "./");
                index = execArgv[i - 1] + 2;
            }
        }
        strcpy(index, argv[i]);
    }
    execArgv[argc - 1] = (char *)0;
    return execArgv;
}
void FreeExecArgv(int argc, char ** argv)
{
    int i;
    for(i = 0; i < argc; ++i) {
        if(argv[i]) //free only the entries that were allocated
            free(argv[i]);
    }
    free(argv);
}