The GNU Profiler Program
The GNU profiler (gprof) is another program included in the binutils package. This program is used
to analyze program execution and determine where “hot spots” are in the application.
The application hot spots are functions that require the most processing time as the program
runs. Often, they are the most mathematically intensive functions, but that is not always the case.
Functions that are I/O intensive can also increase processing time.
This section describes the GNU profiler, and provides a simple demonstration that illustrates how it is
used in a C program to view how much time different functions consume in an application.
Using the profiler
As with all the other tools, gprof is a command-line program that uses multiple parameters to control
its behavior. The command-line format for gprof is as follows:
gprof [ -[abcDhilLsTvwxyz] ] [ -[ACeEfFJnNOpPqQZ][name] ]
[ -I dirs ] [ -d[num] ] [ -k from/to ]
[ -m min-count ] [ -t table-length ]
[ --[no-]annotated-source[=name] ]
[ --[no-]exec-counts[=name] ]
[ --[no-]flat-profile[=name] ] [ --[no-]graph[=name] ]
[ --[no-]time=name] [ --all-lines ] [ --brief ]
[ --debug[=level] ] [ --function-ordering ]
[ --file-ordering ] [ --directory-path=dirs ]
[ --display-unused-functions ] [ --file-format=name ]
[ --file-info ] [ --help ] [ --line ] [ --min-count=n ]
[ --no-static ] [ --print-path ] [ --separate-files ]
[ --static-call-graph ] [ --sum ] [ --table-length=len ]
[ --traditional ] [ --version ] [ --width=n ]
[ --ignore-non-functions ] [ --demangle[=STYLE] ]
[ --no-demangle ] [ image-file ] [ profile-file ... ]
This alphabet soup of parameters is split into three groups:
❑ Output format parameters
❑ Analysis parameters
❑ Miscellaneous parameters
The output format options, described in the following table, enable you to modify the output produced
by gprof.
Parameter              Description
-A                     Display source code for all functions, or just the functions specified
-b                     Don’t display verbose output explaining the analysis fields
-C                     Display a total tally of all functions, or only the functions specified
-i                     Display summary information about the profile data file
-I                     Specifies a list of search directories to find source files
-J                     Do not display annotated source code
-L                     Display full pathnames of source filenames
-p                     Display a flat profile for all functions, or only the functions specified
-P                     Do not print a flat profile for all functions, or only the functions specified
-q                     Display the call graph analysis
-Q                     Do not display the call graph analysis
-y                     Generate annotated source code in separate output files
-Z                     Do not display a total tally of functions and number of times called
--function-ordering    Display suggested reordering of functions based on analysis
--file-ordering        Display suggested object file reordering based on analysis
-T                     Display output in traditional BSD style
-w                     Set the width of output lines
-x                     Annotate every line of the annotated source code, not just the first line of each basic block
--demangle             C++ symbols are demangled when displaying output
The analysis parameters, described in the following table, modify the way gprof analyzes the data contained
in the analysis file.
Parameter    Description
-a           Does not analyze information about statically declared (private) functions
-c           Analyze information on child functions that were never called in the program
-D           Ignore symbols that are not known to be functions (only on Solaris and HP OSs)
-k           Don’t analyze functions matching a beginning and ending symspec
-l           Analyze the program by line instead of function
-m           Analyze only functions called more than a specified number of times
-n           Analyze only times for specified functions
-N           Don’t analyze times for the specified functions
-z           Analyze all functions, even those that were never called
Finally, the miscellaneous parameters, described in the following table, are parameters that modify the
behavior of gprof, but don’t fit into either the output or analysis groups.
Parameter    Description
-d           Put gprof in debug mode, specifying a numerical debug level
-O           Specify the format of the profile data file
-s           Force gprof to just summarize the data in the profile data file
-v           Print the version of gprof
In order to use gprof on an application, you must ensure that the functions you want to monitor are
compiled using the -pg parameter. This parameter compiles the source code, inserting a call to the
mcount subroutine for each function in the program. When the application is run, the mcount subroutine
creates a call graph profile file, called gmon.out, which contains timing information for each function in
the application.
Be careful when running the application, as each run overwrites the gmon.out file. If you want to
take multiple samples, rename gmon.out after each run so that every sample has its own filename, and
then include the name of the profile file you want to analyze on the gprof command line, after the
name of the executable.
After the program to test finishes, the gprof program is used to examine the call graph profile file to
analyze the time spent in each function. The gprof output contains three reports:
❑ A flat profile report, which lists total execution times and call counts for all functions
❑ A listing of functions sorted by the time spent in each function and its children
❑ A listing of cycles, showing the members of the cycles and their call counts
By default, the gprof output is directed to the standard output of the console. You must redirect it to a
file if you want to save it.
A profile example
To demonstrate the gprof program, you must have a high-level language program that uses functions to
perform actions. I created the following simple demonstration program in C, called demo.c, to demonstrate
the basics of gprof:
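#include <stdio.h>

/* Adds up the loop counter over 100,000 iterations. */
void function1(void)
{
    int i;
    long long j = 0;    /* initialized, and wide enough to hold the sum */

    for (i = 0; i < 100000; i++)
        j += i;
}

/* Calls function1(), then runs a 200,000-iteration loop of its own. */
void function2(void)
{
    int i, j;

    function1();
    for (i = 0; i < 200000; i++)
        j = i;
}

int main(void)
{
    int i;

    for (i = 0; i < 100; i++)
        function1();
    for (i = 0; i < 50000; i++)
        function2();
    return 0;
}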
This is about as simple as it gets. The main program has two loops: one that calls function1() 100
times, and one that calls function2() 50,000 times. Each of the functions just performs simple loops,
although function2() also calls function1() every time it is called.
The next step is to compile the program using the -pg parameter for gprof. After that the program can
be run:
$ gcc -o demo demo.c -pg
$ ./demo
$
When the program finishes, the gmon.out call graph profile file is created in the same directory. You can
then run the gprof program against the demo program, and save the output to a file:
$ ls -al gmon.out
-rw-r--r-- 1 rich rich 426 Jul 7 12:39 gmon.out
$ gprof demo > gprof.txt
$
Notice that the gmon.out file was not referenced in the command line, just the name of the executable
program. gprof automatically uses the gmon.out file located in the same directory. This example redirected
the gprof output to a file named gprof.txt. The resulting file contains the complete gprof
report for the program. Here’s what the flat profile section looked like on my system:
  %   cumulative    self               self     total
 time    seconds   seconds    calls  us/call  us/call  name
67.17     168.81    168.81    50000  3376.20  5023.11  function2
32.83     251.32     82.51    50100  1646.91  1646.91  function1
This report shows the total processor time and the number of calls for each function that was called
by main. As expected, function2 took the majority of the processing time.
The next report is the call graph, which shows the breakdown of time by individual functions, and how
the functions were called:
index  % time    self  children       called      name
[1]     100.0    0.00    251.32                   main [1]
               168.81     82.35   50000/50000         function2 [2]
                 0.16      0.00     100/50100         function1 [3]
-----------------------------------------------
               168.81     82.35   50000/50000         main [1]
[2]      99.9  168.81     82.35   50000           function2 [2]
                82.35      0.00   50000/50100         function1 [3]
-----------------------------------------------
                 0.16      0.00     100/50100         main [1]
                82.35      0.00   50000/50100         function2 [2]
[3]      32.8   82.51      0.00   50100           function1 [3]
-----------------------------------------------
Each section of the call graph shows the function analyzed (the one on the line with the index number),
the functions that called it, and its child functions. This output is used to track the flow of time throughout
the program.