GDB, please set a breakpoint on all my functions
Friday, February 10, 2017
To study a piece of code, a colleague of mine wanted to set a breakpoint on all the functions of their application. All but those in shared libraries, so not the libc, libpthread, etc. We agreed that this means all the functions defined in files compiled with -g.
TL;DR;
==> breakpoint_all.py <==
and from GDB:
(gdb) source breakpoint_all.py
I don't claim it's the best way to do it, nor the most efficient way to trace; I just want to highlight what gdb+python makes possible. Feel free to update the last step (or the others) to make it do whatever you need!
On the command line, GDB can list the functions with info functions:
(gdb) info functions
File ../../gdb/gdb/compile/compile.c:
void _initialize_compile(void);
int compile_register_name_demangle(struct gdbarch *, const char *);
char *compile_register_name_mangled(struct gdbarch *, int);
void eval_compile_command(struct command_line *, char *, enum compile_i_scope_types);
static int check_raw_argument(char **);

File ../../gdb/gdb/complaints.c:
void _initialize_complaints(void);
void clear_complaints(struct complaints **, int, int);
void complaint(struct complaints **, const char *, ...);
void internal_complaint(struct complaints **, const char *, int, const char *, ...);
But at the moment (gdb 7.12.1) there is no equivalent in Python. I didn't want to parse the output of this command and create breakpoints out of the function names; I wanted GDB Python to give me this list, so I had to find another way ...
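For comparison, the text-parsing route that this post deliberately avoids might look like the sketch below. It is plain Python working on a string; `group_functions_by_file` and `SAMPLE` are illustrative names and made-up data, not part of GDB's API. It assumes the output keeps the `File <path>:` headers shown above and that anything after a `Non-debugging symbols:` line should be ignored.

```python
def group_functions_by_file(text):
    """Group the declaration lines of `info functions` output by source file."""
    functions = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        # Symbols without debug info come last; they are not wanted here.
        if line.startswith("Non-debugging symbols:"):
            break
        if line.startswith("File ") and line.endswith(":"):
            current = line[len("File "):-1]
            functions[current] = []
        elif line and current is not None:
            functions[current].append(line)
    return functions


# Hypothetical sample, shaped like the output shown above.
SAMPLE = """File ../../gdb/gdb/complaints.c:
void _initialize_complaints(void);
void clear_complaints(struct complaints **, int, int);

Non-debugging symbols:
0x0000000000400000  _init
"""

grouped = group_functions_by_file(SAMPLE)
```

Each entry is still a full C declaration, so getting a usable function name out of it would need yet more parsing, which is one reason to prefer a symbol-based approach.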
List source files
Again there is a CLI command (info sources), again there is no Python wrapper:
(gdb) info sources
Source files for which symbols have been read in:
gdb/amd64-tdep.c, gdb/features/i386/amd64.c, gdb/features/i386/amd64-avx.c, gdb/gdbarch.h, gdb/target.h, ...

Source files for which symbols will be read in on demand:
gdb/gdb/features/i386/amd64.c, gdb/features/i386/x32.c, ...
I don't know why there are two sets, but these are the files we are interested in ... so let's parse this output:
def get_file_addresses():
    sources = gdb.execute("info sources", to_string=True).split("\n")
    assert "Source files for which symbols have been read in" in sources[0]
    for line in sources:
        # skip the two section headers and the blank lines
        if line.startswith("Source files ") or not line.strip():
            continue
        for source in line.split(", "):
            # do something with `source`
            ...
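The parsing step can be tried outside GDB by feeding it a captured string. The sketch below is a minimal stand-alone variant; `parse_info_sources` and `SAMPLE` are names and data made up for illustration, and it assumes the comma-separated layout shown above.

```python
def parse_info_sources(text):
    """Yield each source path listed in the text printed by `info sources`."""
    for line in text.split("\n"):
        line = line.strip()
        # Skip blank lines and the two "Source files for which ..." headers.
        if not line or line.startswith("Source files "):
            continue
        for source in line.split(","):
            source = source.strip()
            if source:
                yield source


# Hypothetical capture, shortened from the output shown above.
SAMPLE = """Source files for which symbols have been read in:

gdb/amd64-tdep.c, gdb/features/i386/amd64.c, gdb/gdbarch.h

Source files for which symbols will be read in on demand:

gdb/features/i386/x32.c
"""

sources = list(parse_info_sources(SAMPLE))
```

Inside GDB, the string would come from `gdb.execute("info sources", to_string=True)`, as in the function above.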
Find source file memory addresses
Now we need to list the symbols in these files. Sounds easy, but in practice there is no function to look up a symbol file from its name, and even with the symbol files (tables), there is no way to list the symbols they contain. So this is a dead end!
After a careful reading of the documentation, one finds out that only blocks give access to a list of symbols, by iterating over them. This is not written in bold, so it is easy to miss :-(. Anyway, we need to get the block corresponding to our source file scope. We could look it up from a PC with gdb.block_for_pc(pc), but at the moment we don't know a PC corresponding to this file.
One way to get this PC is by setting a breakpoint in this file: break <source file>:1 should work, and we'll be able to use the address of the breakpoint as a block lookup PC.
(gdb) b gdb/amd64-tdep.c:1
Breakpoint 1 at 0x461300: file ../../gdb/gdb/amd64-tdep.c, line 1.
But again, there is no way to do this in Python :-( So let's parse the CLI output again!
Pay attention to errors though: if GDB cannot convert the location into an address, it will throw a gdb.error exception, or, if set breakpoint pending is on (as on my system, for instance), it will not fail but will report that the breakpoint is pending. Setting multiple breakpoints at the same address may also change the output.
def get_file_addresses():
    ....
            # do something with `source`
            try:
                # fails if source is a header file
                bpt_msg = gdb.execute("break {}:1".format(source), to_string=True)
            except gdb.error as e:
                # if show breakpoint pending ==> off
                continue

            bp_id = bpt_msg.partition("Breakpoint ")[-1].partition(" ")[0]
            gdb.execute("delete {}".format(bp_id), to_string=True)

            if "pending" in bpt_msg or "No line" in bpt_msg:
                # if show breakpoint pending ==> on, the message looks like:
                #   No line 1 in file "/usr/include/bits/pthreadtypes.h".
                #   Breakpoint 8 (/usr/include/bits/pthreadtypes.h:1) pending.
                continue

            # otherwise the message looks like:
            #   Note: breakpoint 3 also set at pc 0x461300.
            #   Breakpoint 2 at 0x461300: file ../../gdb/gdb/amd64-tdep.c, line 1.
            bp_line = [a for a in bpt_msg.split("\n") if a.startswith("Breakpoint ")][0]
            file_1st_addr = int(bp_line.split(" ")[3][:-1], 16)  # change to long in Py2 IIRC
            yield source, file_1st_addr
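The message parsing is the fragile part, so it may be worth isolating it in a helper that can be checked against the sample messages quoted above. This is only a sketch: `parse_break_output` is a hypothetical name, and the regular expression assumes the `Breakpoint N at 0x...:` wording shown in this post.

```python
import re

# Matches e.g. "Breakpoint 2 at 0x461300: file ../../gdb/gdb/amd64-tdep.c, line 1."
_BP_RE = re.compile(r"^Breakpoint (\d+) at (0x[0-9a-fA-F]+)", re.MULTILINE)


def parse_break_output(msg):
    """Return (breakpoint number, address), or None if no address was resolved."""
    match = _BP_RE.search(msg)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2), 16)


resolved = (
    "Note: breakpoint 3 also set at pc 0x461300.\n"
    "Breakpoint 2 at 0x461300: file ../../gdb/gdb/amd64-tdep.c, line 1.\n"
)
pending = (
    'No line 1 in file "/usr/include/bits/pthreadtypes.h".\n'
    "Breakpoint 8 (/usr/include/bits/pthreadtypes.h:1) pending.\n"
)

resolved_result = parse_break_output(resolved)
pending_result = parse_break_output(pending)
```

Note that a pending breakpoint still exists and still has to be deleted, so the caller would keep the `delete` step from the function above.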
List all the symbols/functions from a file
Now that we have an address inside a file, we can get its global scope (Block.global_block) and list the symbols it contains. From this list, we just filter out what isn't a function:
def get_all_functions_from_pc(pc):
    block = gdb.block_for_pc(pc)
    for symb in block.global_block:
        if not symb.is_function:
            continue
        yield symb
And we're almost there, just need to assemble the different pieces:
for source, pc in get_file_addresses():
    print("{} ==> {}".format(source, hex(pc)))
    for fct_symb in get_all_functions_from_pc(pc):
        # do something with fct_symb
        ...
Set a (trace) breakpoint on every function
Last step: what do we want to do with this symbol? Here, just some simple tracing each time the function is called:
class TraceBreakpoint(gdb.Breakpoint):
    def __init__(self, symb):
        addr = int(symb.value().address)
        gdb.Breakpoint.__init__(self, "*{}".format(hex(addr)), internal=True)
        self.silent = True

    def stop(self):
        caller = gdb.newest_frame().older()
        caller_name = caller.name() if caller else 'none'
        print('{};{};{}'.format(gdb.selected_thread().num, caller_name, gdb.newest_frame().name()))
        return False  # we never want to stop
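Since `stop()` prints one `thread;caller;callee` line per call, the trace is easy to post-process. As a minimal sketch, assuming the lines were saved to a list or a log file, a hypothetical `count_calls` helper could tally how often each function was entered:

```python
from collections import Counter


def count_calls(trace_lines):
    """Count how many times each callee appears in 'thread;caller;callee' lines."""
    calls = Counter()
    for line in trace_lines:
        parts = line.strip().split(";")
        if len(parts) != 3:
            continue  # not a trace line (e.g. program output mixed in)
        _thread, _caller, callee = parts
        calls[callee] += 1
    return calls


# Made-up trace, in the format printed by TraceBreakpoint.stop().
trace = ["1;main;foo", "1;foo;bar", "2;none;foo", "hello world"]
calls = count_calls(trace)
```

The same loop could just as well build a caller/callee graph; the counter is only the simplest thing to show.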
which gives this last function:
def set_trace_bpt_on_all_symbols():
    for source, pc in get_file_addresses():
        print("{} ==> {}".format(source, hex(pc)))
        for fct_symb in get_all_functions_from_pc(pc):
            bpt = TraceBreakpoint(fct_symb)
            print("\t{} (Bpt #{})".format(fct_symb, bpt.number))
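One possible refinement, under the assumption that several listed files can resolve to the same global block (and therefore yield the same functions twice), is to remember the addresses already handled. The sketch below is plain Python; `unique_by_address` and the sample pairs are hypothetical.

```python
def unique_by_address(pairs):
    """Yield (name, address) pairs, skipping addresses that were already seen."""
    seen = set()
    for name, addr in pairs:
        if addr in seen:
            continue
        seen.add(addr)
        yield name, addr


# Made-up (function name, address) pairs with one duplicate.
pairs = [("foo", 0x461300), ("bar", 0x461340), ("foo", 0x461300)]
unique = list(unique_by_address(pairs))
```

In the function above, the pairs would be built from each symbol's name and `int(symb.value().address)` before creating the breakpoint.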
Load it
==> breakpoint_all.py <==
and from GDB:
(gdb) source breakpoint_all.py
Multithreaded warning
We noticed that in multithreaded environments, if one thread calls a breakpointed function very often, the others won't have time to progress. Consider running in non-stop mode, so that only the thread that hit the breakpoint is stopped:
(gdb) set non-stop on