(gdb) break *0x972

Debugging, GNU± Linux and WebHosting and ... and ...

Bug with multiple threads running *inside* GDB

When extending GDB through its Python API (or directly in C), you may run into situations where you need threads, for instance with GUIs, or when using another library or module.

Unfortunately, GDB doesn't like that much. First of all, you cannot call GDB Python functions from another thread: GDB itself is not multithreaded, and hence not thread-safe. Python is, though, so you should be able to block the main thread in Python and call GDB functions from the other thread. Apart from that, GDB will simply crash, and as far as I know there is nothing you can do about it.

But worse (kind of): GDB doesn't cope well when your code spawns a thread.

TL;DR: solution

In C (GDB bug #17247, patch and discussion):

sigemptyset (&sigchld_mask);
sigaddset (&sigchld_mask, SIGCHLD);
sigprocmask (SIG_BLOCK, &sigchld_mask, &prev_mask);

scm_with_guile (call_initialize_gdb_module, NULL);
sigprocmask (SIG_SETMASK, &prev_mask, NULL);

In Python:

import pysigset, signal

with pysigset.suspended_signals(signal.SIGCHLD):
    # start threads, they will inherit the signal mask
    pass

Description of the bug

When you create a thread in Python and then run the application (in my case it happens mainly when the application itself spawns threads), GDB freezes with the following call stack:

(gdb) where
#0 sigsuspend () from /usr/lib/libc.so.6
#1 wait_lwp (lp=lp@entry=0x21f63b0) at ../../gdb/gdb/linux-na
#2 stop_wait_callback (lp=0x21f63b0, data=<optimized out>) at
#3 iterate_over_lwps (filter=..., callback=callback@entry=0x4
#4 linux_nat_wait_1 (ops=<optimized out>, target_options=1, o
#5 linux_nat_wait (ops=<optimized out>, ptid=..., ourstatus=0
#6 thread_db_wait (ops=<optimized out>, ptid=..., ourstatus=0
...

Explanation of the bug

The function in which GDB is blocked is sigsuspend:

NAME

sigsuspend, rt_sigsuspend - wait for a signal

SYNOPSIS

int sigsuspend(const sigset_t * mask);

DESCRIPTION

sigsuspend() temporarily replaces the signal mask of the calling process with the mask given by mask and then suspends the process until delivery of a signal whose action is to invoke a signal handler or to terminate a process.

GDB uses this function to wait for new events from the application's execution: when something occurs in the debuggee (see how debuggers work), the kernel informs GDB by sending it a SIGCHLD signal. When the signal is received, GDB wakes up and checks what happened.

However, the signal is delivered to the GDB process, but not necessarily to its main thread. In practice, it is often delivered to the second thread, which doesn't care about it (that's the default behavior) and carries on as if nothing had happened. The main thread never sees the signal, so it stays blocked in sigsuspend.
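The notification mechanism described above (the kernel sends SIGCHLD to the parent when a child changes state) can be observed from plain Python. This is only a sketch, not GDB's actual code: Python's standard library does not expose sigsuspend, so the example substitutes signal.sigtimedwait (Linux, Python 3.3+), and the forked child that exits with code 7 is a hypothetical stand-in for a debuggee event.

```python
import os
import signal

# Block SIGCHLD so it stays pending until we explicitly wait for it.
old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
try:
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately (stand-in for "something happened
        # in the debuggee").
        os._exit(7)

    # Parent: sleep until the kernel reports the child's state change.
    # sigtimedwait is used instead of sigsuspend; the 5 second timeout
    # is an arbitrary safety net and yields None if nothing arrives.
    info = signal.sigtimedwait({signal.SIGCHLD}, 5)

    # Collect the child's status, as a debugger would after waking up.
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status)
finally:
    # Restore the original signal mask.
    signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

Because the whole process is single-threaded here, the signal cannot end up in the wrong thread. The freeze described in this post appears once a second thread with SIGCHLD unblocked exists.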

Solution of the problem

We cannot change the behavior of the thread. However, we do have some control over its signal handling: a new thread inherits its signal mask from the thread that creates it! So, in Python, we can go this way:

import pysigset, signal

# with SIGCHLD blocked,
with pysigset.suspended_signals(signal.SIGCHLD):
    # start threads,
    # they will inherit the signal mask
    pass
# SIGCHLD is unblocked after the with statement,
# so that GDB can operate properly afterwards
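If pysigset is not available, the same idea can probably be expressed with the standard library alone, using signal.pthread_sigmask (Unix, Python 3.3+). The sketch below is an assumption about how one might write it, not the code used in the GDB patch; the helper names start_thread_with_sigchld_blocked and current_mask are made up for the illustration.

```python
import signal
import threading


def current_mask():
    # SIG_BLOCK with an empty set changes nothing and returns the
    # calling thread's current signal mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())


def start_thread_with_sigchld_blocked(target):
    """Start `target` in a thread whose inherited mask blocks SIGCHLD."""
    # Block SIGCHLD in the calling thread; the previous mask is returned.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        thread = threading.Thread(target=target, daemon=True)
        thread.start()  # the new thread copies the creator's current mask
    finally:
        # Restore the caller's mask so it can receive SIGCHLD again.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return thread


# Record the masks to check that only the worker keeps SIGCHLD blocked.
seen = {"before": current_mask()}
worker = start_thread_with_sigchld_blocked(
    lambda: seen.update(worker=current_mask())
)
worker.join()
seen["main"] = current_mask()
```

After this runs, seen["worker"] contains SIGCHLD while the main thread's mask is back to what it was before, which is the state GDB needs: SIGCHLD can only be delivered to the main thread.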