Hey there, everyone! This is just a quick post to share a debugging technique I’ve found useful and that I recently added to mb-util. Today I’m talking about using SIGQUIT to trigger some kind of debugging action in an application, such as opening a REPL or logging a stack trace.
The reason SIGQUIT is useful is that most terminals will send that signal if you press Ctrl-\ (control and backslash), and many server environments also have a mechanism for sending UNIX signals.
I’m not the first person to do this of course. My inspiration comes from the JVM, where SIGQUIT will print stack traces and JVM statistics, but I believe the idea goes back further. Here I’ll show how to apply this concept to a Ruby application, but you can do the same thing in any language.
First example: printing stack traces in console and server-side apps
The MB::U.sigquit_backtrace function I recently added to mb-util iterates over all Ruby threads and prints a colorized stack trace for each thread.
Here’s the code; it’s really simple:
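As a rough illustration of the idea, a handler along these lines would do the job. This is a minimal sketch and not the mb-util source: the helper names dump_thread_backtraces and install_sigquit_backtrace are hypothetical, and the sketch prints plain text rather than colorized output.

```ruby
# Hypothetical helper names; a sketch, not the mb-util implementation.

# Writes one backtrace per live thread to +io+.
def dump_thread_backtraces(io = $stderr)
  Thread.list.each do |thread|
    label = thread == Thread.main ? 'main' : thread.object_id.to_s
    io.puts "Thread #{label} (status: #{thread.status})"
    # Thread#backtrace returns nil for a thread that has already finished.
    (thread.backtrace || []).each { |frame| io.puts "    #{frame}" }
    io.puts
  end
end

# Installs the SIGQUIT handler. Plain IO writes are allowed inside a trap
# handler, unlike code that needs to lock a Mutex.
def install_sigquit_backtrace(io = $stderr)
  Signal.trap('QUIT') { dump_thread_backtraces(io) }
end
```

Calling install_sigquit_backtrace once at startup is enough. After that, Ctrl-\ prints the traces to standard error while the program keeps running.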
Second example: opening a REPL in a console app
I’ve also used SIGQUIT for interactively debugging console apps with a REPL (Read Eval Print Loop, a fancy term for an interactive command line interface that uses the same programming language syntax).
Note that, at least in Ruby, this only works if your app has an interruptible loop in one of its threads or you create a new thread, as you cannot run Pry from within a signal handler. It’s also best to run binding.pry in a context that has access to variables you want to inspect.
Here’s some example code:
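The interruptible-loop approach can be sketched roughly like this. The class name DebuggableLoop, its repl: parameter, and the counter and step variables are all made up for illustration, and the default REPL assumes the pry gem is installed. The signal handler only sets a flag; the loop notices the flag and opens the REPL itself, where the loop's local variables are in scope.

```ruby
# Hypothetical example class; the default REPL assumes the pry gem is installed.
class DebuggableLoop
  # +repl+ is any callable that receives a Binding. It defaults to Pry.
  def initialize(repl: nil)
    @repl_requested = false
    @repl = repl || method(:open_pry)
  end

  # The handler only sets a flag, which is safe inside a trap handler.
  def install_handler
    Signal.trap('QUIT') { @repl_requested = true }
    self
  end

  # Runs +iterations+ loop passes and returns the final counter value.
  def run(iterations)
    counter = 0
    step = 1
    iterations.times do
      if @repl_requested
        @repl_requested = false
        # Called from the loop so that counter and step can be inspected.
        @repl.call(binding)
      end
      counter += step
      puts "counter=#{counter} step=#{step}"
      sleep 0.01
    end
    counter
  end

  private

  def open_pry(context)
    require 'pry' # loaded lazily, only when a REPL is first requested
    context.pry
  end
end
```

Something like DebuggableLoop.new.install_handler.run(10_000) would start the loop. Changing step inside the REPL and then exiting the REPL lets the loop continue with the new value.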
You can run that example script, then press Ctrl-\ to view and manipulate the program using the Pry REPL.
In this session I use Ctrl-\ to start Pry, change a local variable, then resume execution. We can see that the updated value of the variable is used in the next loop.
Note that if you try to run Pry from the trap signal handler, this happens instead:
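The underlying restriction is easy to reproduce without Pry. In MRI, locking a Mutex inside a trap handler raises ThreadError, and my assumption (not verified against Pry's internals) is that a REPL trips over the same rule. The sketch below shows the failure and the new-thread workaround mentioned earlier; the helper names and the LOCK constant are hypothetical.

```ruby
LOCK = Mutex.new

# Direct approach: locking a Mutex inside the handler raises ThreadError,
# which this handler records in +errors+ instead of letting it propagate.
def install_direct_handler(errors)
  Signal.trap('QUIT') do
    begin
      LOCK.synchronize { }
    rescue ThreadError => e
      errors << e
    end
  end
end

# Workaround: the handler only starts a thread. The new thread is an
# ordinary context, so it may lock the Mutex (or open a REPL).
def install_threaded_handler(threads, log)
  Signal.trap('QUIT') do
    threads << Thread.new { LOCK.synchronize { log << :debug_action_ran } }
  end
end
```

One trade-off of the threaded variant: the new thread does not see the local variables of whatever the main thread was doing, so the flag-and-loop approach is still the better fit when you want to inspect loop state.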
Why? Some real-world scenarios
Ok, now that you have this new tool in your toolbelt, when would you use it?
Scenario 1: stalling web app
Let’s say you’re running a web app and you have some requests that seem to take forever. But you don’t know where in the code those requests are taking their time, and for some reason your APM (Application Performance Monitor) isn’t catching the culprit. You are running an APM and keeping app logs, right?
In this case, if your code already has a SIGQUIT handler, you can use your server infrastructure to send SIGQUIT to the application process. In Kubernetes, for example, you might use kubectl exec to open a command line in the running pod or, if you’ve wisely removed all shells from your containers, use kubectl debug to attach a different container image to the running pod. Then you run kill -QUIT [app pid], e.g. kill -QUIT 1 if your app is running as PID 1 in the Kubernetes pod.
From here, look at your application logs and find the stack trace printed by your SIGQUIT handler. Most likely the stack trace will show you exactly where your code has stalled.
Scenario 2: debugging a console app
I use Ruby code to generate most of the animations for my YouTube channel, and sometimes the animation breaks, or I want to tweak the running animation just a little bit. I’ve regularly used a SIGQUIT handler to open a Pry REPL in these cases. With Pry open I can inspect the animation state, change variable values, etc.
Scenario 3: is RSpec stuck in an infinite loop or just a really slow test?
The most recent case, and the one that prompted this post, was some slow RSpec tests in mb-math, combined with code that I knew could potentially loop forever. You probably already know that RSpec just prints a . to the screen for each test it runs by default, and runs tests in random order, so it’s not obvious which test is taking a long time to run.
I added a SIGQUIT stack trace printer to my spec/spec_helper.rb using MB::U.sigquit_backtrace, then used Ctrl-\ to see what code RSpec was running. The stack traces allowed me to narrow down the root causes and correct them.
Alternatives
In some cases you won’t have the ability to send a UNIX signal, your app isn’t running in a console, or you are using a framework that already interprets SIGQUIT differently. No worries: the same concept can be applied in different ways. The root pattern or principle is allowing an authorized engineer to trigger a debugging action in the application. You could use a web endpoint that requires a secure password, a different UNIX signal, etc. Always keep security in mind, though, if you implement something that can be accessed remotely.
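For the "different UNIX signal" option, a small sketch might look like the following. The helper name install_debug_handler and the choice of USR1 and USR2 as candidates are my own assumptions for illustration; pick whichever signal your framework leaves alone.

```ruby
# Hypothetical helper: installs +action+ on the first signal in +candidates+
# that this platform supports and returns that signal's name.
def install_debug_handler(candidates = %w[USR1 USR2], &action)
  raise ArgumentError, 'a block is required' unless action

  # Signal.list maps the names this platform knows ("USR1", "QUIT", ...) to numbers.
  name = candidates.find { |sig| Signal.list.key?(sig) }
  raise ArgumentError, "none of #{candidates.inspect} is available" unless name

  Signal.trap(name, &action)
  name
end
```

With this in place, something like install_debug_handler { $stderr.puts caller } at startup followed by kill -USR1 [app pid] would trigger the action without touching SIGQUIT.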
Summary
I’ve shown two different things you can do with a SIGQUIT signal handler, namely printing a stack trace and opening a REPL, but undoubtedly there are more options. I believe this is a useful tool that every developer should have in their arsenal.
For more reading:
- The Linux signal(7) manual page (or man 7 signal from your terminal)
- My mb-util Rubygem