- Ruby's range syntax has lower precedence than method invocation, so 1..16.size returns the Range 1..8 (probably 1..4 on 32-bit systems), while (1..16).size returns the Integer 16. Because of this, we need to wrap a range in parentheses or store it in a variable when we want to call a method on the range.
- I missed the parentheses around a range during a late night coding session.
- Ruby parsed this as a flip-flop operator but my program appeared to work normally.
- My test suite uses SimpleCov to gather code coverage metrics.
- Ruby 3.4 switched to the PRISM parser.
- The PRISM compiler was generating a line number of zero for flip-flop operators.
- The Ruby code coverage system stores line counts in an Array.
- A line number of zero would reference memory before the start of the Array.
- Memory allocators use tags before and/or after allocated blocks to record information about the allocation.
- Writing the line count for line zero corrupted the heap by overwriting the allocator's tags.
- The program would crash some time later when the garbage collector deallocated memory and libc found the heap corruption.
- mb-sound issue 36
- Ruby issue 21220
- Ruby issue 21259
Back in March of 2025 I started what should have been a straightforward Ruby version upgrade. Little did I know I was about to fall down a multi-day rabbit hole that would ultimately reveal a memory corruption bug (now fixed) deep within core Ruby systems.
Often when you make a typo or miss an operator in code the program simply doesn’t work. But there are those rare cases where you get a subtle, insidious change that appears to work correctly. This is the latter case, with missing parentheses leading to a crash due to memory corruption.
Here’s a bit of foreshadowing (did you know that Ruby has an operator called the flip-flop operator?):
[Code listing omitted.]
How will these two lines behave?
Keep reading to see how code coverage, automated testing, uncommon operators, and out-of-bounds array writes all converge in a very unexpected way.
The setting
There’s this library I wrote called mb-sound. I use it to generate sound
and visualize audio processing systems for my videos on YouTube. One of the
tools in that library is called midi_roll.rb, and it generates a
visualization of the notes in a MIDI music file in the terminal.
MIDI streams are divided into 16 channels, each of which can play a different
instrument or a different “hand” on the same instrument. midi_roll.rb has a
--channel option to select a specific MIDI channel. I wrote a check to
ensure the channel number is within range:
[Code listing omitted: the --channel range check, written without parentheses around the range 1..16.]
But that line should have looked like this:
[Code listing omitted: the same check with the range parenthesized, as in !(1..16).cover?(channel).]
I did not write tests for what happens if you give an invalid value to
--channel, and as you’ll see, it’s pretty lucky that I didn’t.
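As a rough sketch of why the missing parentheses matter, the snippet below compares the two forms. The expression `!1..16.cover?(channel)` is reconstructed from the fix suggested later in the bug thread, and the method names and return values are made up for illustration; the real line in midi_roll.rb may differ.

```ruby
# Hypothetical helpers for illustration; not the actual midi_roll.rb code.

# Without parentheses, Ruby groups this as (!1)..(16.cover?(channel)).
# Inside a condition, that is a flip-flop rather than a Range. Its start
# condition, !1, is always false, so the branch never runs and
# 16.cover?(channel) is never even called.
def buggy_channel_check(channel)
  if !1..16.cover?(channel)
    :rejected
  else
    :accepted
  end
end

# With parentheses, the Range is built first and cover? is called on it.
def fixed_channel_check(channel)
  if !(1..16).cover?(channel)
    :rejected
  else
    :accepted
  end
end

puts buggy_channel_check(99)   # accepted (the check silently does nothing)
puts fixed_channel_check(99)   # rejected
puts fixed_channel_check(5)    # accepted

# The same precedence rule outside of a condition: the method call binds
# tighter than the range operator, so this is 1..(16.size).
p 1..16.size
p (1..16).size                 # 16
```

Since the buggy form never rejects anything, every test that passes a valid channel still succeeds, which is why the typo could go unnoticed.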
Tests
Automated testing is a huge time saver. My approach to testing mb-sound is a pragmatic, gray-box style at a mix of abstraction layers. What matters to me is code coverage, functionality coverage, and application usability, so I’ll write tests from multiple angles that don’t always neatly fall into “unit” or “integration” categories.
Since many of the features of mb-sound are used extensively by the tools in
bin/, and these scripts are an important part of interacting with mb-sound, I
usually write tests for these scripts as well.
Here’s a simple example of a bin/ test from mb-sound:
[Code listing omitted: the test for bin/midi_info.rb.]
A simple test for a standalone executable script.
This test makes three assertions:
- The bin/midi_info.rb script exits with a successful status code.
- The script output includes the word “Events” (part of the table header).
- The script output includes the word “Unnamed” (part of the table contents).
This looks basic but it actually verifies several aspects of the script and of mb-sound:
- In order for that script to exit successfully, the entire library has to parse (no syntax errors), the script itself must be correct Ruby code, and there must be no errors during the entire process.
- To be able to print the word Events, the data for the table must be passed correctly into the function from mb-util that draws tables. So we’re also testing the script correctness here.
- And to show the word Unnamed, the MIDI file must be loaded correctly because that’s the name of a track in the MIDI file. Thus we’re testing the MIDI subsystem of mb-sound.
So it’s worth having these tests to cover large areas of code without writing tons of test cases.
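A minimal sketch of this style of test, assuming plain Ruby with Open3 rather than the test helpers mb-sound really uses. To stay self-contained, it runs a stand-in one-line script via `ruby -e` instead of bin/midi_info.rb; the `run_tool` method and the stand-in script are hypothetical.

```ruby
require 'open3'
require 'rbconfig'

# Runs a command as a separate process and returns its combined
# stdout/stderr text along with whether it exited successfully.
def run_tool(*command)
  output, status = Open3.capture2e(*command)
  { output: output, success: status.success? }
end

# Stand-in for a bin/ script (hypothetical); prints a tiny two-row "table".
stand_in_script = 'puts "Track    Events"; puts "Unnamed  42"'

result = run_tool(RbConfig.ruby, '-e', stand_in_script)

# The same three checks described above.
raise 'script failed' unless result[:success]
raise 'missing table header' unless result[:output].include?('Events')
raise 'missing table contents' unless result[:output].include?('Unnamed')

puts 'all three checks passed'
```

Because the script runs in its own process, a syntax error or uncaught exception anywhere in the code it loads shows up as a failed exit status.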
Code coverage
I use SimpleCov code coverage metrics when developing a large new feature to decide what tests to write. I usually don’t target 100% coverage of everything, especially for a hobby project. Instead I review the line-by-line coverage of the files I’m working on just to make sure I’m hitting all of the happy paths (a “happy path” is the normal flow of code when no special options are passed and no errors occur), and any of the other paths that seem really likely or important.
As I was developing more and more scripts that use mb-sound, I really wanted to
be able to include those scripts in my code coverage metrics. Why? Look at
the midi_info.rb example test above – it touches several different parts of
the MIDI subsystem, so we should count that as test coverage for the MIDI
subsystem.
Unfortunately the bin/ tests run as standalone processes, while SimpleCov is
only loaded in the main RSpec process. So I needed a way to load SimpleCov in
the script processes as well. What I landed on was using the RUBYOPT
environment variable to inject SimpleCov into the scripts I test. I have this
line in my spec_helper.rb file:
[Code listing omitted: the line in spec_helper.rb that sets RUBYOPT.]
And in simplecov_helper.rb, which that RUBYOPT line injects into the
subprocess, we start SimpleCov and then load the script file:
[Code listing omitted: simplecov_helper.rb.]
This setup lets me include all of my standalone tools in the code coverage metrics for mb-sound.
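The RUBYOPT mechanism can be sketched as follows. This is not the real simplecov_helper.rb: to avoid depending on the SimpleCov gem, the hypothetical helper below only requires Ruby's built-in coverage library and starts it, and the file names are invented for the demo.

```ruby
require 'open3'
require 'rbconfig'
require 'tmpdir'

# Runs a child Ruby script with a helper file injected through RUBYOPT and
# returns the child's output. File names here are made up for the demo.
def run_with_injected_helper
  Dir.mktmpdir do |dir|
    helper = File.join(dir, 'coverage_helper.rb')
    script = File.join(dir, 'tool.rb')

    # Stand-in for simplecov_helper.rb: start coverage tracking early.
    File.write(helper, "require 'coverage'\nCoverage.start\n")

    # Stand-in for a bin/ script that knows nothing about coverage.
    File.write(script, "puts defined?(Coverage) ? 'coverage loaded' : 'no coverage'\n")

    # -r in RUBYOPT makes the child Ruby process require the helper before
    # it runs its own script. (Assumes the temp path contains no spaces.)
    env = { 'RUBYOPT' => "-r#{helper}" }
    output, _status = Open3.capture2e(env, RbConfig.ruby, script)
    output
  end
end

puts run_with_injected_helper
```

The script under test needs no changes at all, which is the appeal of this approach: the environment variable does the injection.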
The livestream
Okay, now you have the key background info for understanding this bug. You
know what mb-sound is and how I test it, and that there was a script in
mb-sound called midi_roll.rb with a --channel option that wasn’t covered by
automated tests. Let’s get into the actual discovery of the bug.
For a long time I was using Ruby 2.7 by default and testing versions through 3.3 in CI. I planned a livestream in March of 2025 where I would go through all of the changes necessary to add Ruby 3.4 to mb-sound and its dependencies (mb-util, mb-math, etc.). My process was basically to switch Ruby versions, run the test suite, and see what broke.
When I got to the tests for mb-sound under Ruby 3.4, I kept getting test
failures for midi_roll.rb, with Aborted (core dumped) in the process
output:
[Output omitted: the failing midi_roll.rb test, including Aborted (core dumped).]
Since the tool appeared to work just fine when I ran it standalone, and it passed all tests in other Ruby versions, I disabled SimpleCov in subprocesses for Ruby 3.4 so I could finish the livestream. But I really want coverage metrics from subprocesses and I don’t like letting a root cause go undiscovered, so I logged the crash in mb-sound to revisit later.
The investigation
It’s generally best to assume that bugs are your own fault, and not the fault of the programming language or compiler. This is only the second time in my 20+ year career that I’ve found an actual bug in the language. I approached my investigation into this issue with the assumption that I had done something wrong in my own Ruby or C code.
I’ve pieced these events together from my Git history and my comments on mb-sound issue 36, so if you want you can follow the timeline there while you read here.
Apr. 4, 2025
Initially I had no idea what was causing Ruby 3.4 to crash when testing
midi_roll.rb, so I started by making a copy of mb-sound and deleting as much
code as I could until the crash stopped happening. This is when I found that
having a large coverage stats directory makes the crash more likely.
After trimming down a copy of mb-sound to the point where the munmap_chunk crash stopped happening, I found that having a large coverage/ directory made the bug extremely likely to happen, but removing coverage/ made the bug extremely unlikely to happen.
Apr. 5, 2025
Now that I had a more minimal test case, I next looked for resources to expand my debugging capabilities. I’m familiar enough with tools like pry-byebug, gdb, and Valgrind, but wanted to both refresh my memory and get a broader list of ideas and tools for digging into the issue. Here are the resources I found (these are also listed on the mb-sound issue):
Resources that might be useful for debugging:
- https://www.aha.io/engineering/articles/debugging-ruby-the-hard-way
- https://github.com/ruby/tracer
- https://www.gnu.org/software/libc/manual/html_node/Heap-Consistency-Checking.html
- https://www.man7.org/linux/man-pages/man3/mallopt.3.html
- https://sourceware.org/gdb/current/onlinedocs/gdb.html/Output.html
libc_malloc_debug
Glibc provides an alternative malloc implementation that does more checks to help find memory allocation and deallocation mistakes, so that was my next test:
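[Listing omitted: the commands used to run the test with glibc's debugging malloc, followed by their output, which included “memory corruption”.]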
This confirmed there was memory corruption going on (“memory corruption” was
printed), rather than just an incorrect call to free() or something (“invalid
pointer” from the earlier error). Unfortunately it didn’t move me much closer
to finding the source of the corruption. One of the tricky things about memory
corruption is that the crash often happens much later than the actual trigger,
making it particularly difficult to find the root cause.
gdb
Ruby is written in C, so next I loaded Ruby into gdb, the standard C debugger
on Linux, to start looking at the C layer. Here’s when I learned that the
crash was happening inside Ruby’s garbage collector. I found that I couldn’t
generate Ruby backtraces by setting a breakpoint, because generating a Ruby
backtrace allocates memory, and you can’t allocate memory while garbage
collecting memory.
I wasn’t able to generate Ruby backtraces for each Ruby thread using the example linked above, because the bug occurs within Ruby’s GC.
Instead, I wrote a GDB script to run the program and print the Ruby stack
traces after the SIGABRT signal is raised. I included some embedded
Python code generated by Google Search’s AI summary to count the number of
running threads because I couldn’t find the documentation for GDB’s Python API
or any other way to get a thread count in a GDB script. I really prefer
writing code myself based on reading docs, but again, I couldn’t find the docs
in Search, yet somehow the AI had this info.
I added this script to the automated test suite in the minimized test case repo, and copied the script’s output to the mb-sound issue.
The C stack trace showed in more detail that the crash happened during GC (garbage collection):
[Output omitted: the C stack trace.]
The Ruby stack trace showed the crash happened within SimpleCov’s code that saves updated results, which shows why having a larger coverage directory made the crash more likely.
Valgrind
Like I mentioned above, memory corruption is tricky because the crash or bad
behavior usually happens long after the initial corruption. That’s where
Valgrind comes in. Valgrind has a bunch of tools for testing
memory allocation and multithreaded lock management. Its default tool, called
memcheck, can find double-free, use-after-free, out-of-bounds access, memory
leaks, etc. Often, it can even tell you which line of code allocated the block
of memory, and which line of code wrote outside that block.
Most of my C projects have an option to run the test suite under Valgrind because it’s so useful.
I didn’t realize it yet, but Valgrind pointed directly at the root cause:
[Output omitted: the Valgrind report, which mentions update_line_coverage().]
It was almost midnight so my brain was too fried, and I was too unfamiliar with the Ruby codebase, to understand what Valgrind was telling me.
Apr. 6, 2025
The next day I spun my wheels for a while longer looking at garbage collection,
but eventually had the good sense to switch to looking at what Valgrind was
telling me about update_line_coverage().
I’ve confirmed that Ruby 3.3.5 does not crash, so a good next step might be comparing the source code of the functions mentioned by Valgrind between 3.3.5 and 3.4.2.
…
The update_line_coverage function hasn’t changed between 3.3.5 and 3.4.2, but when I build with debug info and -O0, run with Valgrind+vgdb, and break thread.c:5673 if line < 0, I see line is -1 which could cause the “invalid read of size 8” Valgrind error:
[Output omitted: the debugger session at the breakpoint.]
I still didn’t make the final connection in my mind until later in the day:
…so maybe something has changed between 3.3.5 and 3.4.2 that causes rb_sourceline() to return 0 where it didn’t before?
But also this might be a red herring and not the cause of the memory corruption that happens later.
Finally I convinced myself that the memory corruption was caused by writing before the start of the line coverage array:
I believe I’ve confirmed that the attempt to access an array at index -1 is causing the later memory corruption. If I run set line = 0 at the above mentioned breakpoint on thread.c:5673 if line < 0, the program continues without aborting: …
However, if I just type continue without altering line, the program aborts with the munmap_chunk(): invalid pointer error:
…
I think this is probably enough to open a bug in upstream Ruby, but I might dig a little further into any changes to rb_sourceline() and its downstream methods.
…
For more confirmation, update_line_coverage() is writing before the start of the array. Definite memory corruption.
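To make the failure mode concrete, here is a toy model in Ruby, not Ruby's actual C code. It pretends the heap is one flat array in which an allocator tag sits directly in front of the line-count block; the tag value, the layout, and the method names are all invented for illustration.

```ruby
# Toy model only: real allocators and Ruby's coverage code are more complex.
ALLOCATOR_TAG = 0xA110C  # made-up marker an allocator might keep before a block
BLOCK_START = 1          # the line-count block begins right after the tag

# Builds a fake heap: [tag, count for line 1, count for line 2, ...].
def new_heap(line_count)
  [ALLOCATOR_TAG] + Array.new(line_count, 0)
end

# Mimics C-style pointer arithmetic: line numbers are 1-based, so the
# count for a line lives at offset (line - 1) from the start of the block.
def record_line_unchecked(heap, sourceline)
  offset = sourceline - 1           # sourceline 0 gives offset -1
  heap[BLOCK_START + offset] += 1   # offset -1 lands on the allocator tag
end

# Same write, but ignoring nonsensical line numbers.
def record_line_checked(heap, sourceline)
  return if sourceline <= 0

  record_line_unchecked(heap, sourceline)
end

# What a consistency check at free() time would notice.
def heap_intact?(heap)
  heap[0] == ALLOCATOR_TAG
end

heap = new_heap(3)
record_line_unchecked(heap, 2)
puts heap_intact?(heap)   # true: a normal line only touches its own counter

record_line_unchecked(heap, 0)
puts heap_intact?(heap)   # false: line 0 overwrote the tag in front of the block
```

In this model nothing fails at the moment of the bad write; only the later integrity check notices, which mirrors the delayed crash in the garbage collector.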
Apr. 7, 2025
This was the day I finally came to my senses and stopped trying to dig further on my own. I filed the bug with the Ruby team and worked on other things for the rest of the day.
As a wild guess I think it would take me a couple of weeks full time to track down the root cause of rb_sourceline() returning zero in Ruby 3.4 but not Ruby 3.3, so I have opened a bug upstream: https://bugs.ruby-lang.org/issues/21220
I got as far as looking at rb_vm_get_sourceline(), calc_lineno, and calc_pos but to go any further I’d have to get a deep understanding of the Ruby VM.
I included a proposed fix that stopped the memory corruption, but noted that I didn’t really think this was the root cause:
Something like this should prevent the memory corruption, but may be hiding a deeper issue:
[Code listing omitted: the proposed patch.]
Apr. 8, 2025
Amazingly, two legendary Ruby core team members, byroot (Jean Boussier) and mame (Yusuke Endoh), reproduced my issue, found the true root cause, and came up with a good fix, all within a single day. You really gotta read the bug comments to appreciate this.
Here are the issues they identified and the fixes:
- RUBY_EVENT_COVERAGE_LINE in compile.c and prism_compile.c was firing when line numbers were <= 0. Diff
[Diff listings omitted.]
- The PRISM compiler was generating a line number of zero for flip-flop operators, even though the PRISM parser had correct line numbers. This was logged separately as Ruby issue 21259 and fixed a few months later by one of the greatest Ruby legends, tenderlovemaking (Aaron Patterson). Diff
[Diff listing omitted.]
This is also when I learned about my typo – before this point I had no idea
that I had missed those parentheses in midi_roll.rb:
@mbcodeandsound (Mike Bourgeous) Just FYI, I bet you meant to write !(1..16).cover?(channel) in the following line.
Thank goodness for us, because it resulted in the discovery of a bug in Ruby :-)
If I had written tests for that --channel check then I would have added the
missing parentheses, and never found these bugs in Ruby’s flip-flop operator
and coverage tracking code!
Flip-flop and RuboCop
Okay, as I asked in the intro, did you know that Ruby has an operator called
the flip-flop operator? I didn’t before this. It uses the same
double-dot or triple-dot syntax as Range creation, but only works within the
context of a conditional.
I could see the flip-flop operator being really useful for parsing semi-structured text or digging a range of events out of a server log, and enough people like it that it’s still in Ruby. But to me it seems pretty dangerous since it has the same syntax as Range creation.
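For anyone else who hasn't met it, here is a small sketch of the flip-flop operator used on purpose. The log lines and the method name are invented for illustration.

```ruby
# Returns the lines from the first one starting with "BEGIN" through the
# next one starting with "END", inclusive.
def lines_between_markers(lines)
  selected = []
  lines.each do |line|
    # In a condition, a..b is a flip-flop: it turns on when the left side is
    # true and stays on until the right side is true.
    selected << line if line.start_with?('BEGIN')..line.start_with?('END')
  end
  selected
end

log = ['boot', 'BEGIN request', 'query db', 'END request', 'idle']
p lines_between_markers(log)
# => ["BEGIN request", "query db", "END request"]
```

The only thing separating this from a Range is that it appears in a condition, which is exactly what makes a missing pair of parentheses so easy to overlook.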
RuboCop, a code “linting” tool for Ruby, does have a check for the flip-flop operator. I always use RuboCop on professional Ruby projects. It probably should be on mb-sound too, but if I’d added RuboCop to mb-sound, then I never would have found this Ruby bug, and you wouldn’t be reading this!
Many of RuboCop’s default rules are… suboptimal in my opinion, but I will most likely get around to adding it to all of my open source projects eventually.
Apr. 9, 2025
They merged the fix the next day. This is an excellent turnaround time, and it really added to my love of the Ruby language.
I’m sure they’ll never read this, but I have to say I genuinely appreciate the dedication and skill that the Ruby team members bring to the Ruby project. My brief interaction with them on my bug report was very pleasant and productive.
And now I can say I have my name in the commit history!

July 2025
While the memory corruption bug warranted a fast response, there was no need to rush the fix for line numbers on flip-flop operators. In the worst case scenario I can imagine, code coverage metrics might be slightly lower, or an error message might list line zero for one of the backtrace entries.
The fix for the PRISM compiler was committed on July 17 and backported to Ruby 3.4 on July 21.
Conclusion
So now you’ve seen how one simple typo in my Ruby code led to memory corruption through a series of small errors. The Ruby team handled the bug report exceptionally and all issues have been fixed and backported.
Some key takeaways:
- These bugs were only found because several small issues aligned, like in the
Swiss cheese model.
- Lack of error-path tests → missing parentheses → obscure flip-flop operator → compiler line number bug → memory corruption bug
- The open memory model used by C continues to bite us, but I still enjoy writing C, and tools like Valgrind go a long way toward mitigating the risks of pointer management.
- It’s really valuable to fix both the proximate cause and the root cause of an issue. You fix the proximate cause first just to get your project moving again, but if you don’t take the time to find and fix the root cause, it’s likely to surface again at the worst possible time.
- I should probably use RuboCop and/or other code quality tools on my open source projects like I do on my commercial projects. Rulesets can be customized to disable checks that get in the way more than they help. I could rant about how unhelpful linting rules affect software teams but maybe I’ll leave that for another post.
- When I run into a memory-related crash, I’ll probably start with Valgrind next time instead of using it last.
- It’s never the compiler’s fault. But sometimes, it is. So it’s worthwhile to have experience with the tools to debug your own code as well as your dependencies.
I hope this was an entertaining read! I’ve put almost as much time into writing this as I did into finding the Ruby bug in the first place.
Have a good one everyone, and keep having fun with Ruby!

