Can Ruby do audio? Yes, yes it can. This video proves it.

Something you should know about me is that I like to learn about my tools by making them do things they aren’t supposed to do. In line with that, almost six years ago I asked a question (worded slightly differently): Can we use Ruby for sound? As far as I’m concerned, my most recent project continues to answer that question with a resounding, reverberating YES.

In December I live-coded a fully functional reverb effect using Ruby and the mb-sound library I developed for my YouTube video series, and since then I’ve been working nonstop on improving the reverb and making a video about it.



This latest video, inspired by a presentation at ADC21, uses data visualization and animation to explore the concept of reverberation, my implementation of the reverb design from that presentation, and the debugging process required to get the reverb working live. You can check out the video on YouTube, or continue reading here for some extra backstory and a complete video transcript.

Where we are

If you’ve been following my video series, you know my original goal was to recreate and then improve the stereo-to-surround-sound decoding processes developed in the 1970s, but that I got sidetracked playing with and presenting about synthesizers, filters, Fourier transforms, etc.

The Trello board with my video ideas
Well, the side quests are kind of turning into the main quest. While I still have the surround sound decoder code and a Trello board full of video ideas, I’ve really enjoyed using as much of my own hand-rolled code as possible to create music.

Also, beyond the surface layer of audio projects, I continue to develop and reinforce one of my career-defining principles – shape and structure are common across domains. The branching flows of an audio pipeline are similar in structure to a telemetry ingestion pipeline, a machine learning pipeline, etc. Once one is familiar with a few different domains, it becomes clear that there are underlying concepts and patterns (and I do not mean the classical “design patterns” here like factory, singleton, etc.) that span disciplines, frameworks, and languages. I hope to write more about this in the future.

So, this video about reverb and the reverb code itself might as well be a lesson in engineering and problem solving in general. Exercising a muscle makes that muscle stronger regardless of the task. Pay attention to the parallels in your own domain as you read this transcript or watch the video.

One final note: by no means am I trying to say that DSP developers should drop C or C++ and switch to Ruby. What I ultimately want to show is that we can do amazing things when we carefully reconsider our preconceptions, when we impose new constraints, and when we approach a problem from a different direction. And that Ruby is a beautiful language.

Video transcript

Introduction

Hey everybody, welcome back! My name’s Mike. Today we’re looking at a tool for processing sound that is used in music, film, video games, and even consumer electronics. Today we’re looking at reverb.

We’ll look at the history of artificial reverb, a design based on a presentation at ADC21 that I recreated using Ruby, the code that I wrote for this reverb, and some of the issues that I ran into along the way. Finally we’ll listen to some demos of that reverb.

All right, let’s have some fun with reverb!

What is reverb?

Reverb is short for reverberation. In real-world environments, sound bounces back and forth off all the surrounding surfaces, giving each environment a unique acoustic signature. Think of a parking garage versus a dense forest. If you clap your hands in both places, both will have some reverberation, but they will sound very different. Similar contrasting reverberation happens in concert halls, practice rooms, stadiums, and so on.

Artificial reverberation

As people began recording and producing sound, they wanted a way to recreate the reverberation of different spaces without having to record on location. Thus was born artificial reverberation. Artificial reverb started with mimicking the reverberation of real spaces like recording rooms and concert halls, but as the technology evolved, people also wanted to create otherworldly, physically impossible forms of reverberation, even up to infinite reverb that never decays and never overloads.

So, when we talk about sound production, a “reverb” is anything we use to make sound reverberate artificially. This can be a physical object like a spring, a piece of hardware running a digital algorithm, a measurement of the reverberation of a real place (called an impulse response), or like the subject of this video, a program running on a computer.

Applications

A musician might use reverb to enhance a dull-sounding instrument or to accent a vocal. Movies use different reverb styles from scene to scene to create the sense of being in different environments. Video games also apply reverb in different amounts as you walk around different parts of the game world. Some A/V receivers even let you use reverb to recreate the concert hall experience at home.

Examples

Let’s listen to a few examples of reverb to get familiar with what it can do. First, some drums without reverb.

[Sound: dry drum loop]

Here are the same drums played through the impulse response of a mini plate reverb.

[Sound: drum loop convolved with plate impulse]

Here’s a realistic concert hall reverb made by a hardware reverb box.

[Sound: hardware hall reverb]

And here’s an over-the-top spacey reverb from the same reverb hardware.

[Sound: hardware space reverb]

Finally, here’s a sneak peek of the software reverb I made, also producing a concert hall sound.

[Sound: software drum hall reverb]

Design

Now let’s look at the design of our artificial reverb. We’ll also talk about tuning reverb to give different effects. I’ve linked to some great resources on different reverb algorithms in the description. We’ll be following the design given by Geraint Luff in a presentation at ADC21. I highly recommend watching that presentation for his explanation of his design.

There are two main stages of this design: the diffusion stage and the feedback stage. The diffusion stage blurs the sound over a short time interval and simulates the first reflection off the walls of a room.

[Sound: diffusion on a kick drum]

The feedback stage gives us the later reflections of reflections of reflections, and so on. It repeats and mixes the sound back into itself to create a decaying echo.

[Sound: feedback on a kick drum]

Both of these stages are made out of two very simple building blocks: delay and mixing. Let’s hear what a single delay sounds like. It just makes the sound arrive later, after a delay.

[Sound: voiceover delayed and panned]

Honestly it blew my mind when I saw how such simple building blocks could make such complex sounds.
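To make the delay building block concrete, here is a minimal delay line in plain Ruby. This is an illustrative sketch only, not the mb-sound API – a circular buffer where every sample written now comes back out a fixed number of samples later.

```ruby
# Minimal fixed delay line sketch (illustrative, not the mb-sound API):
# a circular buffer where each sample written now is read back
# delay_samples later.
class SimpleDelay
  def initialize(delay_samples)
    @buf = Array.new(delay_samples, 0.0)
    @pos = 0
  end

  # Write one input sample, return the sample from delay_samples ago.
  def process(sample)
    out = @buf[@pos]
    @buf[@pos] = sample
    @pos = (@pos + 1) % @buf.length
    out
  end
end

d = SimpleDelay.new(3)
p [1.0, 0.0, 0.0, 0.0, 0.0].map { |s| d.process(s) }
# An impulse comes out three samples late: [0.0, 0.0, 0.0, 1.0, 0.0]
```

Everything in the reverb – diffusion and feedback alike – is built on top of this one idea plus mixing.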

Room reflections

The diffusion versus feedback breakdown is based on how reverberation actually works inside a room. Every surface bounces the sound back at us. Each bit of every surface is a little further away, so the sound takes longer to get there and back. So we can use the diffusion step to simulate the first reflections off the walls, furniture, ceiling, and floor.

After those first reflections, the sound keeps bouncing back and forth – reflections of reflections of reflections. The feedback step simulates these secondary bounces.

Diffusion

The diffusion stage uses a series of identical diffusion steps. I typically use four. Each diffusion step is just several delays in parallel followed by mixing.

The input sound is split into multiple copies (usually 4 or 8) and fed into the first diffuser. Each separate path for the sound is called a channel. Each channel is delayed by a random amount, and then the delayed channels are recombined. Every channel gets added to every other channel at equal volume, some with a polarity inversion, in a way that preserves all of the original sound.

Mathematically this mixer is called a Hadamard matrix, and the property of preserving all of the sound is called “all pass,” because all sound passes through without removing or adding frequencies. Uh, technically, without adding or removing volume at any particular frequency, but whatever.
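Here is a sketch of that Hadamard mix in plain Ruby (illustrative, not the mb-sound implementation). Hadamard matrices are built recursively, and scaling by 1/√N makes the mix energy-preserving – every output is a sum of every input, with some signs flipped:

```ruby
# Build an NxN Hadamard matrix recursively (N must be a power of two).
def hadamard(n)
  return [[1.0]] if n == 1
  half = hadamard(n / 2)
  half.map { |row| row + row } +
    half.map { |row| row + row.map { |v| -v } }
end

# Mix channels through the matrix, scaled by 1/sqrt(N) to preserve energy.
def mix(matrix, channels)
  scale = 1.0 / Math.sqrt(matrix.length)
  matrix.map { |row| row.zip(channels).sum { |m, c| m * c } * scale }
end

p mix(hadamard(4), [1.0, 0.0, 0.0, 0.0])
# A single input is spread equally across all four outputs:
# [0.5, 0.5, 0.5, 0.5]
```

Notice that the total energy is unchanged: four outputs at 0.5 carry the same power as one input at 1.0, which is exactly the “all pass” property described above.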

Here’s what the diffusion stage sounds like by itself.

[Sound: diffusion on a drum sequence]

We have to pick some numbers to set up our diffusion. There’s the number of channels, the number of stages, and the range of delay times. How do we choose these parameters? More channels and more stages mean more dense blurring, but more CPU use. I adjusted channel and stage counts by ear for each different reverb preset I created.

To choose the delay time, we think again about how the sound bounces in a room. We’re using diffusion to simulate first reflections, so we pick the diffusion delay time based on the average distances and dimensions of our reflective surfaces. I use a rule of thumb of one millisecond per foot (or three milliseconds per meter).
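That rule of thumb is just the speed of sound in disguise – sound travels at roughly 343 meters per second, so each meter of travel adds about 3 ms. A tiny helper (names are mine, not mb-sound's) makes the arithmetic explicit:

```ruby
# Rule-of-thumb diffusion delay: sound travels ~343 m/s, so one meter of
# surface distance adds roughly 3 ms of delay (one foot adds roughly 1 ms).
def diffusion_delay_ms(average_surface_distance_m)
  average_surface_distance_m * 3.0
end

puts diffusion_delay_ms(5.0) # surfaces ~5 m away -> ~15 ms of diffusion
```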

Feedback

The feedback delay network stage has a similar structure to the diffusion stage. But unlike the diffusion stage, the delayed sound is reduced in volume, mixed together using a different technique called a Householder matrix, then added back to the feedback network’s input.
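The Householder matrix deserves a quick sketch too, since it looks different from the Hadamard mix but shares the energy-preserving property. For N channels it is H = I − (2/N)·ones, a reflection matrix (this is illustrative Ruby using the standard library, not the mb-sound code):

```ruby
require 'matrix'

# N-channel Householder reflection mix: H = I - (2/N) * (matrix of ones).
# Like the Hadamard mix it preserves energy, but every channel feeds every
# other channel with equal weight, which works well inside a feedback loop.
def householder(n)
  Matrix.build(n, n) { |i, j| (i == j ? 1.0 : 0.0) - 2.0 / n }
end

mixed = householder(4) * Vector[1.0, 0.0, 0.0, 0.0]
p mixed.to_a
# One input spreads to all outputs: [0.5, -0.5, -0.5, -0.5]
# and the output vector has the same magnitude as the input.
```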

Let’s hear the feedback stage without diffusion.

[Sound: feedback on a drum sequence]

Feedback gets its name from its shape – we feed the output back into the input. The feedback produces a repeating echo effect. The mixing step blurs the echoes more and more as they repeat. The volume reduction controls how long the echoes last.

By the way, a change in volume is called gain. Gain is often measured in decibels. Positive gain is louder, negative gain is quieter.
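The decibel conversion is simple enough to show inline (plain Ruby, standard formulas):

```ruby
# Convert between decibels and linear amplitude gain.
# 0 dB is unity; every -6 dB roughly halves the amplitude.
def db_to_linear(db)
  10.0 ** (db / 20.0)
end

def linear_to_db(gain)
  20.0 * Math.log10(gain)
end

puts db_to_linear(0)             # 1.0 (no change)
puts db_to_linear(-6).round(3)   # ~0.501 (about half amplitude)
puts linear_to_db(2.0).round(2)  # ~6.02 dB (doubling the amplitude)
```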

Reverb feedback gain should always be negative. Here’s what happens if feedback gain goes even a little bit positive.

[Sound: overloading feedback]

Now we need to pick some important numbers again. The number of channels in the feedback stage will match the number of channels in our diffusion stage, but we still need to pick values for the delay time and feedback gain. The feedback gain controls how echoey the room is. Higher gain means harder surfaces and longer-lasting echoes, lower gain means softer surfaces and echoes that die down quickly.
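We can even estimate the decay time from the feedback gain. Each trip around the loop multiplies the echo by the gain, so the level after n trips is gainⁿ, and the time to fall 60 dB (the usual “RT60” measure of reverb decay) follows directly (a back-of-the-envelope sketch, not the tuning code I actually used):

```ruby
# Estimate RT60 (time for the echo to decay by 60 dB) from a linear
# feedback gain (< 1.0) and the loop delay time in seconds. Each trip
# around the loop loses 20*log10(gain) dB, so count the trips to -60 dB.
def rt60(feedback_gain, loop_delay_s)
  db_per_trip = 20.0 * Math.log10(feedback_gain) # negative for gain < 1
  trips = -60.0 / db_per_trip
  trips * loop_delay_s
end

puts rt60(0.85, 0.05).round(2) # ~2.13 s: a fairly live, echoey room
puts rt60(0.5, 0.05).round(2)  # ~0.5 s: soft surfaces, quick decay
```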

Most of the time our feedback delay times should be close enough together that the diffused echoes overlap, giving a continuous decaying reverberation. But sometimes we deliberately space them out, such as in this stadium simulation.

[Sound: voiceover through stadium reverb preset]

Recall that we used the diffusion step to mimic the first bounce of a sound off the walls and surfaces in a room, and that the feedback step simulates all the later reflections. It kind of makes sense, then, to use a feedback delay based on the dimensions of the room itself, again using the rule of thumb of one millisecond per foot.

Choosing parameters

To finish our reverb, we connect the diffusion and feedback steps together and mix the original signal with the result. The original signal is called “dry,” and the effected signal is called “wet.” Don’t ask me why, I don’t know. We can adjust the wet and dry levels based on how close we are to the source of the sound. We can also delay the reverb relative to the original sound. This is called predelay. In our physical analogy, predelay is like the difference between the direct and reflected distances to our drums.
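Per sample, the final mix is just a weighted sum of the dry signal and a predelayed copy of the wet signal. Here is a hypothetical sketch (the function and parameter names are mine, not mb-sound's):

```ruby
# Hypothetical sketch of the final wet/dry mix: the wet signal is shifted
# later by predelay_samples, then blended with the dry signal.
def mix_wet_dry(dry, wet, dry_level:, wet_level:, predelay_samples:)
  padded_wet = Array.new(predelay_samples, 0.0) + wet
  dry.each_index.map do |i|
    dry[i] * dry_level + (padded_wet[i] || 0.0) * wet_level
  end
end

dry = [1.0, 0.0, 0.0, 0.0]
wet = [0.5, 0.25, 0.1, 0.05]
p mix_wet_dry(dry, wet, dry_level: 0.8, wet_level: 0.5, predelay_samples: 2)
# The wet copy starts two samples late: [0.8, 0.0, 0.25, 0.125]
```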

Here’s an example. More dry, less wet, and more predelay make it sound like we’re close to the drums in the middle of a giant room.

[Sound: near drums]

More wet, less dry, and less predelay make it sound like the drums are far away from us.

[Sound: far drums]

So now, with the diffusion, feedback, and original signal all put together in the right amounts, we get a really amazing result.

[Sound: music with reverb applied to drums and electric bass]

Implementation

Let’s look at the code. You can follow along using the source code link in the video description. My implementation uses the Ruby programming language. I used my open-source sound library mb-sound, which provides a DSL (domain-specific language) for building audio processing pipelines. This is one of the things I love about Ruby by the way – Ruby syntax makes it super easy to create an expressive interface for any task. I can chain method calls together to build the reverb algorithm from simple building blocks in a very readable way.

Diffusion

First we’ll look at the diffusion step. We need a function to generate a single diffusion step, and then we’ll call that function in a loop to create our group of diffusers.

  • First we build a series of progressively increasing delays. One channel always gets a zero delay time to prevent unwanted predelay. The delay times are shuffled just for kicks.
  • Next we create the list of delay nodes reading from each channel. We randomly decide whether to invert the output of each delay. The mb-sound DSL allows us to create the delay using a single function call.

    This delay uses a circular buffer with a variable read offset and fractional indexing. That means we can read past the end of the buffer and loop back to the beginning, and we can read between individual samples. The circular buffer makes delay easy because we can just set a read offset relative to the write pointer in the buffer, and fractional indexing lets us modulate our delay times smoothly.

  • Here we feed the delays into our Hadamard matrix to mix them together.

    I wrote a new class that uses a matrix to mix channels together. Each entry in the matrix controls how much of each input is added to each output.

  • Finally we shuffle the outputs to prevent the same channel stacking up in polarity from step to step.
  • In our outer loop, we connect the output of one stage to the input of the next.

So that’s the code for the diffusion step.
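The fractional indexing mentioned above boils down to linear interpolation between the two nearest samples. Here is a simplified sketch of a fractional circular-buffer read (illustrative only; the real mb-sound delay internals differ):

```ruby
# Read from a circular buffer at a fractional delay behind the write
# position. Linear interpolation between the two nearest samples lets the
# delay time change smoothly instead of jumping sample by sample.
def read_fractional(buffer, write_pos, delay_samples)
  idx = (write_pos - delay_samples) % buffer.length # wraps negative reads
  i0 = idx.floor % buffer.length
  i1 = (i0 + 1) % buffer.length
  frac = idx - idx.floor
  buffer[i0] * (1.0 - frac) + buffer[i1] * frac
end

buf = [0.0, 1.0, 2.0, 3.0]
puts read_fractional(buf, 0, 1.5) # halfway between buf[2] and buf[3]: 2.5
```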

Feedback

Now let’s look at the feedback network:

  • In the constructor we set aside buffers to store our feedback signal.
  • We start the feedback step by mixing the input signal with the feedback signal, multiplied by the feedback gain.
  • Next we mix the channels together using our Householder reflection matrix.
  • Then we add the delays to the outputs of the mixing matrix.
  • And at last we describe the feedback path for the visualization (my code does not yet have a built-in way to create feedback, so this annotation lets us draw the arrows pointing the other way).

In the end we mix the dry and wet signals and assign our internal delay channels to output channels.

Extensibility

Since our reverb is broken down into stages, and each stage is made from simple building blocks, we can easily add more features over time. The reference presentation has several suggestions for improvements, such as delay time modulation, pitch shifting, filtering, and so on. But for now, this design is clean, simple, and effective. And that’s all it takes to produce a decent, basic, and easy to modify reverb!

Debugging and setbacks

Of course, programming is rarely as straightforward as just writing each line one at a time and then magically the program works.

Man, chasing ghosts!

Weird, weird, weird.

Here goes probably something that’s going to break. Haha, yep!

I don’t think that sounds right.

Oh, come on.

During the livestream I had to spend over an hour debugging audio glitches and other problems before diffusion was working. Some of these were mistakes in the reverb code, and some were actually bugs in unrelated code. Here are some of the issues I encountered, how they affected the sound, and how I found and fixed each one.

Normally when debugging something I walk away for a minute or two and come back with better ideas, but this was a live stream, so I had to press forward. Just remember that there are often better ways to solve these problems if you do have time to take a break.

Glitches

First, the sound was very choppy and running four times faster than it should.

[Sound: glitchy buzzing sound]

Haha, okay, that’s definitely wrong.

I tried different parameters, I added printouts and set breakpoints inside the code, enabled and disabled different parts of the code, and tried different audio sources.

It turned out that the glitching was caused by a quirk of my architecture, where there is no global clock. Here’s how it works, but I would probably design this differently if I started over from scratch.

Architectural overview

I process sound by combining simple building blocks that all have a function called sample. This function basically asks a block to “give us the next slice of audio.” Each building block knows which other building blocks are its inputs.

When we call sample on a filter, for example, the filter grabs a chunk of audio from its input, processes that chunk, and returns the result. Every block needs to be processed exactly once to produce the final chunk of audio we hear, and we need to do this 60 times per second for 60-frame-per-second video.

[Sound: short clicks of audio gradually getting faster until they coalesce into a filtered sawtooth wave]
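Stripped down to its essentials, the architecture looks something like this (an illustrative sketch; the real mb-sound nodes do much more):

```ruby
# Minimal sketch of the pull-based architecture: every block implements
# sample(count), pulling a chunk from its input, processing it, and
# returning the result.
class Constant
  def initialize(value)
    @value = value
  end

  # A source block generates its own samples.
  def sample(count)
    Array.new(count, @value)
  end
end

class Gain
  def initialize(source, gain)
    @source = source
    @gain = gain
  end

  # A processing block pulls from upstream, then transforms the chunk.
  def sample(count)
    @source.sample(count).map { |s| s * @gain }
  end
end

p Gain.new(Constant.new(1.0), 0.5).sample(4) # [0.5, 0.5, 0.5, 0.5]
```

Calling sample on the last block in the chain pulls audio all the way through the graph – there is no global clock, which is exactly where the trouble starts.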

Branches and buffers

With this architecture strange things happen if multiple blocks try to read from the same source. Everything starts normally. The first downstream block will call sample on the upstream block. The upstream block then grabs its next chunk of audio from its source and so on. So far so good.

But when the second downstream block calls sample on the shared block, the shared block doesn’t know that we’re still supposed to be working on the first chunk of audio. So, once again it retrieves the next chunk from its source, processes it, etc.

The upstream block runs faster than it should, and slices of audio from different points in time go down different paths that were all supposed to receive the same slice of audio. The result? Glitchy audio.

[Sound: glitchy buzzing transitioning to a smooth sine wave when the buffer is added]

So whenever an output is shared by multiple blocks, we have to call a function that creates a buffer in between that can keep track of where each downstream is supposed to be.
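The fix can be sketched as a branch buffer, or “tee”: it pulls each chunk from the shared source exactly once and tracks each reader's position separately (names are mine; this version keeps every chunk for simplicity, where a real one would discard chunks all readers have consumed):

```ruby
# A source that counts upward, so we can see which chunk each reader gets.
class Counter
  def initialize
    @n = 0
  end

  def sample(count)
    Array.new(count) { @n += 1 }
  end
end

# Branch buffer: pulls each chunk from the source once and replays it to
# every reader, tracking each reader's position independently.
class Tee
  def initialize(source, reader_count)
    @source = source
    @chunks = []
    @offsets = Array.new(reader_count, 0) # next chunk index per reader
  end

  def sample(reader, count)
    idx = @offsets[reader]
    @chunks << @source.sample(count) while @chunks.length <= idx
    @offsets[reader] += 1
    @chunks[idx]
  end
end

tee = Tee.new(Counter.new, 2)
p tee.sample(0, 2) # [1, 2]
p tee.sample(1, 2) # [1, 2] -- the same slice of time, not [3, 4]
```

Without the tee, the second reader would have received [3, 4] – a later slice of audio – which is precisely the glitch from the stream.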

State tracking

I also made a mistake in the matrix mixer class that caused the same kind of glitch.

The matrix mixer that I’m working on is causing the audio to glitch. It’s somehow handling the buffers incorrectly.

That’s not supposed to happen.

Again because of my architecture, any block with multiple outputs has to keep track of time by watching when those outputs are read by the downstream blocks. My code advances the clock to the next tick when an output gets read twice, and warns if other outputs weren’t read during that clock tick.

Ah, I think I know what’s going on here!

The mistake I made here was not recording when an output was read, so the fix was pretty simple.

Missing delay

After that I wasn’t getting any audible diffusion.

[Sound: drums without audible delay or echo]

Huh. Yeah, that’s just the dry sound – there’s no echo, reverb, whatever being applied.

This was another mistake I made: I was passing delay times in milliseconds to an API that wanted samples.

Ah, I know what I’m doing, haha!

I changed the delay time to seconds (just divide by 1000) and used an API that accepts seconds to fix it.

[Sound: drums with multiple delayed copies]

There we go!

Half-speed sound

Then there was an issue with sounds running at half speed, with half of the audio cut out so the overall tempo stayed the same.

[Sound: very low-pitched kick drum with a choppy rhythm]

Whoa, that’s interesting.

I still haven’t figured out the root cause of this one, but it only happens when I add a sample rate changing node directly after a file input.

[Sound: normal drums]

So when I don’t add the resampler it works.

When I add the resampler…

[Sound: glitchy drums]

it’s actually running at half speed. That’s so weird.

To fix the slow audio issue I had to remove the resampler and only try to change sample rates if they don’t match. At least I could save some of the glitchy drums as loops for future music.

Distorted audio

At long last I had normal audio running through the diffusion step. Here I ran into problems with everything coming out way too loud and distorting.

[Sound: distorted drums]

I think I need to be reducing the volume.

Since I was coding this live I just threw a few guesses at what the right gain correction should be…

Just arbitrarily add a four.

[Sound: normal drums]

That seems fine.

…but at some point I need to analyze each step of the chain and calculate the true gain for the number of diffusion steps and channels.

Mutable memory

Now each time my audio looped through the reverb it got quieter, meaning that the audio in memory was getting modified.

Ah, weird. Weird, weird, weird. Okay, so the volume is going down each time it plays through. It’s like modifying the input buffer.

So when the flow branches, we have to copy the buffer instead of sharing it.
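In Ruby terms, the bug and the fix look like this (a simplified illustration, not the actual reverb code):

```ruby
# The bug: both branches share one Array, so an in-place edit in the wet
# branch also changes what the dry branch (and the source) see.
buffer = [1.0, 1.0, 1.0]
wet = buffer            # shared reference
wet.map! { |s| s * 0.5 }
p buffer                # [0.5, 0.5, 0.5] -- the "input" got quieter!

# The fix: copy the buffer when the flow branches.
buffer = [1.0, 1.0, 1.0]
wet = buffer.dup        # independent copy
wet.map! { |s| s * 0.5 }
p buffer                # [1.0, 1.0, 1.0] -- original untouched
```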

Householder matrix

There was still one more demon lurking in the code after this. Even though the reverb sounded decent, the feedback mixing step (the Householder matrix) wasn’t working at all! I only found and fixed this after the stream.

Don’t give up

Luckily, after finally fixing all these issues with the diffuser, coding the feedback delay network went a lot faster, and I spent the rest of the time tuning parameters to get a good reverb sound. And only then, after hours of debugging and tweaking, did I manage to complete our reverb from scratch. As they say, after hours of trying, nailed it on the first try.

So remember, when doing something for the first time, it often takes multiple tries to get it right. So don’t give up!

Demos

Finally! We did it! It’s been a long journey to get this far. We learned what reverb is, faced some challenging bugs in audio code, and ultimately succeeded. Now we get to enjoy some sweet, juicy reverb. So just enjoy these demos of what our brand new reverb code can do and I’ll catch back up with you afterward.

  • Saw wave synth
    • Dry
    • Hall reverb
  • Drums
    • Dry
    • Near (hall)
    • Far (hall)
    • Dry
    • Near (room)
    • Far (room)
  • Bell
    • Dry
    • Hall reverb
  • Bell pad
    • Dry
    • Space reverb
  • Synth chords
    • Multichannel
  • Music
    • Dry
    • Mixed reverb

Conclusion

I hope you’ve had fun learning about reverb with me! How do you use reverb? Any tips or tricks you’d like to share? Or do you have any questions about the video or things you’d like me to cover in the future? Let us all know down in the comments.

Thank you for watching to the end, and have a good one!

Bloopers

You’ll have to watch the video for these!