Once again it’s time for another episode of my video series, Code, Sound & Surround! This series is all about using 2D and 3D visualizations, code, and sound to explore the relationships between math, sound, visuals, and problem solving from a unique angle. My goal is not necessarily to be a one-stop source for learning these concepts, but rather both to provide an overview of things you might not have heard of and to present familiar concepts in an unfamiliar way, to help unlock new insights and reveal the connections between ideas.
This time we’re reviewing how sound is digitized, quickly recapping filters, and looking at the Fast Fourier Transform (FFT) with some never-before-seen 3D visualizations. The video also briefly introduces analytic signals and sets the stage for future videos that will dive further into analytic signals, synthesizers, and surround sound. And don’t miss the synthesizer demo song at the end, showing just how much can be accomplished with a few stages of simple signal processing.
As part of working on this video, I’ve made hundreds of commits and added thousands of lines of code to mb-math, mb-util, and mb-sound, so check those out as well. Also, all music from all of my videos is original, either composed for this video series, or repurposed from my previous work.
Keep reading for the video’s backstory, a list of the concepts it covers, and a full transcript.
From the beginning of this video series, my end goal was to demonstrate a unique modernized reimplementation of 1970s-style surround sound decoders, in a way that gives viewers the knowledge and confidence to grow their own understanding and capabilities in solving their own classes of problems. Basically, I made something I think is cool and useful, and want to help others make cool and useful things too.
I originally expected the whole series to be done in a few months. But with every video, I find that either I want to explain something in a bit more detail, or I accidentally stumble on a new way to visualize something that I haven’t seen done before. Thus each video on its own has taken three to four months of research, writing, coding, filming, and editing.
This latest video was no exception. I started with a script about quadrophonic sound systems of the 1970s. Then I upgraded my synthesizer from episode 2 (about synthesis, filters, impulse response, and frequency response) to help demonstrate quadrophonic sound, and wanted to talk about that. So I wrote another script about the synthesizer. But in the synthesizer I show analytic signals and FFTs in a way I haven’t seen before either, so I felt like I had to recap digital sampling and FFTs first. And thus plans for one video became three.
Concepts for this video
The first few minutes of the video use a mix of clips from episode 1 (about the basics of sound) and lots of new animations to introduce digital sampling, sample rates, and the Nyquist frequency. I also revisit the ability to break a sound down into component sine waves and the circular nature of waves, combining those two concepts to motivate the idea of analytic signals.
Fast Fourier Transform
Next is a wonderfully animated tour of the FFT. The Fast Fourier Transform (FFT) is an algorithm for calculating the Discrete Fourier Transform (DFT) more efficiently. Computed naively, the DFT takes O(N²) operations, but an FFT computes it in O(N log N). I show the FFT as a sum of complex sinusoids broken apart in 3D, and explain how the FFT’s output can be used as coefficients in the equation for either real or complex sinusoids.
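To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not code from the video or from mb-sound; the function names `naive_dft` and `fft_radix2` are made up for this illustration, and the recursive radix-2 method shown is only one of several FFT algorithms. The direct version does N multiplications for each of N outputs, while the recursive version splits the input in half at every level.

```python
import cmath

def naive_dft(samples):
    """Direct O(N^2) DFT: every output bin sums over every input sample."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def fft_radix2(samples):
    """Recursive radix-2 FFT, O(N log N). The length must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft_radix2(samples[0::2])   # FFT of the even-numbered samples
    odd = fft_radix2(samples[1::2])    # FFT of the odd-numbered samples
    out = [0j] * n
    for k in range(n // 2):
        # Rotate the odd half's contribution before combining the two halves.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

# An arbitrary 8-sample test signal.
signal = [0.0, 1.0, 0.0, -1.0, 0.5, 0.25, -0.5, -0.25]
slow = naive_dft(signal)
fast = fft_radix2(signal)
max_difference = max(abs(a - b) for a, b in zip(slow, fast))
```

Both functions return the same list of complex numbers up to floating-point rounding, so `max_difference` is tiny. Only the amount of work differs.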
The FFT gives us a really straightforward way of turning any sound into a spiral form called an analytic signal, which is especially useful for calculating instantaneous amplitude and phase — something we’ll want to do in a surround sound decoder — and for rotating phase arbitrarily.
The last concept from this video is just a recap of filters, showing the effects of a low-pass and high-pass filter on the video’s audio. The music for this section (apart from the drums) was created using the new synthesizer code.
Although I originally planned to dive into the details of my new synthesizer work (mentioned above), I only had time to provide a very quick introduction and a demo song that shows some of the new synthesizer sounds and unique 3D visualizations.
Hey everybody, welcome back! Mike here. I’m really excited for today’s video, because we’re returning from our tangent into graphics, and getting back into sound.
We’re going to review digital sampling, take a look at the Fast Fourier Transform and analytic signals, and then briefly recap filters (which we covered in episode 2). Then we’ll take a sneak peek at the modifications I’ve made to the synthesizer from episode 2, which we’ll dive into in the next video.
So all these concepts will be used in the next video to explain how the synthesizer works, and all of this is in preparation for later videos. So enjoy this one! Let’s go!
So here’s what we’re going to cover in this video. Feel free to pause, rewind, and rewatch as needed.
The first thing to remember for this video is that sound travels in waves – that is, some quantity going up and down in a repeating pattern. Specifically, sound in air is made by air pressure going up and down over time.
We can measure the air pressure using a microphone, which converts changes in air pressure into changes in voltage carried by a wire.
[slow motion handclap]
The voltage change is analogous to the original pressure change, so we call this “analog” sound (from the same root as analogy).
We can measure the voltage tens of thousands of times per second, giving us a list of numbers representing the voltage at each measured point in time. This is called digitizing. Digitization is the process of turning something into a list of numbers representing separate measurements.
The number of measurements made per second is called the sample rate. We can record sounds with frequencies up to one half of our sample rate, a limit called the Nyquist frequency. So if we record 48,000 samples per second, our sound can contain frequencies up to 24,000 Hz. You may recall that the limit of human hearing is about 20 kHz, so 48 kHz sampling is plenty for most of our needs.
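One way to see why the limit sits at half the sample rate is to sample a tone above it. The following Python sketch is an illustration only, not code from the video; `sample_cosine` is a hypothetical helper. It shows that at 48,000 samples per second, a 30,000 Hz cosine produces the same list of numbers as an 18,000 Hz cosine, so the two cannot be told apart once digitized.

```python
import math

SAMPLE_RATE = 48_000          # samples per second, as in the example above
NYQUIST = SAMPLE_RATE / 2     # 24,000 Hz

def sample_cosine(frequency_hz, count, sample_rate=SAMPLE_RATE):
    """Measure a unit-amplitude cosine at regularly spaced points in time."""
    return [
        math.cos(2 * math.pi * frequency_hz * n / sample_rate)
        for n in range(count)
    ]

below = sample_cosine(18_000, 64)   # below Nyquist: recorded faithfully
above = sample_cosine(30_000, 64)   # above Nyquist: folds back to 48,000 - 30,000 = 18,000 Hz

# The two lists of samples match up to floating-point rounding.
largest_gap = max(abs(a - b) for a, b in zip(below, above))
```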
Digital sound can then be converted back to analog by generating the voltages described by the list of recorded numbers. Those voltages are then amplified and sent to speakers, which convert electricity into movement, which in turn creates changes in air pressure, and those changes in air pressure reproduce the original sound.
Digital sound is incredibly useful to us because we can use math to transform the sound virtually limitlessly. For a simple example, we can multiply every sample by a constant value to change the volume, making it either louder or quieter.
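As a small sketch of that volume change in Python (the `apply_gain` name and the sample values are invented for this example):

```python
def apply_gain(samples, gain):
    """Scale every sample by the same constant to change the volume."""
    return [s * gain for s in samples]

samples = [0.0, 0.5, -0.25, 1.0]      # a few made-up samples
quieter = apply_gain(samples, 0.5)    # half the amplitude
louder = apply_gain(samples, 2.0)     # twice the amplitude
```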
Most streaming services transmit two separate streams of numbers for sound, left and right. If a sound is played at the same volume in both left and right channels, it seems to come from the middle. Varying the relative volume of the two channels makes the sound seem to move back and forth between the left and right speakers.
Sending two separate channels to be played by two speakers is called stereo sound. Sending just one channel is called mono sound.
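Moving a mono sound between two speakers can be sketched as a pair of volume changes. The Python below is a hypothetical example, not the method used in the video: it assumes an equal-power pan law (cosine for the left gain, sine for the right), which is one common choice among several, and the `pan_mono_to_stereo` name is made up.

```python
import math

def pan_mono_to_stereo(samples, pan):
    """Return (left, right) channels. pan: -1 = left, 0 = middle, +1 = right.

    Assumes an equal-power pan law; other pan laws exist.
    """
    angle = (pan + 1) * math.pi / 4      # maps -1..+1 onto 0..pi/2
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)
    left = [s * left_gain for s in samples]
    right = [s * right_gain for s in samples]
    return left, right

mono = [1.0, -0.5, 0.25]
left, right = pan_mono_to_stereo(mono, 0.0)                # same volume in both channels
hard_left, silent_right = pan_mono_to_stereo(mono, -1.0)   # left speaker only
```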
The next thing to review is the circular nature of waves. We saw in episode one that sine waves are really just one dimension of circular motion in two dimensions.
A circular sine wave can be shifted earlier or later in time (or clockwise or counterclockwise in rotation). This is called phase, and is measured as an angle of rotation.
We saw in the second episode that more complicated waves can be created by adding a whole bunch of sine waves. Put these ideas together, and we can represent any complicated sound using combined circular motion.
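These two ideas can be combined in a few lines of Python. This is a sketch under assumptions, not the code behind the animations: each component is written as a point moving around a circle using a complex exponential, the function names and component values are invented, and the flat wave is taken to be the real (horizontal) part of the motion.

```python
import cmath
import math

def rotating_point(frequency_hz, phase_radians, t):
    """Point on the unit circle after t seconds, rotating counterclockwise."""
    return cmath.exp(1j * (2 * math.pi * frequency_hz * t + phase_radians))

def combined_motion(components, t):
    """Add several circular motions; components are (amplitude, frequency_hz, phase)."""
    return sum(a * rotating_point(f, p, t) for a, f, p in components)

# Two made-up components: 100 Hz, plus a quieter 200 Hz shifted by a quarter turn.
components = [(1.0, 100.0, 0.0), (0.5, 200.0, math.pi / 2)]
t = 0.00125

point = combined_motion(components, t)

# Looking at only one dimension of the motion gives the ordinary flat wave.
flat_wave = sum(a * math.cos(2 * math.pi * f * t + p) for a, f, p in components)
```

Here `point.real` matches `flat_wave`, while `point.imag` holds the second dimension of the circular motion.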
There are lots of great videos from others about digital sound, so I’ve created a playlist with some of those videos, a link to which is in the description. Feel free to mention more good videos and channels about sound in the comments.
Fast Fourier Transform
The mathematical tool we use for turning sound into an equivalent combination of simple sine waves is the Fast Fourier Transform, or FFT. The FFT takes a list of amplitudes recorded at regularly spaced points in time, and returns a list of amplitudes at regularly spaced frequencies.
We can split a sound into overlapping slices of, say, 800 samples, then calculate an FFT for each slice to see how the sound evolves over time. These slices are called windows.
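Slicing a sound this way might look like the following Python sketch. The 800-sample size comes from the example above; the 400-sample hop between slices (a 50% overlap) and the `overlapping_windows` name are assumptions made for illustration.

```python
def overlapping_windows(samples, size=800, hop=400):
    """Split samples into slices of `size`, each starting `hop` samples after the last.

    Leftover samples that cannot fill a whole slice are dropped.
    """
    return [
        samples[start:start + size]
        for start in range(0, len(samples) - size + 1, hop)
    ]

samples = list(range(2000))            # stand-in for 2000 recorded samples
windows = overlapping_windows(samples)
# Each window would then be passed to an FFT to see how the sound changes over time.
```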
So once again, for sound waves the FFT gives us a discrete, finite list of sine waves that make up the sound. There are positive frequencies on the right, and negative frequencies on the left. In the middle is the zero frequency, sometimes called the DC or direct current component.
Let’s slow this down and take a closer look.
[music slows down dramatically]
All frequencies in an FFT are integer multiples of some base frequency. Each frequency is represented by a 2D vector with an X and a Y, or equivalently a complex number, with a real and an imaginary part. The length of this vector gives the amplitude of the corresponding wave, and the rotation gives the phase. These numbers are coefficients in the equation for a sine wave.
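Here is a rough Python sketch of reading one of these vectors. It uses a slow, direct DFT in place of a real FFT to stay self-contained, and the window length, test tone, and division by N are assumptions (scaling conventions vary between libraries).

```python
import cmath
import math

SAMPLE_RATE = 48_000
N = 64                                  # samples in the window
base_frequency = SAMPLE_RATE / N        # 750 Hz; every bin is a multiple of this

def dft(samples):
    """Direct DFT, standing in for an FFT in this sketch."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

# A made-up test tone: amplitude 0.8, four cycles per window, rotated 60 degrees.
K = 4
samples = [0.8 * math.cos(2 * math.pi * K * t / N + math.pi / 3) for t in range(N)]

bin_frequency = K * base_frequency      # 3000 Hz
vector = dft(samples)[K] / N            # scale by N (a convention, not a rule)

length = abs(vector)                    # about 0.4: the other half of the 0.8
                                        # sits in the matching negative frequency
rotation = cmath.phase(vector)          # about pi/3 radians, i.e. 60 degrees
```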
Frequencies that don’t fall on one of these exact frequency samples get split between the closest neighboring samples, with changes in phase. These frequency samples are sometimes called “bins,” because all frequencies get sorted into these separate frequency bins as if they were containers for a range of frequencies.
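The splitting between bins can be seen numerically. In this Python sketch (again using a slow direct DFT as a stand-in for an FFT, with an arbitrary 32-sample window and invented helper names), a tone that lands exactly on bin 4 occupies a single positive-frequency bin, while a tone halfway between bins 4 and 5 spreads across every bin, with most of it in those two neighbors.

```python
import cmath
import math

N = 32  # window length in samples (an arbitrary choice for this sketch)

def dft_magnitudes(samples):
    """Length of each frequency vector from a direct (slow) DFT."""
    n = len(samples)
    return [
        abs(sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]

def cosine_at_bin(position):
    """A cosine that completes `position` cycles per window."""
    return [math.cos(2 * math.pi * position * t / N) for t in range(N)]

# Keep only bins 0 .. N/2 - 1 (the zero and positive frequencies).
on_bin = dft_magnitudes(cosine_at_bin(4))[: N // 2]
between_bins = dft_magnitudes(cosine_at_bin(4.5))[: N // 2]

# Bins holding anything beyond rounding error for the on-bin tone.
bins_used_on = [k for k, m in enumerate(on_bin) if m > 1e-6]

# The two strongest bins for the in-between tone.
strongest_between = sorted(range(N // 2), key=lambda k: between_bins[k])[-2:]
```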
For sound waves, the positive and negative frequencies are a mirror image of each other.
But wait a minute, how can frequency be zero or negative? How can you do something negative 60 times per second? Remember that each wave is actually not just a flat sine wave but motion around a circle. Positive frequencies represent counterclockwise rotation, and negative frequencies represent clockwise rotation. The zero frequency represents no rotation, and is just a constant shift.
The complex number form of each frequency serves as the coefficient in the circularized wave equation.
When all these circular waves are added back together, the clockwise and counterclockwise rotations cancel out to give us the original flat, one dimensional wave.
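The cancellation can be checked with a short Python sketch. The 60 Hz frequency echoes the question above; the coefficient value and the `circular_wave` name are made up. The key assumption is the mirror-image relationship: the negative frequency's coefficient is the complex conjugate of the positive one.

```python
import cmath
import math

def circular_wave(coefficient, frequency_hz, t):
    """One rotating term: a complex coefficient times motion around the unit circle."""
    return coefficient * cmath.exp(2j * math.pi * frequency_hz * t)

FREQUENCY_HZ = 60.0
coefficient = cmath.rect(0.5, math.pi / 4)   # length 0.5, rotated 45 degrees

times = [n / 1000 for n in range(50)]        # 50 points, one millisecond apart
flat_wave = []
for t in times:
    counterclockwise = circular_wave(coefficient, +FREQUENCY_HZ, t)
    clockwise = circular_wave(coefficient.conjugate(), -FREQUENCY_HZ, t)
    flat_wave.append(counterclockwise + clockwise)

# The second dimension cancels, leaving an ordinary one-dimensional cosine.
largest_imaginary_part = max(abs(z.imag) for z in flat_wave)
expected = [math.cos(2 * math.pi * FREQUENCY_HZ * t + math.pi / 4) for t in times]
```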
If we ignore the clockwise negative frequencies and add up the positive frequencies, we get our original sound wave’s equivalent circular-ish motion. Notice that the previously flat waveform turns into a spiral as we remove the negative frequencies. Every sound can be represented in this circular, spiral form by removing its negative frequencies.
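A minimal Python sketch of this step follows. It is not the video's implementation: it uses a slow direct DFT instead of an FFT, the function names are invented, and it assumes one common convention in which the positive frequencies are doubled so that the real part of the spiral still equals the original wave.

```python
import cmath
import math

def dft(samples):
    """Direct DFT, standing in for an FFT in this sketch."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inverse_dft(bins):
    """Add the circular waves back together to get samples in time."""
    n = len(bins)
    return [
        sum(bins[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def analytic_signal(samples):
    """Remove negative frequencies and double positive ones (assumed convention)."""
    n = len(samples)
    bins = dft(samples)
    for k in range(n):
        if 0 < k < n / 2:
            bins[k] *= 2        # positive frequencies
        elif k > n / 2:
            bins[k] = 0         # negative frequencies live in the upper half
        # k == 0 (zero frequency) and k == n / 2 (Nyquist) are left unchanged
    return inverse_dft(bins)

N = 64
wave = [math.cos(2 * math.pi * 3 * t / N) for t in range(N)]   # flat test wave
spiral = analytic_signal(wave)

instantaneous_amplitude = [abs(z) for z in spiral]      # radius of the spiral
instantaneous_phase = [cmath.phase(z) for z in spiral]  # angle around the spiral
```

For this pure test tone the radius stays at 1 for every sample, and `z.real` reproduces the original flat wave.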
[music speeds up to its original speed and style]
Any wave in this form, with no negative frequencies, is called an Analytic Signal. We’ll explore the analytic signal form of sound and why it’s useful in the next video.
So why have negative frequencies at all? Because the FFT is used for 2D motion as well, not just 1D sound.
You can find more videos about the Fourier transform in the playlist linked below, and feel free to ask questions and discuss in the comments.
Finally, in episode two we also looked at filters, which are functions that can change the amplitude and phase of the different component frequencies that make up a sound.
[high frequencies are removed from the sound leaving only the bass]
Removing the high frequencies makes a sound seem muffled…
[low frequencies are removed leaving only the treble]
…and removing the low frequencies makes a sound seem thin and metallic.
[audio is restored to its full frequency range]
The bass and treble knobs on an old stereo are examples of filters.
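For a feel of how simple a filter can be, here is a Python sketch of a one-pole low-pass filter and a high-pass built from it. These are assumptions chosen for brevity, not the filters used in the video or in mb-sound; the `smoothing` parameter and the test tones are made up.

```python
import math

def low_pass(samples, smoothing=0.1):
    """One-pole low-pass: each output moves a fraction of the way toward the input."""
    out = []
    state = 0.0
    for s in samples:
        state += smoothing * (s - state)
        out.append(state)
    return out

def high_pass(samples, smoothing=0.1):
    """Whatever the low-pass removed: the input minus its low-passed version."""
    return [s - low for s, low in zip(samples, low_pass(samples, smoothing))]

def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

SAMPLE_RATE = 48_000
COUNT = 4000
bass = [math.cos(2 * math.pi * 48 * n / SAMPLE_RATE) for n in range(COUNT)]        # 48 Hz
treble = [math.cos(2 * math.pi * 12_000 * n / SAMPLE_RATE) for n in range(COUNT)]  # 12 kHz

# Measure the last 1000 samples, after each filter has settled.
bass_after_low_pass = peak(low_pass(bass)[-1000:])        # nearly unchanged
treble_after_low_pass = peak(low_pass(treble)[-1000:])    # mostly removed
bass_after_high_pass = peak(high_pass(bass)[-1000:])      # mostly removed
treble_after_high_pass = peak(high_pass(treble)[-1000:])  # nearly unchanged
```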
We’ll make more use of filters in this and future videos.
So those are our basic concepts for this video.
All right, having reviewed those concepts, let’s take a look at the synthesizer that we’re going to dive into further in the next video.
[synthesizer parameters changing and bass brass notes played]
We’ll use the synthesizer to explore concepts like envelopes, waveshapers, and filters.
I’ve assigned camera changes to keyboard buttons, so I can take you on a tour through the synthesizer features in the next video.
For now, let’s dive into a demo song.
[synthesizer demo song with synthesized pads, bass, and saxophone]
So I hope you enjoyed that demo, and the review of digital sampling, the Discrete Fourier Transform, and filters. We’ll dive into that synthesizer in the next video.
So subscribe so you don’t miss future videos, and let me know in the comments if you have any suggestions or questions.
Thanks for watching, and have a good one!
[humorously] If it’s not good enough, too bad. It’s what you get. This is the video. It’s done.