Have you ever wondered how a stereo setup actually reproduces music?
"What exactly am I listening to?"
The answer might surprise you.
So, let's take the journey from live performance, to microphone, to mixing and mastering, to distribution, to your system and finally to your speakers, and find out what it takes to reproduce the realistic-sounding music we all love.
What is sound?
Sound is small fluctuations in air pressure. Since changes in pressure cause motion, the sound breaks away from its source and travels through the air. This is sometimes depicted as waves. But the complexity of it is more like waves on a beach than the simple waves you see when you toss a pebble into a puddle.
As the sound radiates out, it causes the air pressure at your ear to fluctuate, which moves your eardrums. They signal your brain, which in turn interprets the signal as sound.
At its most basic level, it's really simple:
The amount of pressure change translates to loudness: big changes are louder than small ones.
The rate of change translates to pitch: the more often the pressure changes, the higher the pitch (frequency) of the sound.
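To make those two ideas concrete, here is a minimal Python sketch. It assumes the simplest possible sound, a pure sine tone, and the names `pressure_wave`, `peak` and `zero_crossings` are made up for this illustration. Real sounds are far more complex, and perceived loudness and pitch are not as simple as these two measurements.

```python
import math

SAMPLE_RATE = 8000  # points per second; an assumed value for this sketch


def pressure_wave(frequency_hz, amplitude, duration_s=0.01):
    """Return the pressure deviation of a pure tone, one value per time step."""
    n = int(SAMPLE_RATE * duration_s)
    return [amplitude * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE)
            for i in range(n)]


def peak(samples):
    """Largest swing away from resting pressure: a rough stand-in for loudness."""
    return max(abs(s) for s in samples)


def zero_crossings(samples):
    """Count sign changes: more changes in the same time means a higher pitch."""
    return sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))


quiet_low = pressure_wave(frequency_hz=200, amplitude=0.1)
loud_high = pressure_wave(frequency_hz=800, amplitude=1.0)

print("peak swing:", peak(quiet_low), "vs", peak(loud_high))
print("sign changes:", zero_crossings(quiet_low), "vs", zero_crossings(loud_high))
```

The second tone has both the bigger swing (louder) and the more frequent changes (higher pitched).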
What is music?
When you are listening to a live performance of natural instruments, what you are hearing is the specific voice of each instrument, each creating its own waveform that eventually reaches your ears.
The pattern of these waves is complex enough that things sound different to us. A bell does not sound like a guitar string. We remember these patterns subconsciously so that by the time we reach maturity we will have a very high success rate at identifying various sounds, particularly voices and musical instruments.
In your ear, the resulting pressure changes are summed into one very complex motion of your eardrum that you will hopefully interpret as good music. Thus, music is the sum of complex sounds from individual instruments.
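The summing described here can be sketched in a few lines of Python. This is only an illustration: the "bell" and "guitar" below are stand-in pure tones with made-up frequencies and levels, not models of real instruments, and `tone` and `mix` are names invented for this sketch.

```python
import math

SAMPLE_RATE = 8000  # points per second; an assumed value for this sketch


def tone(frequency_hz, amplitude, n_samples):
    """A pure tone standing in for one instrument's pressure waveform."""
    return [amplitude * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


def mix(*sources):
    """Add the sources point by point, the way pressures add at the eardrum."""
    return [sum(values) for values in zip(*sources)]


n = 80
bell = tone(880, 0.3, n)     # hypothetical bell
guitar = tone(196, 0.5, n)   # hypothetical guitar string
at_the_ear = mix(bell, guitar)

# However many instruments play, there is one pressure value per instant.
print(len(bell), len(guitar), len(at_the_ear))
```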
Capturing sound
In order to capture a musical performance in some reproducible way we need to "hear" it and store it. This is where we use microphones.
A microphone is a type of transducer. Transducers change energy from one form to another, in this case from sound waves to electrical signals.
When a sound wave passes a microphone, the changing air pressure causes its diaphragm to vibrate in much the same way as your eardrum. The microphone's internal mechanism converts this vibration to a continuously varying electrical voltage that fluctuates in step with the air pressure. The output from the microphone is thus an electrical analog of the motion of your eardrums.
From this point forward, the performance is not sound. Now it is a group of rather complex electrical signals, one per microphone, that vary in voltage over time.
Tracking
Now that we have our signals, we need the means to store them. This is typically done by first passing them through a recording console that adjusts the levels. The adjusted signals are then sent to a multi-track device for storage.
In the early days of recording, these storage systems used magnetic tape. The signals were stored as tiny changes in magnetic flux on the moving tape, in a pattern that again mimics the changes in air pressure picked up by the microphones. Now that nearly everything is digital, the storage device is most often a computer. In this case the electrical signal from each microphone is converted to a series of digital samples, each representing the level of the signal at one instant, taken at a fixed rate (typically 96,000 times per second). These samples are then stored in data files much like any other computer data.
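As a rough picture of what that conversion does, the Python sketch below measures a signal at regular intervals and rounds each measurement to a whole number. The 16-bit depth and the 1 kHz test tone are assumptions made for the example, and `microphone_voltage` and `sample_and_quantize` are hypothetical names. A real converter also filters the signal first, which is left out here.

```python
import math

SAMPLE_RATE = 96_000  # samples per second, the rate mentioned above
BIT_DEPTH = 16        # bits per sample; an assumed value for this sketch


def microphone_voltage(t):
    """Hypothetical analog signal: a 1 kHz tone swinging between -1 and +1."""
    return math.sin(2 * math.pi * 1000 * t)


def sample_and_quantize(signal, duration_s):
    """Measure the signal at regular intervals and round to whole numbers."""
    full_scale = 2 ** (BIT_DEPTH - 1) - 1  # largest positive 16-bit value
    n = round(SAMPLE_RATE * duration_s)
    return [round(signal(i / SAMPLE_RATE) * full_scale) for i in range(n)]


samples = sample_and_quantize(microphone_voltage, duration_s=0.001)

# One millisecond of sound becomes a short list of integers.
print(len(samples), samples[:5])
```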
Mixing
The next step along the road is taking the stored track information and massaging it into something more compatible with home entertainment systems. This can include considerable modification of the recorded data itself.
The position, loudness and tonality of each track are set into the recording at this point. For stereo, this means the sound engineer will pan some tracks to the left, some to the right and keep some at center. Each track can be equalized, compressed, limited, and pitch and time corrected, and other tricks can be used, all in the hope of giving a better result on a home system.
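Panning is simpler than it sounds: the same track is sent to both channels at different levels. The Python sketch below uses a constant-power pan law, which is one common choice and an assumption here, since mixing consoles and software differ in the exact curve they use. The function names and the tiny `vocal` track are invented for the example.

```python
import math


def pan_gains(pan):
    """Constant-power pan: -1.0 is hard left, 0.0 is center, +1.0 is hard right."""
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)  # (left gain, right gain)


def place_track(track, pan, level=1.0):
    """Send one mono track to the left and right channels at the panned levels."""
    left_gain, right_gain = pan_gains(pan)
    left = [s * level * left_gain for s in track]
    right = [s * level * right_gain for s in track]
    return left, right


vocal = [0.0, 0.5, 1.0, 0.5, 0.0]  # made-up track data
left, right = place_track(vocal, pan=0.0)

print(pan_gains(-1.0), pan_gains(0.0), pan_gains(1.0))
```

A centered track ends up equally in both channels, which is why it appears to sit between the speakers.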
Mastering
Next, the individual tracks are merged into channels. The overall tonality, level and balance of the recording are set. Whether a song is shrill, shouty or bassy is again completely under the control of the recording engineers.
Once they have a completed channel-based version of what they want you to hear, the final mixdown and master recording is ready for distribution. There will be one electrical signal per channel. Like this:
This is also when the special requirements for various distribution media are set into place. For vinyl records they will apply RIAA equalization; for tape, a pre-emphasis curve is used. Digital media generally does not require special care.
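For the curious, the shape of the RIAA curve can be worked out from its three standard time constants (3180, 318 and 75 microseconds). The Python sketch below is an idealized calculation only, with function names invented for the example. It ignores everything a real cutting lathe or phono preamp adds on top.

```python
import math

# Standard RIAA time constants, in seconds.
T1, T2, T3 = 3180e-6, 318e-6, 75e-6


def playback_gain_db(frequency_hz):
    """Gain of the idealized playback (phono preamp) curve, in decibels."""
    w = 2 * math.pi * frequency_hz
    magnitude = math.sqrt(1 + (w * T2) ** 2) / (
        math.sqrt(1 + (w * T1) ** 2) * math.sqrt(1 + (w * T3) ** 2))
    return 20 * math.log10(magnitude)


def recording_gain_db(frequency_hz, reference_hz=1000.0):
    """Gain applied when cutting the record, relative to 1 kHz.

    This is the mirror image of the playback curve, so the two cancel out.
    """
    return -(playback_gain_db(frequency_hz) - playback_gain_db(reference_hz))


for f in (20, 100, 1000, 10_000, 20_000):
    print(f"{f:>6} Hz: {recording_gain_db(f):+.1f} dB")
```

On the record, bass is cut by roughly 19 dB at 20 Hz and treble is boosted by roughly 20 dB at 20 kHz. The playback curve in your phono stage undoes both.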
Distribution
At this point our recording consists of one signal per channel. Each is a single highly complex waveform that changes over time. Now the problem is to distribute this to the masses for their enjoyment.
The final product is mass-produced and readied for its fans. Distribution can take any number of forms. Tape, vinyl, CDs and digital files have all been popular in their time. Now we can even stream audio from the internet.
Playback
When you play a recording, your system's first job is to decode and recover the recorded waveforms. This can be done by mechanical means for vinyl records, magnetic means for tape recordings, optical means for CDs and DVDs, or mathematical means for computer files and streams.
The goal is to recover and then amplify these recorded waveforms as accurately as possible.
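For computer files, "mathematical means" can be as plain as reading numbers back out of a file. The Python sketch below uses the standard library's `wave` module to pack a few made-up samples into WAV form and then recover them. The helper names and the 44,100 samples per second rate are assumptions for the example, and compressed formats such as MP3 or FLAC need a good deal more work than this.

```python
import io
import struct
import wave


def write_wav(samples, sample_rate=44_100):
    """Pack 16-bit mono samples into WAV bytes (stands in for a music file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


def read_wav(data):
    """Recover the sample rate and the list of sample values from WAV bytes."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.readframes(wav.getnframes())
        count = len(frames) // 2
        return wav.getframerate(), list(struct.unpack(f"<{count}h", frames))


original = [0, 1000, -1000, 32767, -32768]  # made-up sample values
rate, recovered = read_wav(write_wav(original))

print(rate, recovered)
```

What comes back is the same list of numbers that went in, which is the waveform the rest of the system will amplify.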
Amplifying the signals
Now we get to the reason that an entire musical performance has been reduced to one waveform per channel. It is a basic law of physics that any conductor (wire, circuit, etc.) can only have one voltage present at any given point in time. You can't pump twenty separate instruments through an amplifier channel, but you can push through a signal that represents the sum of all their sounds.
It is now up to your electronics to amplify this one signal per channel to a level that can drive your speakers to reasonable volumes.
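In idealized terms, amplification is just multiplication: every point on the waveform is scaled by the same factor. The Python sketch below assumes a perfect amplifier with no noise, distortion or power limit, and the function names and voltages are invented for the example.

```python
def db_to_voltage_gain(gain_db):
    """Convert a gain in decibels to a plain voltage multiplier."""
    return 10 ** (gain_db / 20)


def amplify(signal, gain_db):
    """Scale every point of the waveform by the same factor."""
    gain = db_to_voltage_gain(gain_db)
    return [v * gain for v in signal]


line_level = [0.1, -0.2, 0.05]           # made-up voltages from a source
speaker_level = amplify(line_level, 20)  # 20 dB is ten times the voltage

print(speaker_level)
```

The shape of the waveform is unchanged. Only its size grows.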
Making sound
Up to now we have been following an electrical signal. But we can't hear electrical signals, so we need the means to convert those signals into something we can hear. This is where our speakers come in.
A speaker is another type of transducer. It changes electrical signals into sound waves. It is basically a linear positioning servomechanism. With a cone or dome attached, it can push air and produce changes in air pressure.
When you apply a voltage with enough current, a speaker will move its cone in or out by a distance dictated by the voltage applied. Changing the voltage will change the position of the cone. If we do this rapidly enough we can move enough air to make sound waves. If we do it accurately enough we can reproduce highly complex sound waves like those in the waveforms in our recordings. This in turn causes your eardrums to vibrate much as they would when listening to real music.
In the end, the waveforms recorded with such care are really just a map of how to move the speaker cone. Your amplifier and other equipment simply give that signal enough power and then feed it to your speakers, which convert it to sound.
Now, for the first time in our journey through all these steps there is something you can actually hear.
What your system can't do
Zooming in on one channel of the song above so we can see a fraction of a second of the detailed waveform, we get this:
It is important to note that these signals represent the sum of all the sounds in their respective channels. Aside from manipulating tone and volume which are applied to the entire channel, your home system can't change the content of these signals. Editing things like soundstage or the sound of an individual instrument is simply not possible in real time.
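A tiny Python sketch shows why. The numbers are made up and `channel_sum` and `volume` are names invented for the example: two different sets of "instruments" add up to exactly the same channel signal, so nothing downstream can tell which set produced it.

```python
def channel_sum(*instruments):
    """Add the instruments point by point into one channel signal."""
    return [sum(values) for values in zip(*instruments)]


def volume(channel, gain):
    """A volume control scales the whole channel, and every instrument with it."""
    return [gain * v for v in channel]


# Two different (made-up) performances...
take_a = channel_sum([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
take_b = channel_sum([2.0, 2.0, 2.0], [2.0, 2.0, 2.0])

# ...that produce the same signal in the channel.
print(take_a == take_b, volume(take_a, 0.5))
```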
The truth is that your home system actually has no idea what it's amplifying. Its only job is to reproduce the recorded waveform as accurately as possible.
Summing up
As we established early on, music is the sum of several discrete sounds originating from multiple instruments. But that is not exactly what is happening in your home stereo system. What you hear is the result of electrical waveforms being applied to a speaker.
The good news is that when all this works right, you get a really enjoyable approximation of your favorite music.
Finally, if you are curious about the source for the music waveforms in this article, I had a little fun with AI and it came up with a song about a watermelon and a banana that fell in love. You can download Fruitful Love in a zip file.