We are often asked how to improve the sound of a set of speakers. There is no simple answer, because many factors affect the fidelity of your sound. Speaker placement, room acoustics and using a sufficiently powerful amp to run the speakers all affect how they will perform. While these do have an effect on the sound, the result ultimately depends on how refined or detailed the source content is and on the audio signal processing capabilities of your electronics.
So how does source content influence the sound quality? Let’s take an MP3 file, for example. I’m sure you’ve heard MP3s where the cymbals and hi-hats sound like nails on a chalkboard, the music lacks midrange fullness and the sub puts out random, untamed bass. This is due to a low-quality conversion from the original audio files to MP3. Another common place that we hear poor audio quality is on YouTube. YouTube has a limit on how large of a file can be uploaded for streaming. Generally, both the audio and video quality must be downgraded to fit the file size limit.
The first mass-produced music format was vinyl, which gave way to cassettes and then CDs. These days, digital music files, which are either downloaded or streamed, are quickly becoming the main way people listen to music. When Napster appeared in 1999 and the era of digital music kicked off in earnest, it also meant we had to shrink the file size to upload and download faster. But shrinking the files came at the cost of degrading the quality of the audio signal. I personally couldn’t stand the early MP3 sound quality, so I lugged around my double binder CD case for many years.
Audio quality doesn’t just start with your source content; it goes a few steps back to the recording process. For example, a sound that is poorly recorded by a microphone will have little to no chance of ever sounding good, no matter how many effects or how much EQ you apply. The recording process is tedious, and at each step the engineers need to make sure that the audio quality is preserved.
Here is a basic signal flow sequence to help understand how analog sound gets from the source to your speakers. Let’s use the human voice for this example, which of course is a true analog source.
- Voice (singing or dialog) is recorded using a microphone.
- The microphone uses a transducer to turn the sound into electrical energy.
- That electrical energy is passed into a recording interface using a microphone cable.
- The recording interface then analyzes the incoming signal and digitizes the waveform, which can then be further manipulated using recording software.
This is the first half of the process of recording analog sounds. At this point, you’re able to play back the recorded sound within the recording program; however, it hasn’t yet been exported to a file that can be shared.
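To make the digitizing step a little more concrete, here is a minimal Python sketch. It assumes a pure 440 Hz sine wave as a stand-in for the voice, plus CD-style settings (44,100 samples per second, 16 bits per sample). The function name `digitize` and its parameters are made up for illustration; a real recording interface does this in hardware.

```python
import math

def digitize(freq_hz, sample_rate_hz, bit_depth, duration_ms):
    """Sample a sine wave and round each sample to a signed integer level."""
    max_level = 2 ** (bit_depth - 1) - 1          # 32767 for 16-bit audio
    n_samples = sample_rate_hz * duration_ms // 1000
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # time of this sample in seconds
        analog_value = math.sin(2 * math.pi * freq_hz * t)   # between -1.0 and 1.0
        samples.append(round(analog_value * max_level))      # quantize to an integer
    return samples

# Hypothetical example: 10 ms of a 440 Hz tone at CD-style settings.
samples = digitize(freq_hz=440, sample_rate_hz=44100, bit_depth=16, duration_ms=10)
print(len(samples), "samples, first five:", samples[:5])
```

Each number in the list is one snapshot of the waveform. The sample rate decides how many snapshots are taken per second, and the bit depth decides how finely each one is measured.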
- Once all voices have been recorded, the engineer will go through the mixing process to blend the voices.
- After mixing, the engineer will need to export the track from the recording program, generally as a WAV file.
- The export process is where you can specify the quality of the sound (there are some terms listed below that describe the details of what happens during the export).
Once the track is exported to a file, you can import it into iTunes or share it on any other platform. Some media platforms have different import options, so if you want the best audio, be sure to check your import settings and choose the highest quality available. During the import and export process, the computer renders the file to a format that other platforms recognize and can play back. Below are some of the terms that go into the process of exporting.
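As a rough illustration of what an export involves, the sketch below writes one second of a test tone to a WAV file using Python's built-in `wave` module. The quality settings (44,100 Hz, 16-bit, mono), the helper name `export_wav` and the file name `tone_example.wav` are assumptions chosen for the example. Recording software exposes the same choices through its export dialog.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE_HZ = 44100   # assumed CD-style sample rate
BIT_DEPTH = 16           # assumed bits per sample

def export_wav(path, samples, sample_rate_hz=SAMPLE_RATE_HZ):
    """Write signed 16-bit mono samples to a WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)                  # mono
        wav_file.setsampwidth(BIT_DEPTH // 8)     # 2 bytes per sample
        wav_file.setframerate(sample_rate_hz)
        # Pack the integers as little-endian 16-bit values.
        wav_file.writeframes(struct.pack(f"<{len(samples)}h", *samples))

# One second of a 440 Hz tone at roughly half of full scale.
tone = [round(16000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ))
        for n in range(SAMPLE_RATE_HZ)]
wav_path = os.path.join(tempfile.gettempdir(), "tone_example.wav")
export_wav(wav_path, tone)
print("Wrote", wav_path)
```

Changing `SAMPLE_RATE_HZ` or `BIT_DEPTH` is the code equivalent of picking a different quality setting at export time.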
Audio Signal – This is an electronic representation of a soundwave. Typically, it is the recorded soundwave converted to either an analog or digital platform. Audio signals are measured by the decibel levels for each frequency of the soundwave.
Analog Signal – An analog signal is a constant stream of data. It continuously measures the form of voltage, current or charge changes. It is processed in real time and no fragment of the audio is left behind. Some analog sources would be a vinyl record, cassette tape, 8-track or a VHS tape.
Digital Signal – In its most basic form, a digital signal is coded using binary numbers. Because digital audio uses discrete values, it captures the analog signal as a series of tiny snapshots and leaves out what falls between them. The main digital sources are CDs, DVD/Blu-ray, MP3s and streaming services.
Sample rate – The number of samples of audio carried per second, measured in Hz or kHz (one kHz being 1 000 Hz). For example, 44 100 samples per second can be expressed as either 44 100 Hz, or 44.1 kHz. Bandwidth is the difference between the highest and lowest frequencies carried in an audio stream.
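One relationship worth knowing here is that a digital stream can only carry frequencies up to half of its sample rate (the Nyquist limit), so the sample rate sets the ceiling on bandwidth. The short sketch below works that out for a few common sample rates. The function name is just for illustration.

```python
def nyquist_limit_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2

for rate in (44100, 48000, 96000):
    limit = nyquist_limit_hz(rate)
    print(f"{rate / 1000:g} kHz sampling carries audio up to about {limit / 1000:g} kHz")
```

Human hearing tops out at around 20 kHz, which is why 44.1 kHz was considered enough for CDs.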
Bit Rate – The number of bits per second that can be transmitted along a digital network. For an audio file, it is the number of bits used to represent each second of sound. Bit rate is commonly measured in bits per second (bps), kilobits per second (Kbps), or megabits per second (Mbps). An MP3 that is compressed at 192 Kbps will generally preserve more of the original detail and sound better than the same track compressed at 128 Kbps.
24/96 – Usually refers to audio (music) discs that were created using the 2-channel DVD specification for audio (not the same as DVD-Audio). The name stands for 24-bit samples at a 96,000 Hz sampling rate. Provides a noticeable sonic improvement over the older CD audio specification. Most DVD players will play the 24/96 music discs.
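To put some numbers on the bit rate and 24/96 entries, the bit rate of uncompressed audio is simply sample rate × bit depth × number of channels. The sketch below, with names chosen only for illustration, compares CD audio (16-bit, 44.1 kHz) with 24/96 and shows roughly how much data an MP3 encoder has to discard to hit 128 or 192 Kbps.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Bit rate of uncompressed (PCM) audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000

cd_kbps = pcm_bit_rate_kbps(44100, 16)       # stereo CD audio
hi_res_kbps = pcm_bit_rate_kbps(96000, 24)   # stereo 24/96 audio

print(f"CD audio: {cd_kbps:.1f} Kbps")
print(f"24/96:    {hi_res_kbps:.1f} Kbps")

# How much smaller is an MP3 than the CD-quality original?
for mp3_kbps in (128, 192):
    ratio = cd_kbps / mp3_kbps
    print(f"A {mp3_kbps} Kbps MP3 is about {ratio:.1f} times smaller than CD audio")
```

The higher the ratio, the more of the original signal the encoder has to throw away, which is why the 128 Kbps file tends to sound worse.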
DSP – Digital Signal Processing. Used to alter a digital input signal. Some common examples include: time delay for the rear speakers, equalization for a subwoofer, filtering low frequencies out of satellite speakers and adding “effects” (like “concert hall”).
DAC – A Digital-to-Analog Converter. Converts a digital bitstream to an analog signal.
In the audio world we combine the power of both analog and digital signals. The more the signal gets compressed, or the more its sample rate or bit rate is reduced, the worse your audio file will sound. On the other hand, the higher the quality of the audio file and the better your electronics and equipment, the better your audio quality will be.
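Two of the DSP examples above, a time delay and filtering low frequencies out of small speakers, can be sketched in a few lines of Python. This is a simplified illustration and not how any particular receiver implements it: the high-pass filter is a basic first-order design, and the 80 Hz cutoff and 48 kHz sample rate are assumed values.

```python
import math

def delay(samples, delay_ms, sample_rate_hz):
    """Delay a signal by prepending silence, e.g. for rear speakers."""
    n_silent = round(sample_rate_hz * delay_ms / 1000)
    return [0.0] * n_silent + list(samples)

def high_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order high-pass filter: reduces content below cutoff_hz."""
    rc = 1 / (2 * math.pi * cutoff_hz)
    dt = 1 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = []
    prev_in = prev_out = 0.0
    for x in samples:
        y = alpha * (prev_out + x - prev_in)
        out.append(y)
        prev_in, prev_out = x, y
    return out

RATE = 48000  # assumed sample rate
bass = [math.sin(2 * math.pi * 20 * n / RATE) for n in range(RATE)]      # 20 Hz tone
treble = [math.sin(2 * math.pi * 1000 * n / RATE) for n in range(RATE)]  # 1 kHz tone

print("20 Hz level after filter:", round(max(high_pass(bass, 80, RATE)[-4800:]), 2))
print("1 kHz level after filter:", round(max(high_pass(treble, 80, RATE)[-4800:]), 2))
print("Length after 10 ms delay:", len(delay(treble, 10, RATE)))
```

The 20 Hz tone comes out much quieter than the 1 kHz tone, which is the behavior you want when protecting satellite speakers from deep bass.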
Receivers include many different modes for listening to music and movies. If you’re more of a music listener, then you probably find that listening to music in 2-channel pure direct mode is best. Here are some other examples of the audio signal processing options in most receivers.
The most popular format for movies and surround sound is Dolby Surround. In this mode the receiver will automatically detect the number of speakers and process the audio signal according to its source content. If there are more than 5 speakers, then Dolby Surround will upmix the content to allow for realistic multi-channel effects. The source content could be in one of the following modes:
- Dolby Digital
- Dolby TrueHD
- Dolby Digital Plus
- Dolby Atmos (Creates audio imaging in a 3D sound field. Use additional ceiling speakers to obtain this effect. This mode is not supported in a setup which is less than a standard 5.1 channel speaker configuration.)
With the Marantz SR8012, if you choose to use DTS Surround as your sound mode, the receiver operates similarly with the audio signal processing. It will play back according to the source content in the following modes:
- DTS ES Dscrt6.1 (Enhances 360-degree expressiveness. This mode can be chosen when “Speaker Config” menu has Surround Back set to “None.”)
- DTS ES Mtrx6.1 (Rear channels are added to surround left/right. This mode can be chosen when “Speaker Config” menu has Surround Back set to “None.”)
- DTS 96/24
- DTS-HD
- DTS Express
- DTS:X (Creates audio imaging in a 360-degree field. Use additional ceiling speakers to obtain this effect)
- DTS Neural:X (Creates audio imaging in a 360-degree field. Use additional ceiling speakers to obtain this effect)
When listening in Auro-3D sound mode, the receiver again detects the number of channels and determines the proper audio signal processing from these modes:
- Auro-3D (Uses a decoder to create 3D audio using height channels)
- Auro-2D Surround (Uses a decoder to create 3D audio without height channels)
Multi-Ch allows playback of PCM or DSD sources. Again, the receiver will determine the playback mode from these options:
- Multi-Ch Stereo (Stereo sound from all speakers)
- Virtual (Allows for expansive surround sound effects through just front left/right speakers)
Dolby Atmos – A 5-channel (or larger) system consisting of discrete left, center and right channels plus left and right surrounds. In most cases this setup will have at least 7 speakers, which typically include overhead speakers for the 3D effect. The most common setup for home theater is labelled as 7.1.4 (4 being the overhead speakers).
Dolby Digital – A five-channel system consisting of discrete left, center, right, left rear and right rear channels. It also has a separate subwoofer channel for the lowest frequencies known as LFE (low frequency effects).
Dolby Pro Logic – Rather than producing surround sound from 5+ discrete channels, as later surround sound formats like Dolby Digital do, the surround information is synthesized from a 2-channel source. Since it is often used as a default format (when a 2-channel source is sensed as the input), newer, improved versions are still being developed.
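To give a feel for how surround information can be pulled out of a 2-channel source, here is a heavily simplified sketch of a passive matrix decode: what the left and right channels have in common goes to the center, and what differs between them goes to the surrounds. This is an illustration of the general idea only. It is not Dolby's actual Pro Logic algorithm, which adds steering logic on top, and the function name is made up for the example.

```python
def passive_matrix_decode(left, right):
    """Derive center and surround signals from a stereo pair (simplified)."""
    center = [(l + r) * 0.5 for l, r in zip(left, right)]     # shared content
    surround = [(l - r) * 0.5 for l, r in zip(left, right)]   # differing content
    return center, surround

# Identical left/right content (like centered dialog) lands in the center only.
dialog = [0.5, -0.25, 0.75]
center, surround = passive_matrix_decode(dialog, dialog)
print("center:", center, "surround:", surround)

# Out-of-phase content (left is the inverse of right) lands in the surrounds only.
ambience_left = [0.5, -0.25, 0.75]
ambience_right = [-0.5, 0.25, -0.75]
center, surround = passive_matrix_decode(ambience_left, ambience_right)
print("center:", center, "surround:", surround)
```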
Dolby Surround – Older than Pro Logic, Dolby Surround has been superseded by later, better formats.
DTS Neo:X – An audio processing format in which a special chip, normally built into many 5.1 or 7.1 channel home theater receivers, analyzes all of the sonic cues of a non-encoded two-channel soundtrack mix (usually from an analog source) and distributes the sound into a home theater speaker setup.
Pure Direct – This mode is for playback with higher sound quality than in Direct playback mode. This mode turns off the main unit display and analog video circuit. Doing so suppresses noise sources that affect sound quality.
Each receiver has some or all of these sound formats, each of which adds its own color or flavor to the sound. A “colored” sound characteristic adds something not in the original sound. This comes in the form of a slight change to the EQ of the sound. Coloration is audible, but the result is often less accurate than the original signal that was produced during the recording process.
So there you have it, the beginner’s guide to audio signal processing. We recommend experimenting with the different sound formats to figure out which one sounds best to you.