Preparations for Mastering Audio
Composers often ask what to do before sending a track off to a professional mastering facility. Many mastering studios now make this information public in order to answer the most frequently asked questions. Unfortunately, to keep these explanations relatively simple, many corners are cut. This has led to many misconceptions that are echoed across the internet. On this page I will attempt to explain things in a bit more depth, hopefully killing some of the myths surrounding mastering in general.
Completeness, track order
First you have to tell the mastering studio what medium you're releasing on: Is it a CD, an LP or something else? Additionally, when sending several tracks to mastering for a CD, an LP or an online album of mp3 files, the track order should be known, and all tracks must be final and ready for mastering. This is because deciding how tracks are supposed to fade in and out, finding the correct pause between the tracks etc. is an essential part of the mastering process.
Disable your own mastering effects
It's quite common to have a few effects added to the master signal chain in order to spice things up a bit. It's both fun and inspirational to hear what happens to your music when equalizers, compression and limiters are applied. These are all essential parts of the mastering process. Because of this, mastering studios (well, proper mastering studios, that is) make sure to have extremely high quality EQs, compressors and limiters - and people who are experts at using them. They will do this much better than your own VST plugins ever could, and that's why mastering studios exist in the first place. So in order to get the most out of the mastering studio, make sure to disable anything you have on the master signal chain, unless it's something very unique (something they don't have) and essential to the track.
You have to remove everything on the master signal path, so that the master chain is completely empty.
Now, this might change the sound a lot in some cases. If you are afraid the mastering studio will not know what the track is really supposed to sound like, you can probably give the mastering people an additional mp3 of your own mastering (besides the actual .wav file), and say "this is what I had in mind", and they'll know how to do the same thing - just in a much better quality. This is what they do for a living after all, so trust them.
How to mix
So how should you mix things, if you are not supposed to process the master signal?
I would generally recommend aiming for a mix that sounds as good as possible without the mastering effects applied.
With a good mix your chances of success are much bigger. Without mastering effects applied, it will be easier to tell whether the various instrument volumes are correct or not - and the volume knob is still the most important mixing parameter. Following the general guidelines on how to make a good mix is recommended.
Some people like to use meters and, while this can certainly be helpful, trusting your ears is still more important.
Louder is not better
Many musicians are afraid their music is not "loud enough". You should keep in mind that even though consumers generally don't fiddle with any settings, they do use the volume knob. If the music is too loud or annoying, they will turn down the volume. If it's too low, they'll want more and turn it up. If you hope to sell more records by making it louder, you're simply joining the loudness war. You will be sacrificing a lot of quality in the attempt to boost sales. Whether this actually helps or not, I shall leave to marketing experts. This page is about sound quality, and with that in mind I can only say: The loudness war is like all other wars: There are no winners, only casualties.
So before sending something off to mastering, you have a painful but important decision to make: Do you want good or loud? You simply cannot have both.
What do more bits do?
Now, 16 bits will give you a signal-to-noise ratio of appx. 93 dB, which is usually enough for any situation. But because further processing is done in the mastering process, the noise introduced might be boosted enough to become audible in rare cases. If you can, use 24 bits. If that's not possible, you will be all right in most situations unless the mastering studio needs to add a lot of compression and extra treble. Many good records have been successfully mastered from 16-bit sources after all.
Never normalize 16-bit files. This will never improve sound quality, but rather add a tiny amount of noise. If you want to deliver 16-bit material, either normalize it while it is in 24-bit resolution or don't normalize at all. The mastering studio will adjust the individual level of all the tracks anyway, so in most cases, normalization is simply a waste of time. It's much better to make sure the output volume is correct when rendering out the mixdown.
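To make the point concrete, here is a minimal sketch (assuming numpy and a synthetic test tone, not any particular DAW) comparing the two orders of operation: normalizing at high resolution before quantizing to 16-bit, versus normalizing material that has already been quantized to 16-bit. The second path rounds twice, and the first rounding error gets amplified along with the signal:

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
x = 0.25 * np.sin(2 * np.pi * 1000 * t)   # a -12 dBFS test tone, high resolution

def snr_db(reference, test):
    noise = test - reference
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

gain = 1.0 / np.max(np.abs(x))            # gain needed to normalize to 0 dBFS
ideal = x * gain                          # the target signal

# Path A: normalize first, then quantize to 16-bit once.
a = np.round(ideal * 32767) / 32767

# Path B: quantize to 16-bit first, then normalize the 16-bit data (a second rounding).
x16 = np.round(x * 32767)
b = np.round(x16 * gain) / 32767

print("normalize, then quantize: %.1f dB SNR" % snr_db(ideal, a))
print("normalize a 16-bit file:  %.1f dB SNR" % snr_db(ideal, b))
```

On this synthetic tone the second path measures noticeably worse, because the original quantization error is amplified by the same 12 dB as the signal.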
Output volume
Render your track as loud as possible without ever hitting max (0 dB). The louder you can render without clipping, the better. If you do hit 0 dB (peak / overload), stop and try again at a lower volume. Clipped peaks are not okay, and this is so important to mastering studios that they would rather you deliver tracks 2-3 dB below max than have you get too close - especially since some software doesn't show peaks in a very reliable way. The idea is "better safe than sorry".
Now, if you actually like the sound of clipping the master signal, it's better to tell the mastering studio and/or show them examples, and have them do the clipping. Most likely they have some equipment that can clip in a much more awesome way than plain digital clipping.
[Waveform examples: too low, good, and too loud (the spikes are clipping).]
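If you want to double-check a rendered file before sending it off, something like the following sketch works (assuming the third-party soundfile library and a placeholder filename, mixdown.wav; it is a rough check, not a substitute for proper metering):

```python
import numpy as np
import soundfile as sf   # pip install soundfile

audio, sr = sf.read("mixdown.wav", dtype="float64")   # placeholder filename

peak = np.max(np.abs(audio))
peak_db = 20 * np.log10(peak) if peak > 0 else float("-inf")
print("Peak level: %.2f dBFS" % peak_db)

if peak >= 0.9999:
    # At or effectively at full scale - a fixed-point file this hot was probably clipped.
    print("Touching 0 dBFS - render again at a lower volume.")
elif peak_db > -2.0:
    print("Very close to full scale - consider leaving 2-3 dB of headroom.")
else:
    print("Headroom looks fine.")
```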
What does a higher sample rate do?
The theory books tell us that aliasing can be avoided if the Nyquist frequency is greater than the bandwidth of the sampled signal. This rule also applies when resampling to a lower sample rate. Because of this limitation we need to filter or interpolate away any frequencies above half of the resulting sample rate. There is no perfect solution on how to design this lowpass filter. Any kind of lowpass filter introduces various side effects, such as "ripples" in the frequency response, ringing or phase shifts. The steeper the filter, the worse the side effects. Typical real-life solutions try to make a sane compromise between how steep the filter can be, and how many side effects are acceptable.
Example: When sampling at 44100 Hz, a rather steep lowpass filter must be applied below the Nyquist frequency (22050 Hz). Because the roll-off is not infinitely steep, a realistic lowpass frequency could be something like 20000-21000 Hz. A slight phase delay will be introduced around this frequency, but as the upper audible limit of an adult with good hearing is something like 15-18 kHz, the result is acceptable.
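As an illustration of the trade-off, here is a sketch using scipy (not taken from any actual converter design) comparing a gentle anti-aliasing FIR filter with a much steeper one. The steeper filter preserves more of the top octave, but it needs many more taps, which means a longer-ringing impulse response and more delay:

```python
import numpy as np
from scipy import signal

sr = 44100  # sample rate; Nyquist is 22050 Hz

# A gentle lowpass (wide transition band) vs. a steep one (narrow transition band).
gentle = signal.firwin(101,  20000, fs=sr)
steep  = signal.firwin(2001, 21500, fs=sr)

for name, taps in (("gentle", gentle), ("steep", steep)):
    w, h = signal.freqz(taps, worN=8192, fs=sr)
    at_21k = 20 * np.log10(np.abs(h[np.argmin(np.abs(w - 21000))]) + 1e-12)
    # A longer filter rings longer and delays the signal more
    # (group delay of a linear-phase FIR is (N - 1) / 2 samples).
    print("%s: %d taps, %.1f dB at 21 kHz, group delay %.1f ms"
          % (name, len(taps), at_21k, (len(taps) - 1) / 2 / sr * 1000))
```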
Now, throughout the entire process of creating a piece of music, the signal is first recorded (which involves such a low-pass filter) and then it is typically processed numerous times when various effects are applied to the instruments. Many of these effects involve one or more resamplers, each applying yet another lowpass filter if downsampling to a lower number of output samples. And even though passing through such a lowpass filter once is not audible, passing through multiple is. The small errors introduced in each step add up to become a rather large error. To make things worse, many digital algorithms (such as equalizers) tend to misbehave a little bit in the upper octave or two. Again, by doubling the sample rate, the problem is moved up an octave, making it less audible.
The trick is to never have any bottlenecks in terms of sample rate. Record, process and output at a high rate (96000 Hz or higher), and all the well-known drawbacks of digital audio will be much less pronounced. The signal starts behaving and sounding more like analog audio.
By increasing the sample rate, the muffling and phase shifting introduced at the treble end are almost pushed out of the audible range.
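Here is a rough way to measure the cumulative effect. This is only a sketch: scipy's polyphase resampler stands in for the resamplers hidden inside plugins, and real tools use different filters, so the exact numbers will vary:

```python
import numpy as np
from scipy import signal

sr = 44100
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 20000 * t)   # a 20 kHz test tone at 44.1 kHz

x = tone.copy()
for _ in range(10):
    # One round trip 44.1 kHz -> 48 kHz -> 44.1 kHz, standing in for an
    # effect that internally resamples.
    x = signal.resample_poly(signal.resample_poly(x, 160, 147), 147, 160)

def rms(v):
    return np.sqrt(np.mean(v[1000:-1000] ** 2))   # skip the filter edges

print("20 kHz tone after 10 round trips: %.2f dB" % (20 * np.log10(rms(x) / rms(tone))))
```

Running the same experiment at a 96 kHz project rate moves the resampling filters roughly an octave up, far away from anything audible.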
Don't be alarmed, though: simply copying audio will not cause the quality to deteriorate. This degradation happens only with certain kinds of effects and software synthesizers. Here's a list of effect types that often contain resamplers or interpolators that are sensitive to the sample rate used:
Chorus, reverb, equalizers, anything that changes pitch (vibrato, pitch shifters, auto-tune), certain flangers, certain phasers, lowpass filters and highpass filters (especially when setting the frequency very high), drum machines, FM synthesizers, sample players and subtractive synthesizers (to various degrees.)
Finally I will demonstrate the difference by taking a short piece of classical music in three different formats:
mp3 (192 kbit/s) -> .wav 44100 Hz -> .wav 96000 Hz.
Then I slowed them down to half speed, so that the differences between the formats become easier to hear. This demo also shows that there really can be meaningful audio content above 20 kHz, and not just noise - even on an LP from 1975. No denoising or audio restoration has been applied.
Listen to the demo here
File formats
Refer to the specific mastering facility for their preferences. For .wav and .aiff files, 24-bit is usually the better choice. Despite the hype, 32-bit float files do not offer considerably better quality, and may even introduce other practical problems when exchanged between different computer systems:
Floating point can be stored in different ways that may or may not be compatible.
The technical benefit of 32-bit float over 24-bit is mainly that it is capable of representing audio above 0 dB. In other words, you will never have clipping with float! The effective resolution is 24 bits in both cases (32-bit float has a 24-bit mantissa). The ability to handle peaks above 0 dB is very useful within an audio application. However, having a common reference point for the maximum level serves as a very convenient standard, and prevents a situation where you never know what the maximum expected volume of a foreign .wav file is. A good example is that Winamp will normalize float .wav files if they go beyond 0 dB. (A float .wav file could theoretically be +100 dB or more.)
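A small numpy sketch of that one real difference: float keeps peaks above full scale, fixed point cannot. The signal and numbers here are made up purely for illustration:

```python
import numpy as np

sr = 44100
t = np.arange(sr) / sr
hot = 2.0 * np.sin(2 * np.pi * 440 * t)   # a mix that overshoots full scale by 6 dB

# As 32-bit float the overshoot survives and can simply be turned down later:
as_float = hot.astype(np.float32)
print("float peak: %.2f (%.1f dB over full scale)" %
      (np.abs(as_float).max(), 20 * np.log10(np.abs(as_float).max())))

# As 16-bit (or 24-bit) fixed point, anything above full scale must be clipped away:
as_int16 = np.int16(np.clip(hot, -1.0, 1.0) * 32767)
print("16-bit peak: %.2f (flat-topped)" % (np.abs(as_int16).max() / 32767.0))
```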
Don't use an mp3 file or a similar compressed (lossy) format for mastering. The things they have to do will boost the nasty artifacts caused by the compression, and the result will be horrible.
Behind the Scenes
So what happens once they receive your material? Here's an attempt to show a typical mastering process, even though there are no rules. The actual process is decided depending on the situation at hand.
- Your files are loaded into the DAW / mastering software.
- The files are arranged in their designated track order.
- Listening to the songs, getting to know the material.
- Editing away mistakes (unwanted clicks, missing audio and such.)
- Removing various kinds of noise or hum, if that's a problem.
- Checking / adjusting stereo width.
- In rare cases adding reverb (especially for acoustic music recorded the classic way.)
- EQing the tracks so that they fit each other.
- De-essing vocals if the S and T-sounds are too harsh.
- Adjusting the volume of the tracks so that they fit together in mood (this is why normalization is not really important.)
- Compression or perhaps multi-band compression.
- Limiter or saturation or a combination of these.
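Just to illustrate the difference between plain digital clipping and the saturation mentioned in the last step, here is a toy tanh saturator in numpy. This is only a sketch; real mastering limiters and saturators add lookahead, release shaping and much more:

```python
import numpy as np

def hard_clip(x):
    # Plain digital clipping: flat tops and harsh high-order harmonics.
    return np.clip(x, -1.0, 1.0)

def soft_saturate(x):
    # tanh saturation: roughly unity gain for quiet material, but peaks are
    # rounded off gradually and can never exceed full scale.
    return np.tanh(x)

t = np.arange(44100) / 44100
loud = 1.5 * np.sin(2 * np.pi * 220 * t)    # a signal that overshoots full scale
print("hard clip peak:", np.abs(hard_clip(loud)).max())
print("saturated peak:", np.abs(soft_saturate(loud)).max())
```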
Output media specifics
The above rules vary depending on the desired output media. The mastering studio will have to know which output format(s) you're going to create, because each of these has different requirements that have to be met. Here are the rules and limitations of some of them:
CD:
- Sample rate is 44100 Hz.
- Bit-depth is 16-bits (see the conversion sketch after this list).
- Track and index marks must be set.
- ISRC codes may be applied as well.
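For the curious, here is a sketch of what the final conversion to CD format can look like, assuming scipy, the soundfile library and a placeholder 96 kHz master called master_96k.wav. A real mastering studio will use far better sample-rate converters and noise-shaped dither; this only shows the two steps named above:

```python
import numpy as np
from scipy import signal
import soundfile as sf   # pip install soundfile

x, sr = sf.read("master_96k.wav", dtype="float64")   # placeholder 96 kHz master
assert sr == 96000

# 1) Sample-rate conversion 96000 -> 44100 Hz (the ratio reduces to 147/320).
y = signal.resample_poly(x, 147, 320, axis=0)

# 2) TPDF dither (+/- 1 LSB, triangular) before rounding down to 16-bit.
tpdf = (np.random.uniform(-0.5, 0.5, y.shape) +
        np.random.uniform(-0.5, 0.5, y.shape))
cd = np.clip(np.round(y * 32767 + tpdf), -32768, 32767).astype(np.int16)

sf.write("master_cd.wav", cd, 44100, subtype="PCM_16")
```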
MP3 and AAC:
- Sample rate is usually 44100 Hz.
- Bit-depth is 16-bits.
- Proper encoder settings should be found (bitrate, lowpass filter setting etc.)
- These files are often listened to on headphones, so sounding pleasant on headphones has high priority.
Vinyl LP/EP:
- Sample rate should be as high as possible.
- Bit-depth should of course be as high as possible, but is not really crucial, as the signal-to-noise ratio of vinyl is not superb anyway.
- Equalization might be adjusted a tiny bit in order to counter for the general degradation of an analog transfer, and also for wear and tear of the LP over time.
- The bass should be mono. Even though 300 Hz (CZ Media etc.) is a fairly good guess, there are huge differences between vinyl pressing facilities in how low stereo content can be accepted. The mastering studio should know this (a mid/side sketch follows after this list).
- Sub-sonics are not allowed. The mastering studio should make sure there's nothing below appx. 20 Hz.
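One common way to mono the bass is mid/side processing: high-pass the side channel so that everything below the crossover is identical in both channels. Here is a sketch using scipy and the soundfile library; the 300 Hz crossover and the file names are placeholders, and the pressing plant's requirements take precedence:

```python
import numpy as np
from scipy import signal
import soundfile as sf   # pip install soundfile

# "vinyl_premaster.wav" is a placeholder stereo file; 300 Hz is only the
# ballpark figure mentioned above - the pressing plant has the final say.
x, sr = sf.read("vinyl_premaster.wav", dtype="float64")
left, right = x[:, 0], x[:, 1]

mid = (left + right) / 2
side = (left - right) / 2

# Remove stereo information below the crossover by high-passing the side channel.
sos = signal.butter(4, 300, btype="highpass", fs=sr, output="sos")
side_hp = signal.sosfiltfilt(sos, side)   # zero-phase, so mid and side stay time-aligned

out = np.column_stack([mid + side_hp, mid - side_hp])
sf.write("vinyl_monobass.wav", out, sr)
```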
Old-school TV broadcast (example):
- Hi8 is 32000 Hz, 8-bits. U-matic is analog. Both add a fair amount of white noise, so try to keep things loud enough to cover that.
- Watch out for unwanted 15625 Hz signals which are often emitted by CRT monitors and TV equipment, and tend to sneak into the audio track.
- Be conservative about bass deeper than appx. 70 Hz, as most people will not hear this anyway, and you want the rest of the signal as loud as possible.
- The signal must be kept mono compatible because many older TVs are mono (a quick check is sketched after this list).
- Remember 2 minutes of color bars with a preceding 1 kHz tone at 0 dB for reference.
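A quick way to check mono compatibility is to sum the two channels and compare the level against the stereo original; heavily out-of-phase material largely cancels in the sum. A sketch, assuming the soundfile library and a placeholder filename:

```python
import numpy as np
import soundfile as sf   # pip install soundfile

x, sr = sf.read("tv_mix.wav", dtype="float64")   # placeholder stereo file
left, right = x[:, 0], x[:, 1]

mono = (left + right) / 2
stereo_rms = np.sqrt(np.mean(left ** 2 + right ** 2) / 2)
mono_rms = np.sqrt(np.mean(mono ** 2))

# A drop of more than a few dB when summed to mono is a warning sign.
print("Level change when summed to mono: %.1f dB" %
      (20 * np.log10(mono_rms / stereo_rms)))
```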
Surround sound:
- I do not feel I have the necessary expertise on this subject yet. More info might be added some day.
Multi-band compression pros and cons
While multi-band compression might seem like the obvious solution to a lot of problems, it has many drawbacks that tend to sneak up on anyone trying to use it. The immediate improvement in richness has tricked many mastering engineers into believing they had improved a track, when new problems and artifacts were actually introduced as a result. There can be many reasons for this. The most obvious one is the sheer complexity of such compression. (Becoming really good at using a single-band compressor is hard enough already; now try spotting whether the release time is too long or too short in a specific frequency band. It's just not easy.)
Another problem is that most multi-band compressors use IIR (digital or analog) or convolution filters (always digital) to split frequencies. This introduces a considerable phase change at each cross-over frequency - something that is not always easy to spot, but makes the mix more muddy. The only way to avoid this is to live with a huge latency (half a second or more by using FFT), and even then, the problems at the cross-over frequencies are not entirely gone (except when both bands do not compress at all.)
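To see the crossover phase shift in isolation, here is a sketch using scipy and a textbook 4th-order Linkwitz-Riley split (only one of many possible crossover designs). Even when the two bands are summed straight back together with no compression at all, the result behaves like an allpass with a large phase rotation around the crossover frequency:

```python
import numpy as np
from scipy import signal

fs = 44100
fc = 2000   # an arbitrary crossover frequency for the demo

# A 4th-order Linkwitz-Riley split: each band is two cascaded
# 2nd-order Butterworth filters.
lo = signal.butter(2, fc, btype="low", fs=fs, output="sos")
hi = signal.butter(2, fc, btype="high", fs=fs, output="sos")

impulse = np.zeros(4096)
impulse[0] = 1.0
low_band = signal.sosfilt(lo, signal.sosfilt(lo, impulse))
high_band = signal.sosfilt(hi, signal.sosfilt(hi, impulse))

# Sum the two bands straight back together - no compression at all.
summed = low_band + high_band
w, h = signal.freqz(summed, worN=4096, fs=fs)
idx = np.argmin(np.abs(w - fc))
print("Magnitude at the crossover: %.2f dB" % (20 * np.log10(np.abs(h[idx]))))
print("Phase at the crossover: %.0f degrees" % np.degrees(np.angle(h[idx])))
```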
The benefits of multi-band compression are obvious, and sometimes outweigh the drawbacks enough to justify using it.
Deciding when and how to use multi-band compression takes a lot of skill.