Lies, Damn Lies, and Dithering

Baphometrix explains why YouTube has been lying to you about dithering

Baphometrix is a Producer-DJ specializing in festival-oriented bass music genres (with some hip-hop on the side). She is a student of ill.Gates and a member of Producer Dojo’s Class of 808.

In this article, I shed some light on a subject that many well-meaning YouTube videos about music production get completely wrong when they mention it: dithering. As I did in my last article on the subject of loudness, I’m going to start with some assertions that might surprise you, but stick with me before concluding I’m wrong:

1 – If you are exporting a track from your DAW (or bouncing a track inside your DAW) to any bit depth other than 32-bit float, you should always dither during that bounce or export.

2 – Dithering noise is always preferable to the truncation distortion caused by exporting/bouncing to a fixed-point bit depth without dithering.

3 – The best dithering algorithm is nearly always simple triangular (TPDF) with NO noise-shaping.

4 – You should always give your mastering engineer a 32-bit float pre-master. And you should always give your collab partner 32-bit float stems. But if you’re creating a sample pack for sale or a stem pack for a remix competition, you should always export those samples or stems at 24-bit with triangular dithering. (Unless you’re making samples to load into specific hardware synths that support only 16-bit samples.)

5 – If you’re exporting (or converting) songs to load into CDJs for live mixing, you should probably avoid 24-bit exports and instead export at 16-bit with triangular dithering.

Do any of these surprise you? Read on to understand why!

What exactly is dithering and why should you care?

To help understand all the stuff I’m going to say about dithering in this article, I highly recommend that you first spend 7 minutes to watch this video from Ian Shepherd, a well-known mastering engineer.

Dithering hiss is ALWAYS better than truncation distortion

As demonstrated in Ian Shepherd’s video, the tiny amount of hissy white noise added by dithering is FAR quieter, and far less unpleasant, than the grainy, gritty, warbly distortion added by truncation. It’s a simple fact that you could bounce an audio file dozens of times in a row to 24-bit fixed point with dithering, and the total amount of noise added across those bounces would still be far less than the amount of analog circuit noise found in the best masters of the pre-digital era.
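The arithmetic behind that claim can be sketched in a few lines. This is a back-of-the-envelope estimate, not a measurement: it assumes each bounce adds triangular dither plus ordinary rounding error, that the noise from successive bounces is uncorrelated (so the noise powers simply add), and that full scale spans -1.0 to +1.0. The function name is made up for illustration.

```python
import math

def tpdf_noise_floor_dbfs(bits: int, bounces: int = 1) -> float:
    """Estimated RMS noise, in dB relative to a full-scale sine, after
    `bounces` successive dithered fixed-point bounces (assumptions above)."""
    lsb = 2.0 ** -(bits - 1)               # step size when full scale is -1.0..+1.0
    # Per bounce: triangular dither power (lsb^2 / 6) plus rounding error power (lsb^2 / 12)
    noise_power = bounces * (lsb ** 2) / 4.0
    sine_power = 0.5                       # mean power of a full-scale sine wave
    return 10.0 * math.log10(noise_power / sine_power)

one_bounce_24 = tpdf_noise_floor_dbfs(24)        # about -141 dB
fifty_bounces_24 = tpdf_noise_floor_dbfs(24, 50) # about -124 dB
one_bounce_16 = tpdf_noise_floor_dbfs(16)        # about -93 dB
```

Under these assumptions, even fifty dithered 24-bit bounces leave the noise floor far below the signal-to-noise figures of roughly 60 to 70 dB that are commonly quoted for analog tape.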

The well-meaning but wrong advice to “never dither except during the very last master export” is based on the intuitive notion that adding noise is somehow bad. But ask yourself what all those great analog plugin emulations of famous channel strips and preamps and EQs and Compressors are doing to your pristine digital audio. They’re adding tons of harmonic distortion and noise. The right amount of floor noise simply adds warmth and pleasant, musical coloration to a dry, sterile, digital sound. Why do you think there are “Noise” oscillators in synths like Serum? Why do you think we routinely add a white noise sample to our layered kicks and snares, to our layered heavy mid basses, and to many of our layered synths like big supersaws? We do it because even a noticeable amount of added noise sounds good and serves a useful role in filling out the spectral balance of an entire mix.

You also have to consider the source of such well-meaning but wrong advice. As I pointed out in my article last month about loudness, there are a lot of engineers who really do not understand how to mix and master the heavier, louder genres of electronic dance music. An engineer who normally works with quieter genres is doing a relatively light amount of compression and limiting, which means they can’t really hear the difference between dithering noise and truncation distortion. Why? Because low-level dithering noise and truncation distortion artifacts are way down near the noise floor in modern 16- and 24-bit audio and are both effectively inaudible in quieter genres.

But louder genres like electronic dance music still need to be mastered fairly loud, which means these genres have a relatively small dynamic range. Ask yourself what happens when you reduce the dynamic range of a song by a large amount. That’s right… All the quieter sounds in the mix become relatively much louder! This is why you typically have to re-level your mix after you squash it up to typical “loudness war” target levels.

And the problem is that truncation distortion can indeed become AUDIBLE in songs that are squashed to loudness war levels. But dithering noise added even to multiple successive 24-bit bounces is so incredibly small that the dithering noise itself will not become audible at loudness war levels!

The Ian Shepherd video I linked above demonstrates this principle really well. Look at how prominently the 16-bit truncation distortion sits in the spectrum, versus the 16-bit dithered version that is nearly indistinguishable from the original signal! Listen to the sound of the truncation distortion in that video and ask yourself if you really want to hear an “edge” like that on your songs.
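If you want to see the same principle in numbers rather than in a video, here is a minimal sketch. It is a toy model under stated assumptions: it uses rounding to the nearest step rather than literal truncation, it works in units of one step (one LSB), and the “signal” is just a constant value sitting at 0.3 of a step. The helper names are invented for this example.

```python
import random

def quantize(x: float, lsb: float) -> float:
    """Snap x to the nearest multiple of lsb (no dither)."""
    return round(x / lsb) * lsb

def quantize_tpdf(x: float, lsb: float, rng: random.Random) -> float:
    """Add triangular noise spanning +/- 1 LSB, then snap to the grid."""
    dither = (rng.random() - rng.random()) * lsb
    return quantize(x + dither, lsb)

rng = random.Random(1)   # fixed seed so the run is repeatable
lsb = 1.0
quiet = 0.3 * lsb        # a detail smaller than one quantization step
n = 100_000

plain = [quantize(quiet, lsb) for _ in range(n)]
dithered = [quantize_tpdf(quiet, lsb, rng) for _ in range(n)]

mean_plain = sum(plain) / n        # the quiet detail is erased entirely
mean_dithered = sum(dithered) / n  # the detail survives, buried in noise
```

Without dither, every sample lands on zero, so the quiet detail is simply gone and the error is locked to the signal. With dither, the individual samples are noisy but their average comes back out at about 0.3 of a step, which is why the result is heard as hiss rather than distortion.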

ALWAYS dither when you PRINT sound to anything other than 32-bit floating point

One of the worst pieces of advice out there, and the most common, is that you should only ever dither one time, and that one time should only be when you’re cutting a final master WAV. This advice is spread around because it’s an easy rule of thumb to remember, and it’s the wrong rule of thumb because it’s easy to misunderstand what experts mean when they talk about “export” or “bounce”, etc.

The real rule here is simple: you should ALWAYS dither when you process a sound with fixed-point math, and conversely, you should NEVER dither when you process a sound with floating-point math. There is also a second nuance to this rule: you should only ever apply noise-shaped dither one time (if even then), to a final master that will never undergo any further processing.

And this real rule means the following things:

A – When you export a 16- or 24-bit master WAV (or a set of WAV stems) from your DAW, you should dither!

B – (rare) When you export a 32-bit fixed point master WAV (or a set of WAV stems) from your DAW, you should dither!

C – When you “bounce” or “freeze/flatten” or “resample” or “consolidate” any track or bus inside your DAW into another new track, you MIGHT need to dither, depending on the way your DAW sets the bit-depth of bounced/flattened/resampled tracks. (More about this just below.)

D – You should never bother to try any of those noise shaping options you see in your dithering tool unless you are burning a final master. And even then, you might want to simply ignore all of them and apply NO noise shaping to your final master. Why? I’ll talk about that in the section on noise-shaping below.

Do points C and D surprise you? These are both telling you that you might need to deal with dithering all along as you work on a project! Long before you’re ever ready to export a master! I say might because these two points depend on your specific DAW. I’ll go over the details about this in sections further below.

So here’s the deal. Every modern DAW worth actually using for music production by now performs all its internal processing at 32-bit floating point (FP) on every track, and probably uses 64-bit floating point math at every summing point (every group, every bus, and the “master” track). In a nutshell, your DAW can push a signal back and forth through 32-bit FP paths to 64-bit FP paths and back down to 32-bit FP all day long and the signal will suffer no change, distortion, or degradation.

The problem arises when you finally move that 32/64-bit floating point signal out of memory and print it to a WAV file in a track clip. This happens every time you do any of the following commands in your DAW: bounce, resample, flatten, or consolidate. Each of these operations basically prints a special type of sample WAV file and plops it into your project folder.

Outcome 1 – If the printed sample is 32-bit floating point, then no bit truncation happens, so no dithering is needed.

Outcome 2 – If the printed sample is 16-bit or 24-bit (both of which are always fixed point), then bit truncation is happening and therefore dithering is needed!

Outcome 3 – (rare) If the printed sample is 32-bit fixed point, then bit truncation is happening and therefore dithering is needed!

Another problem is that it’s not always obvious what format your DAW prints to when you use its bounce/freeze/flatten/resample/consolidate operations! For example, let’s examine what Ableton Live and Bitwig Studio do in this regard.

In Ableton, freeze/flatten/resample/consolidate always prints a 32-bit floating point sample. You have no other choices. Therefore, you don’t need to think about dithering at all unless you are exporting stems or masters out of Ableton.

In Bitwig, the default behavior for bounce/consolidate is to print a 24-bit non-dithered sample. It wasn’t until Bitwig 2.0 that they introduced the option to configure Bitwig to either apply dithering while printing those 24-bit samples, or to instead print all samples to 32-bit floating point so that no dithering is necessary.

Why do some DAWs, like Bitwig, allow you to bounce/consolidate to 24-bit samples? Well, in part to make it easier to create samples for commercial sample packs! It’s a PITA to do this in Ableton because you must manually Export every single clip out of your project to get those samples into a 24-bit format. By contrast, Bitwig enables you to just bounce a clip directly to dithered 24-bit and now your project’s sample folder has a sample that is ready to go into a sample pack with no further processing.

This 24-bit choice for bounced samples can also be desirable for keeping projects smaller in size. Think about how many times in a real project you consolidate clips, resample clips, do freezes or freeze>flatten, and so on. In Ableton, those 32-bit samples start bloating the size of the project folder pretty quickly. By contrast, Bitwig users can have smaller projects if they choose to use 24-bit bounces.

Should you dither your collab stems or pre-masters?

I’ve already made this essential point in previous sections, but it might not have stood out. So just to make it absolutely crystal clear:

Rule 1 – The only time you shouldn’t apply dithering during export is if you’re specifically handing off 32-bit floating point exports to your mastering engineer or collab partner. This is always the BEST course, but will require a lot more cloud space (and faster internet bandwidth) for the transfer because 32-bit floating point files are BIG.

Rule 2 – If you hand off a 24-bit or 16-bit or even 32-bit fixed point export to your mastering engineer, make sure you apply dithering during that export!

Rule 3 – If you hand off a set of 24-bit or even 32-bit fixed point stems to a collab partner, make sure you apply dithering during those stem exports!

Should you dither masters that you upload to streaming platforms?

Another bad piece of advice out there is that somehow dithering doesn’t matter if you’re going to convert a master to MP3 or AAC or upload it to SoundCloud, Spotify, or any platform that converts your WAV to a lossy format. I was saddened to see this advice even in the manual for FabFilter Pro-L2, and it’s wrong. Being a tech writer myself (for my income) I can guess at many typical reasons that this error found its way into the manual, but that’s neither here nor there. Just know that this advice is wrong.

It’s wrong because any lossy conversion already piles heaps of destructive changes on the original WAV file during conversion. Any formerly inaudible (or barely audible) truncation distortion in the original WAV simply becomes magnified during the lossy conversion. So uploading a non-dithered 24-bit WAV to SoundCloud (or your distributor for all the other major platforms and stores) just makes an already shitty result even shittier.

In a very similar way, it’s also bad advice that somehow very loud masters don’t need dithering “because the distortion and noise inherent in a really loud master essentially serves the same purpose as dithering noise.” Again, I can guess at reasons why bad advice like this gets started, but it’s still horseshit. Dithering noise is ALWAYS preferable to truncation distortion!

ALWAYS set up your DAW to use 32-bit floating point for internal bounces/consolidates

Ableton users can skip this section and move on. Ableton forces every new sample printed by Freeze/Flatten, Resample, or Consolidate to be a 32-bit floating format. You cannot change this behavior.

Other DAWs (such as Bitwig) give you more rope to hang yourself with. This is great if you know what you’re doing, but not so great for new producers who don’t understand the consequences. Even worse, the bounce/consolidation operations in Bitwig default to 24-bit undithered format! Hopefully someday they change this default, but until then, you need to tweak some setup options for Bitwig to make it print only 32-bit floating point samples when you use its Bounce, Bounce in place, and Consolidate commands.

Other DAWs? I dunno. You’ll have to dig into the documentation to figure it out for yourself. Regardless, your goal should be to ensure that every single time you somehow print sound to a new track/clip inside your DAW project, the underlying printed sample is in 32-bit floating point format!

Why do this? Because there are other huge advantages to 32-bit floating point besides the fact that it doesn’t need any dithering noise! Producers of electronic music, especially, tend to resample > process > resample > process > resample > process > etc. many sounds in our projects, as part of our creative sound design. You might be greatly degrading the quality of a sound if in each resample step you’re truncating the bit depth down to 24 bits, then blowing it up to 32-bit floating point in the next processing step, then truncating the bit depth back down to 24 bits again, and so on.

In other words, you’re adding TWO types of changes to the original sound with every resample bounce: one caused by the truncation down to 24 bits (plus the added dithering noise), and one change caused by the processing you add after that resample bounce. If instead you kept every resample bounce at 32-bit floating point, then the only changes happening to the original sound are those applied by each new stage of processing.
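Here is a rough simulation of that difference, offered as a sketch rather than a model of any particular DAW. It deliberately applies no processing between bounces, so it isolates only the change caused by the bounce format itself. Python’s 64-bit floats stand in for the DAW’s internal signal, a `struct` round trip stands in for printing a 32-bit float file, and the helper names are made up for the example.

```python
import math
import random
import struct

def to_float32(x: float) -> float:
    """Round-trip a value through 32-bit float storage."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_int24_dithered(x: float, rng: random.Random) -> float:
    """Snap to a 24-bit grid (full scale -1.0..+1.0) with triangular dither."""
    lsb = 2.0 ** -23
    dither = (rng.random() - rng.random()) * lsb
    return round((x + dither) / lsb) * lsb

def rms_error(reference, processed) -> float:
    """Root-mean-square difference between two equal-length sample lists."""
    total = sum((p - r) ** 2 for r, p in zip(reference, processed))
    return math.sqrt(total / len(reference))

rng = random.Random(7)
original = [0.5 * math.sin(2 * math.pi * i / 97) for i in range(2000)]

f32, i24 = original, original
for bounce in range(20):
    f32 = [to_float32(x) for x in f32]
    i24 = [to_int24_dithered(x, rng) for x in i24]
    if bounce == 0:
        f32_after_one = list(f32)
        i24_error_after_one = rms_error(original, i24)

i24_error_after_twenty = rms_error(original, i24)
f32_unchanged_after_first = (f32 == f32_after_one)
```

In this sketch the 32-bit float copy stops changing after the first bounce, because storing an already-32-bit value again is lossless. The dithered 24-bit copy picks up a fresh, tiny dose of noise on every pass, so its error keeps growing (still far below anything audible, but it is a second source of change on top of your processing).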

Do NOT use noise-shaped dithering anywhere but in a final master export, and maybe not even then!

“Noise shaping” is a technique that changes the type of white noise added during dithering, by pushing it up into the higher (and lower) frequencies of the spectrum (even way above 20 kHz in some cases). The goal of noise shaping is to make the dithering noise itself less audible in very quiet sections of the music. The theory behind noise shaping is “because there is less real signal competing with the noise floor, you might start noticing the ‘hiss’ of the noise floor itself in quiet places like fades”.
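For the curious, here is a minimal sketch of the simplest textbook form of noise shaping: first-order error feedback, where each sample’s quantization error is subtracted from the next sample before it is quantized. This is an illustration only. Commercial noise-shaping curves use more elaborate filters, and nothing here is taken from any specific product. All function names are invented for the example.

```python
import math
import random

def requantize_flat(samples, lsb, rng):
    """Triangular dither plus rounding, no noise shaping."""
    out = []
    for x in samples:
        dither = (rng.random() - rng.random()) * lsb
        out.append(round((x + dither) / lsb) * lsb)
    return out

def requantize_shaped(samples, lsb, rng):
    """Triangular dither plus rounding, with first-order error feedback."""
    out = []
    prev_error = 0.0
    for x in samples:
        wanted = x - prev_error          # compensate for the previous sample's error
        dither = (rng.random() - rng.random()) * lsb
        y = round((wanted + dither) / lsb) * lsb
        prev_error = y - wanted          # total error (dither + rounding) for this sample
        out.append(y)
    return out

def summed_error(original, processed):
    """Size of the summed error: a crude stand-in for error near 0 Hz."""
    return abs(sum(p - o for o, p in zip(original, processed)))

def error_power(original, processed):
    """Mean squared error across all frequencies."""
    return sum((p - o) ** 2 for o, p in zip(original, processed)) / len(original)

rng = random.Random(3)
lsb = 1.0
signal = [100.0 * math.sin(i * 0.01) for i in range(20_000)]

flat = requantize_flat(signal, lsb, rng)
shaped = requantize_shaped(signal, lsb, rng)

shaped_low = summed_error(signal, shaped)   # stays within about 1.5 steps
flat_power = error_power(signal, flat)
shaped_power = error_power(signal, shaped)  # higher than flat_power
```

The feedback loop forces the errors to cancel out over time, which removes noise from the low end of the spectrum. The price is that the noise is moved rather than removed: in this first-order scheme the total error power comes out at roughly double the unshaped value, concentrated at high frequencies.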

But here’s the real deal about noise-shaping: that dithering noise? It can certainly be audible in an 8-bit song, and to some young people with great ears maaaayyyybbbbeeeee in a 16-bit song (but really, probably 95% of people couldn’t hear it at all in a blind listening test). But in a 24-bit song? Nope. Nobody is going to ever hear the dithering noise even in the quietest fades. No matter how many times you resampled tracks during your production.

If you actually do hear any noise in a 24-bit song, it’s not the dithering noise! Instead, it was noise intentionally added by the producer to their sound design, or unavoidable analog noise from the original recording equipment or the analog mixing desks and other analog gear.

And here’s the other problem: you can get yourself in trouble by picking some noise-shaping technique without really understanding it. Why? Because nothing is free, and noise shaping trades off a slightly audible noise floor down in the lower frequency ranges for more energy in the transient peaks of your song! The end result is that your mastering limiter might be set to a ceiling of 0 dB, but the noise-shaped dithering applied after this final ceiling can push some transients as high as 0.5 dB above the ceiling again.

A third problem is that in very loud masters (for genres like Metal or EDM) the artifacts caused by the digital foldback of the noise-shaping’s added frequencies above 20 kHz can become audible (albeit very, very subtly). And in such cases, the different types of noise shaping you pick will audibly (but subtly) change the sound of your master (especially in the quieter sections and fade-ins/fade-outs).

A fourth problem comes into play when somebody tries to remix sections of your master into their own song. For example, it’s very easy nowadays for spectral processing tools to strip out a really clean vocal acapella from a full mix. Or to strip out really clean drums from a mix. You then chop out a kick and snare to use in your own song, or you use sections of that extracted vocal in your song. And so on.

Think about what’s happening here… your song is including some samples with noise-shaped dither baked into them! The artifacts and high amounts of noise pushed up into the 16 kHz+ region are going to be picked up and amplified by any further processing (and resample bounces) you might do to those extracted sounds. And then you’re going to slap more noise-shaped dither on top of your own final master? You’re just compounding error on error here.

So at the end of the day, you’re being more friendly to other producers down the road by not using noise-shaped dithering! And at the very least, you’re not making the problem even worse if you re-used bits and pieces from someone else’s master that had noise-shaped dithering applied!

Finally, every DAW and every mastering plugin seems to use a bewildering array of different noise shaping algorithms with funky names that don’t really suggest what they’re doing. And their doc might not even really describe the different noise-shaping algorithms very well.

So what to do? Which noise-shaping type should you choose for your final masters?

Rule 1 – If you trust your ears, monitors, and your room treatment, make sure you actually audition all the available noise shaping algorithms on every song you master. You’ll see that some noise-shaping types will just simply “sound better” than others.

Rule 2 – If you don’t trust your ears, etc. just use NO noise-shaping at all! There are many pro engineers who will argue that noise shaping is often more trouble than it’s worth, and that simple triangular dithering with NO noise shaping applied is always a good choice!

Rule 3 – Or just avoid the hype and the misinformation and don’t bother with noise-shaping at all. In a 24-bit master, the noise-shaping options won’t make any real practical difference at normal listening levels, so it’s better to just have an evenly distributed amount of dithering noise across the entire spectrum. Remember, the dithering noise at 24-bits is so incredibly tiny that you can barely see it on a spectrum analyzer, and you certainly can’t hear it!

ALWAYS use triangular (TPDF) dither, and NEVER use noise-shaping when sending 24-bit stems or pre-masters to other producers/engineers

In all the preceding sections I talk about dithering as a whole but haven’t yet mentioned which specific dithering algorithm you should use. The answer is simple: Always use “Triangular” (aka “TPDF”) dithering. Never use any other dithering type.

That said, some DAWs (and mastering plugins) sometimes obscure the line between the dithering algorithm and the noise-shaping algorithm. For example, Ableton, Ozone, etc. are all ridiculously ambiguous about this simple fact! Here’s how it really works:

Problem 1 – There are 3 different algorithms that generate the random noise used for dithering: Triangular (TPDF), Rectangular (RPDF), and Gaussian, where “PDF” stands for probability density function. Each algorithm results in a slightly different type of random noise.

Problem 2 – Some vendors like iZotope even try to make strange “hybrid” dithering algorithms that don’t actually create dithering noise, but instead try to move the truncation distortion out of the sensitive Fletcher-Munson ranges! Sheesh. All these hoops to jump through! And why? This is the era of 24-bit audio. 16-bit CDs are no longer a thing. And nobody is going to put an 8-bit master out anywhere, not even in game soundtracks!

Problem 3 – the noise-shaping algorithm simply applies an EQ filter to this dithering noise! (Mostly by pushing down the lower frequencies of the noise and pushing up the higher frequencies, in a way that matches Fletcher-Munson psychoacoustics.)

Got that? The dither function produces a certain-sounding (evenly-distributed) basic white noise, and then the noise-shaping function applies a filter to that basic noise.
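To show what the three noise types from Problem 1 look like in practice, here is a small sketch of how each one can be generated. These are the common textbook constructions, not the code inside any particular DAW or plugin. The widths used (one step for rectangular, two steps for triangular) are the usual conventions, while the Gaussian standard deviation of half a step is purely an assumption for this example.

```python
import random

def rectangular(rng: random.Random, lsb: float) -> float:
    """Rectangular (uniform) noise, one LSB wide."""
    return (rng.random() - 0.5) * lsb

def triangular(rng: random.Random, lsb: float) -> float:
    """Triangular noise, two LSBs wide: the difference of two uniform values."""
    return (rng.random() - rng.random()) * lsb

def gaussian(rng: random.Random, lsb: float) -> float:
    """Gaussian noise; the 0.5 LSB standard deviation is an assumed value."""
    return rng.gauss(0.0, 0.5 * lsb)

def variance(values) -> float:
    """Plain population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

rng = random.Random(11)
lsb = 1.0
n = 200_000

rect_var = variance([rectangular(rng, lsb) for _ in range(n)])  # near lsb^2 / 12
tri_var = variance([triangular(rng, lsb) for _ in range(n)])    # near lsb^2 / 6
gauss_var = variance([gaussian(rng, lsb) for _ in range(n)])    # near lsb^2 / 4
```

Each of these generators produces spectrally flat noise on its own. Any tilt toward the high frequencies comes from a separate noise-shaping stage applied afterward.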

Problem is, if you read the Ableton doc or the Ozone doc or the Fabfilter doc, etc., they all blur this simple line and make it seem like POW-R is different from MBIT+ is different from Triangular, etc. It’s all a bunch of ambiguous, confusing bullshit.

Then there’s the fact that POW-R is ancient. It might have been a nice noise-shaping to apply to triangular dithering back in the early days of 16-bit CDs, but everything is different now. There are more modern noise-shaping algorithms that work better for modern music types, different loudness targets, and 24-bit WAVs.

IMO, the bottom line is that you should either use a fairly modern noise-shaping curve or none at all. And if you don’t have a great room and great ears and know where and how to listen for the sound that a noise-shaping curve applies to your music, you should just avoid it all entirely.

In plain English, some noise-shaping types will make your songs a wee bit too harsh in some frequency ranges and/or too dull in other frequency ranges. At the end of the day, simple Triangular (TPDF) dithering with NO noise shaping will always be a “safe” choice with the fewest possible downsides.

There’s also another simple fact that is hard to catch when you read all the various product docs: IF noise-shaping is applied during dithering, it should only ever be applied in the very last dither that’s ever going to happen to your song! This means that if you are sending a pre-master to your mastering engineer, or if you’re sending your own self-master to another producer/engineer who’s going to put that self-master on a compilation album or mixtape, or if you’re sending exported stems to your collab partner to bring into their DAW project, you should NEVER apply noise shaping to the dithered exports you make for these purposes.

Therefore, at the end of the day, the best, safest approach is to simply ignore ALL of the noise-shaped dithering options in your DAWs and mastering plugins. Just use plain old Triangular (TPDF) dithering with NO noise shaping. Leave the question of noise shaping to the pro mastering engineers and just pretend noise shaping options aren’t there when you’re exporting masters or stems from your projects.

So, for example:

In Ableton, just choose Triangular and ignore all the other options.

In Bitwig, just choose Dither and don’t worry. Bitwig is smart. They use only simple Triangular dithering and don’t try to confuse you with other dithering or noise-shaping.

In Kazrog masterDither, just choose TPDF for the dithering type, None for the noise-shaping type, and Enabled for the auto-blanking behavior.

Yeah, but what about Ozone? I keep hearing about MBIT+. Is that a dithering type or a noise-shaping type? Well, it’s both. MBIT+ is just Ozone’s branded approach to dithering, and there are a LOT of confusing options inside. It’s kind of a black box in that it doesn’t make clear what the dithering and noise shaping options actually are! For example, they don’t tell you which dithering option is simple triangular (TPDF). (Spoiler: it’s most likely the Medium dither option.) Instead, they try to make it user friendly (and obfuscated) by not using the technical names of things.

My advice with Ozone 8 is therefore to use the Medium dithering option, to enable Auto-Blanking, and to turn noise-shaping Off per my comments above. My educated guess is that these settings equate to simple triangular dithering with no noise shaping.

A potential “gotcha” with 24-bit and CDJ decks

Some of my DJ and producer-DJ fam have reported that 24-bit WAVs can sometimes be problematic on CDJs and on the USBs you drag around to clubs and festivals. One of my more technically-inclined friends suspects that the problem is not related to 24-bit itself (since Pioneer says they support that bit depth on CDJs). Instead, he thinks it might be a combination of 24-bit AND going over 0 dB True Peak that makes some 24-bit WAVs seem to be unreadable in some CDJ decks.

At the end of the day, if you’re a touring DJ and you take USBs around to your shows to play on CDJ decks, it might be worth testing your 24-bit files locally, and it might also be worth having alternate 16-bit versions. Or just standardize on 16-bit for your USBs. Or just make sure you’re not going over 0 dB True Peak on your masters. Or any combination of the above.
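If you want to audit a USB stick before a gig, a small script can list which WAVs are not 16-bit. The sketch below uses only Python’s standard `wave` module, and the function names are made up for the example. Two caveats: `wave` reads plain PCM files only, so 32-bit float WAVs (and, on older Python versions, some 24-bit files written with an extended header) are reported as `None` rather than with a bit depth, and this checks bit depth only, not True Peak.

```python
import wave
from pathlib import Path

def wav_bit_depth(path):
    """Return bits per sample for a PCM WAV file, or None if it can't be parsed."""
    try:
        with wave.open(str(path), "rb") as wav:
            return wav.getsampwidth() * 8
    except (wave.Error, EOFError):
        return None

def find_non_16_bit(folder):
    """List (filename, bit_depth) for every .wav under folder that isn't 16-bit."""
    results = []
    for path in sorted(Path(folder).rglob("*.wav")):
        depth = wav_bit_depth(path)
        if depth != 16:
            results.append((path.name, depth))
    return results
```

Calling `find_non_16_bit` with the path to your USB drive returns a list you can work through. Note that the `*.wav` pattern is case-sensitive on some systems, so files ending in `.WAV` may need a second pass.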

Update – Jan 29, 2019

Ian Shepherd just put out another video explaining why you should always apply dithering every single time you print sound to anything other than 32-bit float. It’s a short 8-minute video and he uses null tests to demonstrate why dithering always sounds better than not dithering. Check it out!

Thanks for hanging with me until the very end! I know this is chewy stuff. (phew!) Next month I’ll bust a bunch of myths about sample rate. Stay tuned!