Does Youtube lowpass audio at 16k?
Response to Dan Worrall's video
In a recent video, Dan Worrall uploads a test audio file to YouTube to investigate an often-repeated claim that YouTube introduces a lowpass filter at 16kHz. This file, only about 9 seconds in length, contains a short linear frequency sweep as well as a period of noise. Dan then plays this video back at several quality levels and uses this as the basis of his analysis.
As a software engineer who has been playing with YouTube audio for quite some time, as well as a listener to YouTube's audio offerings (YouTube videos & YouTube Music - RIP Google Play), I too am interested in helping get to the bottom of this.
Background
YouTube allows videos to be played back at several different quality settings. Hopefully this statement is not a revelation to anyone, as it is nothing new; YouTube has been serving videos at multiple quality levels for almost two decades at this point. What is less widely known is that YouTube separates the audio and video and serves them separately. This is part of what allows YouTube Music to exist, and what lets those with YouTube Premium listen to videos in the background without consuming video bandwidth.
These separate audio formats, however, are relatively hidden from the user. The menu options available on a video itself only allow choosing between video quality levels. What's more, these formats are device-targeted, like the video formats. Many may not know that YouTube is in the middle of a format migration (and has been for the past several years). H.264 video with AAC audio (the age-old .mp4 that YouTube started with) is deprecated in favor of VP9 video with OGG-OPUS audio (using the .webm container). YouTube keeps transcoding to H.264 video and AAC audio for legacy devices that are unable to understand this newer format.
Available Formats
With that in mind, let's take a closer look at the formats available for Dan's test video. The full table has quite a bit of info in it, so this table is abbreviated to only show the relevant formats that contain audio.
| ID | Type | Video Res. | Filesize | Total Bitrate | Audio Codec | Audio Bitrate | Sample Rate | More Info | Sample |
|---|---|---|---|---|---|---|---|---|---|
| 139 | m4a | audio only | 52.66KiB | 49k | mp4a.40.5 | 49k | 22050Hz | low, m4a_dash | Play Sample |
| 249 | webm | audio only | 38.49KiB | 36k | opus | 36k | 48000Hz | low, webm_dash | Play Sample |
| 250 | webm | audio only | 64.68KiB | 61k | opus | 61k | 48000Hz | low, webm_dash | Play Sample |
| 140 | m4a | audio only | 137.55KiB | 130k | mp4a.40.2 | 130k | 44100Hz | medium, m4a_dash | Play Sample |
| 251 | webm | audio only | 107.79KiB | 102k | opus | 102k | 48000Hz | medium, webm_dash | Play Sample |
| 17 | 3gp | 176x144 | 85.87KiB | 81k | mp4a.40.2 | 0k | 22050Hz | 144p | Play Sample |
| 18 | mp4 | 640x360 | 566.35KiB | 535k | mp4a.40.2 | 0k | 44100Hz | 360p | Play Sample |
| 22 | mp4 | 1280x720 | ~ 1.72MiB | 1564k | mp4a.40.2 | 0k | 44100Hz | 720p | Play Sample |
Feel free to download each of these clips and load them in your analysis tool of choice; in fact, I encourage it. You should never take one person's analysis at face value, as there is always the possibility it is flawed.
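As a quick sanity check on the table, the file size and total bitrate of each audio-only format should imply roughly the same duration, about 9 seconds. Here is a minimal sketch in Python. It assumes that the `k` suffix means 1000 bits per second and that KiB means 1024 bytes; both are assumptions about how the figures are reported, and the figures themselves are rounded, so the results are approximate.

```python
# Rows copied from the table above:
# format ID -> (file size in KiB, total bitrate in kbit/s).
AUDIO_ONLY_FORMATS = {
    139: (52.66, 49),
    249: (38.49, 36),
    250: (64.68, 61),
    140: (137.55, 130),
    251: (107.79, 102),
}


def implied_duration_seconds(size_kib: float, bitrate_kbps: float) -> float:
    """Duration implied by a file size and an average bitrate.

    Assumes 1 KiB = 1024 bytes and 1 kbit/s = 1000 bit/s.
    """
    total_bits = size_kib * 1024 * 8
    return total_bits / (bitrate_kbps * 1000)


if __name__ == "__main__":
    for format_id, (size_kib, bitrate_kbps) in AUDIO_ONLY_FORMATS.items():
        duration = implied_duration_seconds(size_kib, bitrate_kbps)
        print(f"format {format_id}: ~{duration:.1f} s")
```

Under those assumptions, every audio-only format works out to a little under 9 seconds, which is consistent with the length of the test clip.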
Methodology
Each of these files has been downloaded from YouTube's servers without transcoding. The container has been fixed with ffmpeg (an automatic function of yt-dlp/youtube-dl) to correct for DASH container differences for portability, but this in no way changes the media data. These files were all downloaded with the yt-dlp Python tool, and the analysis was done with ffmpeg's showspectrumpic filter at its default settings. A linear frequency plot was chosen to increase resolution at the highest frequencies on the plot, as the question at hand is the cut-off point. Additionally, the frequency sweep performed by Dan was linear, and it shows up more nicely on a linear plot.
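For anyone who wants to reproduce this, the steps might look something like the sketch below. It only builds and prints the command lines rather than running them. The video URL is a hypothetical placeholder (substitute the real one), the output file names are arbitrary choices, and the exact invocations used for this article may have differed.

```python
import shlex

# Hypothetical placeholder: replace with the URL of the video under test.
VIDEO_URL = "https://www.youtube.com/watch?v=VIDEO_ID"


def download_command(format_id: int, url: str = VIDEO_URL) -> list[str]:
    """Build a yt-dlp command that fetches one format by its ID."""
    # -f selects the format ID from the table; -o sets the output name template.
    output_template = f"format-{format_id}.%(ext)s"
    return ["yt-dlp", "-f", str(format_id), "-o", output_template, url]


def spectrogram_command(media_path: str, image_path: str) -> list[str]:
    """Build an ffmpeg command that renders a spectrogram image.

    Uses the showspectrumpic filter with no options, i.e. its defaults.
    """
    return ["ffmpeg", "-i", media_path, "-lavfi", "showspectrumpic", image_path]


if __name__ == "__main__":
    # Print copy-pasteable shell commands for the 102k opus format (ID 251).
    print(shlex.join(download_command(251)))
    print(shlex.join(spectrogram_command("format-251.webm", "format-251.png")))
```

Running `yt-dlp -F` against the video URL lists the available formats, which is where the IDs in the table come from.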
Results
This brings us back to our question: does Youtube lowpass audio at 16k? The answer is a little complicated. Let's look at some spectrograms.
3gp - Legacy Mobile Video.
The worst and most laughable case is the 3gp video format,
encoded with a sample rate of 22.05kHz. This has a clear cut-off at approximately 8kHz,
which is unsurprising given the sample rate (the cutoff is ~3kHz under the Nyquist frequency of ~11kHz). This is an old
format, which is really only still around for legacy mobile and embedded platforms. I would
be quite surprised if it was used with any regularity.
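The Nyquist arithmetic here is simple enough to spell out. Below is a minimal sketch in Python, using the 22050Hz sample rate from the table and the approximate 8kHz cutoff read off the spectrogram. The function names are illustrative and not taken from any tool.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2


def headroom_hz(sample_rate_hz: float, cutoff_hz: float) -> float:
    """How far an observed cutoff sits below the Nyquist frequency."""
    return nyquist_hz(sample_rate_hz) - cutoff_hz


if __name__ == "__main__":
    sample_rate = 22050      # 3gp format, from the table
    observed_cutoff = 8000   # approximate, read from the spectrogram
    print(f"Nyquist: {nyquist_hz(sample_rate):.0f} Hz")                    # Nyquist: 11025 Hz
    print(f"Headroom: {headroom_hz(sample_rate, observed_cutoff):.0f} Hz")  # Headroom: 3025 Hz
```

The same two functions can be applied to the other formats by plugging in their sample rates and whatever cutoff you read from their spectrograms.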
MP4 - Legacy video, bundled with audio
These formats are bundles, where the video and the audio are encoded into the same container. These were the formats that used to power youtube back in the late 2000s and early 2010s. Because the audio and video are not separate in these formats, the links above show the whole video.
The sample rate of these clips is 44.1kHz, so a cut-off around 19-20kHz is what
would be expected. Here, however, we see a cut-off much closer to 15kHz on the 360p
format and 16kHz on the 720p format. Whether this is a result of a new video being run
through old formats, or whether really old videos also suffer from a low-pass this low,
is an interesting question, but outside the scope here.
webm / ogg-opus - Modern AV Codec.
Next up are the modern codecs. OGG-OPUS is renowned for its versatility and ability to maintain decent audio fidelity even at low bitrates. For this reason, it is beginning to supplant many other domain-specific formats in applications such as audio streaming, internet voice chat, digital radio telephony, and others.
These audio samples represent what the vast majority of users will experience, as newer web browsers, mobile apps, and even most newer smart TVs will utilize these formats. One potential exception is iOS, where Apple only implemented support for VP9 in iOS 14. Whether this would also prevent these devices from utilizing the higher quality opus audio is not easily answerable in a quick Google search.
In all three cases, we see a cutoff around 20kHz. Given the audio sample rate of 48kHz, it is a bit surprising at first that the cutoff isn't higher, and Dan doesn't mention what sample rate his uploaded file used. However, Opus's fullband mode is specified to code audio only up to 20kHz, even at a 48kHz sample rate, so this cutoff is most likely a property of the codec itself rather than of the source file or of a filter added by YouTube.
It is interesting to see the artifacts introduced by the heavy compression, especially
at the 36kbit level. There appears to be some sort of weird aliasing effect, but it's
not clear why this exists. Perhaps a result of extreme rounding in the cosine transforms?
m4a - Legacy Dash Audio.
These formats represent a transition point, supporting devices new enough to understand DASH,
thus being able to take advantage of audio quality switching irrespective of video quality,
but that don't understand the newer opus codec. Just as with the legacy mp4 format, where the
audio was baked into the media files, we see a 16kHz cut-off point.
Conclusion
The "myth" that YouTube introduces a low-pass filter around 16kHz is no myth, but fortunately for content creators and viewers alike, the vast majority of users avoid the formats where this is introduced.
If any further questions, corrections, or additions are warranted, feel free to email them to corrections@tyzoid.com.