Here's the thing: different platforms have wildly different ideas about how loud audio should be. Master for Spotify and suddenly your YouTube video sounds like a whisper. Miss Apple's target and your podcast gets rejected. Get it right, though, and your audio sounds exactly how you intended it on every platform. Wrong specs? The platforms themselves will "fix" it—and that almost always means degrading what you've carefully built.
Spotify (and Spotify for Podcasters)
Let's start with the most forgiving streaming platform. Spotify normalizes everything to -14 LUFS during playback, so master to that same target and keep your true peak below -1 dBTP. For uploaded files, hit at least 128 kbps, though 192-256 kbps is better if you can manage it. Spotify will take other sample rates, but 44.1 or 48 kHz is the sweet spot.
Here's what's happening behind the scenes: if you deliver something louder than -14 LUFS, Spotify turns it down automatically. If it's quieter, Spotify turns it up (unless your listeners have that feature disabled). Either way, it's doing extra processing. Master correctly to -14 LUFS and Spotify doesn't touch it.
The reality? Voice content doesn't need much bitrate. 128-192 kbps is genuinely fine for podcasts. Music or complex audio might need more, but if you're primarily speaking, you're good. MP3 is the safe, widely accepted choice; go with AAC if your upload path takes it and you want better quality at a lower bitrate.
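To see what those bitrates mean in practice, here is a minimal sketch of the arithmetic: file size is roughly bitrate times duration, divided by 8 to get bytes. The function name is made up for illustration, and the result ignores container overhead and metadata, so treat it as an estimate.

```python
def estimated_size_mb(bitrate_kbps: float, duration_minutes: float) -> float:
    """Approximate size of a constant-bitrate audio file in megabytes (10^6 bytes)."""
    total_bits = bitrate_kbps * 1000 * duration_minutes * 60
    return total_bits / 8 / 1_000_000


# Compare the bitrates mentioned above for a one-hour episode.
for kbps in (128, 192, 256):
    print(f"{kbps} kbps, 60 min: ~{estimated_size_mb(kbps, 60):.1f} MB")
```

A one-hour episode comes out around 58 MB at 128 kbps and roughly double that at 256 kbps, which is why spoken-word shows rarely bother with the higher settings.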
YouTube
YouTube's also -14 LUFS, but with a catch. Unlike Spotify, YouTube only normalizes in one direction. If you're over -14 LUFS, YouTube turns you down. If you're under, it doesn't necessarily boost you back up, so you just sound quieter than everyone else's videos. Nobody wants that.
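The difference between the two normalization behaviors is easy to sketch. This is a simplified model, not either platform's actual algorithm: the function name and the `allow_boost` flag are made up for illustration, and real players may also cap boosts to protect peaks.

```python
def playback_gain_db(measured_lufs: float,
                     target_lufs: float = -14.0,
                     allow_boost: bool = True) -> float:
    """Gain (in dB) a normalizing player would apply to reach the target."""
    gain = target_lufs - measured_lufs
    if gain > 0 and not allow_boost:
        # Turn-down-only behavior: quiet material is left as delivered.
        return 0.0
    return gain


# A loud master (-9 LUFS) is turned down by 5 dB under either policy.
print(playback_gain_db(-9.0, allow_boost=True))    # -5.0
print(playback_gain_db(-9.0, allow_boost=False))   # -5.0

# A quiet master (-18 LUFS) is boosted by one policy and ignored by the other.
print(playback_gain_db(-18.0, allow_boost=True))   # 4.0
print(playback_gain_db(-18.0, allow_boost=False))  # 0.0
```

The last line is the case to worry about: with no boost, a master delivered at -18 LUFS simply plays 4 dB quieter than content sitting at the target.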
The specs: -14 LUFS, -1 dBTP true peak, 48 kHz sample rate (matches video standards), 384 kbps or higher bitrate. Use AAC for that, since MP3 tops out at 320 kbps and can't reach the target. Why the higher bitrate? Because YouTube re-encodes whatever you upload, and you want to start with high-quality source material so that second round of compression doesn't degrade it too much.
Think about it this way: video audio is sitting alongside music intros, dialogue, effects, all packed into that video container. You need clean, well-spaced levels so nothing gets muddy when platforms do their encoding thing.
One more thing about video: make sure your audio actually syncs with your video. Nobody cares if your levels are perfect if the audio drifts out of sync with what people see on screen.
Apple Podcasts
Apple's stricter. They want -16 LUFS with a tolerance of ±1 LU (so basically -17 to -15 LUFS), a true peak of -1 dBTP or lower, a 44.1 kHz sample rate, and either MP3 or M4A format.
The bitrate? Mono voice can live at 96-128 kbps without quality issues. If you're doing stereo or higher-quality content, push to 128-192 kbps. Apple's actually pretty picky about validation—they'll reject submissions with clipping, distortion, or wildly incorrect levels. It's why hitting their target precisely matters more than with other platforms.
For most people making podcasts, you're recording mono voice. Mono at 128 kbps through Apple's system sounds good. You don't need stereo, and higher bitrates don't add much benefit when you're just talking.
Audiobook Platforms (ACX, Google Play, Kobo, Findaway)
Here's where it gets complex, because different platforms want different things. Fair warning: ACX doesn't currently accept AI-narrated content. If that changes, the specs below will still apply, and they're strict.
ACX/Audible wants an RMS level between -23 and -18 dB (ACX specifies RMS rather than LUFS), peaks no higher than -3 dB, a noise floor at or below -60 dB RMS, 44.1 kHz, and MP3 at 192 kbps constant bitrate (CBR, not variable). That noise floor requirement is brutal: any background hum or electrical noise and you fail validation.
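Here is a rough sketch of how those three numbers could be checked on raw samples. It treats the -23 to -18 window as a plain RMS level in dB relative to full scale, which is an assumption of this example and not a LUFS measurement, and it takes the noise floor from a separate stretch of room tone. All function names are made up, and the sine tones stand in for real narration.

```python
import math


def rms_db(samples):
    """RMS level in dB relative to full scale (samples in the range -1.0..1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def peak_db(samples):
    """Highest sample magnitude in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def check_audiobook_levels(narration, room_tone):
    """Compare measured levels with the audiobook limits quoted above."""
    return {
        "rms_in_window": -23.0 <= rms_db(narration) <= -18.0,
        "peak_ok": peak_db(narration) <= -3.0,
        "noise_floor_ok": rms_db(room_tone) <= -60.0,
    }


def tone(amplitude, freq_hz=441.0, rate_hz=44100, seconds=1.0):
    """Test signal: a sine wave, used here as a stand-in for recorded audio."""
    count = int(rate_hz * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / rate_hz)
            for i in range(count)]


print(check_audiobook_levels(tone(0.15), tone(0.0005)))
```

A real check would measure actual recordings and pick the quietest passage for the noise floor, but the dB math is the same.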
Google Play Books is more forgiving: -16 to -20 LUFS, -3 dB peak, 44.1 kHz, MP3 or M4A. They also require per-chapter files, so you're not delivering one monolithic 40-hour file.
Kobo sits somewhere between them. Similar loudness to Google Play, MP3 preferred, chapter files required.
Findaway Voices (which distributes to many platforms) wants -18 to -23 LUFS, -3 dB peak, MP3 or WAV. They're trying to meet ACX-adjacent specs since many audiobooks end up on multiple platforms.
The key insight? Audiobooks are quieter and more controlled than podcasts. That -3 dB peak limit sits 2 dB below the -1 dB ceiling podcasts use, which leaves noticeably more headroom. This is because audiobooks are long-form narration where consistency and low distortion matter more than loudness. Your listeners are sitting with headphones for hours, and platform normalization is less aggressive here.
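Decibels are logarithmic, so the gap between a -3 dB and a -1 dB ceiling is easier to picture as linear amplitude. A quick sketch of the conversion (the helper name is made up for illustration):

```python
def db_to_amplitude(db: float) -> float:
    """Convert a level in dB (relative to full scale) to a linear amplitude ratio."""
    return 10 ** (db / 20)


podcast_ceiling = db_to_amplitude(-1.0)
audiobook_ceiling = db_to_amplitude(-3.0)

print(f"-1 dB ceiling: {podcast_ceiling:.3f} of full scale")    # 0.891
print(f"-3 dB ceiling: {audiobook_ceiling:.3f} of full scale")  # 0.708
print(f"ratio: {audiobook_ceiling / podcast_ceiling:.3f}")      # 0.794
```

So the audiobook ceiling allows peaks about 79% as tall as the podcast ceiling does. It would take a drop of roughly 6 dB to halve the amplitude.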
Instagram, TikTok, and Short-Form Platforms
Instagram/Reels: -14 LUFS, 44.1 or 48 kHz, AAC in MP4. Short content, so the duration doesn't matter much—platform limits apply.
TikTok: Also roughly -14 LUFS, AAC format, duration limited by video length (typically 10 minutes max, but most creators stay under 3).
Amazon Alexa Flash Briefings: -14 LUFS, MP3 format, 48 kbps minimum bitrate, and a sample rate of 16, 22.05, or 24 kHz. That's ancient by modern standards, but it's what Flash Briefings support.
The pattern here? Short-form and streaming platforms standardize on -14 LUFS because they want consistent loudness across all content. Long-form content (audiobooks) goes quieter because people listen actively for extended periods.
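If you juggle several of these targets, it can help to keep them in one lookup table. The sketch below copies the loudness and peak figures from the sections above into a small data structure. The platform keys, class name, and the `slack_lu` parameter are all made up for illustration, and the slack value is an assumption, since only Apple's tolerance is stated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoudnessSpec:
    lufs_low: float     # quietest acceptable integrated loudness
    lufs_high: float    # loudest acceptable integrated loudness
    peak_max_db: float  # peak ceiling


# Figures copied from the sections above; the keys are hypothetical identifiers.
SPECS = {
    "spotify": LoudnessSpec(-14.0, -14.0, -1.0),
    "youtube": LoudnessSpec(-14.0, -14.0, -1.0),
    "apple_podcasts": LoudnessSpec(-17.0, -15.0, -1.0),
    "google_play_books": LoudnessSpec(-20.0, -16.0, -3.0),
    "findaway": LoudnessSpec(-23.0, -18.0, -3.0),
}


def within_spec(platform, integrated_lufs, peak_db, slack_lu=0.0):
    """True if a measured master fits the platform's loudness range and peak ceiling."""
    spec = SPECS[platform]
    loudness_ok = (spec.lufs_low - slack_lu
                   <= integrated_lufs
                   <= spec.lufs_high + slack_lu)
    return loudness_ok and peak_db <= spec.peak_max_db


print(within_spec("apple_podcasts", -16.4, -1.2))
print(within_spec("spotify", -14.2, -1.5, slack_lu=0.5))
print(within_spec("google_play_books", -18.0, -2.0))
```

The third call fails on the peak alone: -2 dB is fine for a podcast but over the -3 dB audiobook ceiling.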
How to Actually Export This Stuff
You've got two paths: use presets or do it manually.
The preset path is faster. Most professional audio software (including Vois, actually) includes export presets that handle all this automatically. You pick "YouTube" and the software sets -14 LUFS, 48 kHz, AAC, whatever's needed. You hit export. Done. The software does the loudness processing, sample rate conversion, encoding—all of it.
If you're building your own presets or exporting manually, you need a loudness meter. Real-time analyzers show you integrated LUFS (the overall loudness of the whole program), short-term LUFS (loudness over the last few seconds), and true peak (the highest level the reconstructed waveform reaches, including peaks that fall between samples). Master until your integrated LUFS sits at your target, true peak stays below the limit, and the audio sounds natural when you listen to it.
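One way to script the manual path is the open-source ffmpeg tool and its `loudnorm` filter. The article doesn't name a command-line tool, so this is an assumed choice, and the file names and function name are hypothetical. The sketch only builds the command. It doesn't run it.

```python
import shlex


def build_export_command(src, dst, lufs=-14.0, true_peak=-1.0,
                         sample_rate=48000, codec="aac", bitrate="384k"):
    """Assemble an ffmpeg command that normalizes loudness and encodes the result."""
    return [
        "ffmpeg", "-i", src,
        # loudnorm: I = integrated loudness target, TP = true peak ceiling
        "-af", f"loudnorm=I={lufs}:TP={true_peak}",
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-c:a", codec,            # audio codec
        "-b:a", bitrate,          # audio bitrate
        dst,
    ]


# Hypothetical file names, using the YouTube-style settings quoted earlier.
command = build_export_command("episode_master.wav", "episode_youtube.m4a")
print(shlex.join(command))
```

A single `loudnorm` pass adjusts gain dynamically, so it's worth measuring the exported file afterward and not assuming it landed exactly on target.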
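Here's the thing about loudness meters: they're measuring perceived loudness, not just signal level. Human hearing is more sensitive to some frequencies than others, so a loudness meter applies a frequency weighting before it averages the signal. That's what LUFS is: loudness units relative to full scale. A change of 1 LU does correspond to a 1 dB change in gain, but an LUFS reading and a peak or RMS reading in dB measure different things. Don't mix them up.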
Verification Steps
After you export, actually listen. Does it sound right? Does the voice sound natural or compressed/distorted? Are levels consistent start to finish? That's your first check.
Then verify the specs. Meter check: does your loudness sit at target? Does true peak stay under the limit? Good.
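If you measure with ffmpeg, adding `print_format=json` to the `loudnorm` filter makes it print a JSON report that a script can check. The sketch below assumes that workflow. The sample text is a hand-written stand-in with invented values (only the `input_i` and `input_tp` field names follow the filter's report), and the function names and the 1 LU tolerance are assumptions.

```python
import json

# Hand-written stand-in for the tail of ffmpeg's stderr; the values are invented.
SAMPLE_STDERR = """[Parsed_loudnorm_0 @ 0x0000]
{
    "input_i" : "-14.30",
    "input_tp" : "-1.42"
}
"""


def parse_loudnorm_report(stderr_text):
    """Extract integrated loudness and true peak from a loudnorm JSON report."""
    start = stderr_text.rfind("{")
    end = stderr_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no loudnorm JSON block found")
    raw = json.loads(stderr_text[start:end + 1])
    return {
        "integrated_lufs": float(raw["input_i"]),
        "true_peak_db": float(raw["input_tp"]),
    }


def meets_target(report, target_lufs, peak_limit_db, tolerance_lu=1.0):
    """True if loudness is within tolerance of target and peak is under the limit."""
    loudness_ok = abs(report["integrated_lufs"] - target_lufs) <= tolerance_lu
    return loudness_ok and report["true_peak_db"] <= peak_limit_db


report = parse_loudnorm_report(SAMPLE_STDERR)
print(report, meets_target(report, target_lufs=-14.0, peak_limit_db=-1.0))
```

A script like this only confirms the numbers. It doesn't replace the listening check.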
Finally, if you can, test on the actual platform. Upload a test episode or segment. Listen on their system. Does it sound like your master or does it sound degraded? Are the levels competitive with similar content on that platform? That tells you if your specs are actually right.
The Multi-Platform Problem
Here's the painful bit: if you need to distribute to multiple platforms with different requirements, you've got choices.
Option one: create separate masters for different platform groups. Master one at -14 LUFS for Spotify/YouTube, another at -16 LUFS for Apple, another at -20 LUFS for audiobooks. More work, perfect results.
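Option two: use the most restrictive spec and export for everyone from that. Mastering at -20 LUFS with a -3 dB peak fits the audiobook platforms, and Spotify will still take it (normalization will turn it up or it will just play quieter, but it won't be rejected). The catch is that -20 LUFS falls outside Apple's -17 to -15 window, so you sacrifice loudness on streaming platforms and can miss stricter podcast targets. In exchange, you only maintain one file.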
Option three: export once, then use platform-specific processing tools (many distribution services offer this) to adjust for each platform. You upload one file, they handle the conversions. You're trusting them, but it works.
Option two is the common fallback: master at -18 to -20 LUFS and distribute everywhere. You lose some presence on loud streaming platforms, but you only have one master to manage. It's the safe choice.
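To see what a single quiet master costs you, here is a small sketch that works out the gain each destination would need and where the peaks would land afterward. It assumes a plain static gain, which raises loudness and peaks by the same amount. The target names and function name are made up, and the figures come from the sections above.

```python
def plan_delivery(master_lufs, master_peak_db, targets):
    """For each target, compute the static gain needed and the resulting peak."""
    plan = {}
    for name, (target_lufs, peak_limit_db) in targets.items():
        gain_db = target_lufs - master_lufs
        peak_after = master_peak_db + gain_db
        plan[name] = {
            "gain_db": gain_db,
            "peak_after_gain_db": peak_after,
            # If the peak would cross the ceiling, gain alone is not enough.
            "needs_limiting": peak_after > peak_limit_db,
        }
    return plan


# Hypothetical target groups: (loudness target in LUFS, peak ceiling in dB).
TARGETS = {
    "streaming": (-14.0, -1.0),
    "apple_podcasts": (-16.0, -1.0),
    "audiobooks": (-20.0, -3.0),
}

for name, row in plan_delivery(-20.0, -3.0, TARGETS).items():
    print(name, row)
```

With a -20 LUFS master peaking at -3 dB, reaching -14 LUFS takes 6 dB of gain, which would push peaks to +3 dB. That's why louder versions need a limiter or a separate master and not just a volume bump.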
The final word: get your specs right and platforms have no reason to change your levels. They just encode your audio and play it. That's when everything sounds like you actually intended.