Meeting ACX Audiobook Requirements With AI-Generated Voices
TLDR: ACX has five hard technical requirements: RMS loudness between -23 dB and -18 dB, peaks below -3 dB, noise floor below -60 dB, room tone at head and tail, and MP3 at 192 kbps or higher. AI-generated audio clears four of these by default, and the fifth (noise floor) is actually easier to hit with AI than with human recordings. Here's the mastering chain and the export settings that work.
Frequently Asked Questions
Can I publish AI-generated audiobooks on ACX?
ACX updated its guidelines in 2023 to allow AI-narrated audiobooks, provided the publisher discloses the use of AI. You own the narration rights and can claim royalties the same way you would for a human-narrated audiobook. Check ACX's current submission terms before you publish, as the policies continue to evolve.
What is the ACX loudness requirement?
ACX requires RMS loudness between -23 dB and -18 dB, peak values not exceeding -3 dB, and a noise floor below -60 dB. Most publishers target -21 to -20 dB RMS to leave room for variation. Vois's ACX export preset targets these values automatically.
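As a rough illustration of how those three numbers relate to the audio itself, the sketch below measures RMS and peak level on a list of float samples (1.0 = full scale) and compares them with the thresholds quoted above. The function names and the idea of passing a separate room-tone segment for the noise-floor check are assumptions made for this example, not part of ACX's or Vois's tooling.

```python
import math
from typing import Sequence

# Thresholds as quoted in this article, in dB relative to full scale.
RMS_MIN_DB = -23.0
RMS_MAX_DB = -18.0
PEAK_MAX_DB = -3.0
NOISE_FLOOR_MAX_DB = -60.0


def to_db(amplitude: float) -> float:
    """Convert a linear amplitude (1.0 = full scale) to dB; silence maps to -inf."""
    return 20.0 * math.log10(amplitude) if amplitude > 0 else float("-inf")


def rms_db(samples: Sequence[float]) -> float:
    """RMS level of the samples in dB."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_square = sum(s * s for s in samples) / len(samples)
    return to_db(math.sqrt(mean_square))


def peak_db(samples: Sequence[float]) -> float:
    """Highest absolute sample value in dB."""
    if not samples:
        raise ValueError("samples must not be empty")
    return to_db(max(abs(s) for s in samples))


def check_levels(narration: Sequence[float], room_tone: Sequence[float]) -> dict[str, bool]:
    """Hypothetical helper: True for each requirement the audio meets.

    `narration` is the full chapter; `room_tone` is a speech-free stretch
    (for example the head or tail) used to estimate the noise floor.
    """
    return {
        "rms": RMS_MIN_DB <= rms_db(narration) <= RMS_MAX_DB,
        "peak": peak_db(narration) <= PEAK_MAX_DB,
        "noise_floor": rms_db(room_tone) < NOISE_FLOOR_MAX_DB,
    }
```

A dedicated loudness meter or ACX's own checking tools remain the reference; this only shows what the thresholds mean in terms of sample values.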
How long does a typical AI-narrated audiobook take to produce?
A 10-hour audiobook with AI voices takes 2 to 5 production days end to end: half a day for script formatting, 4 to 8 hours of generation time (running while you do other things), 1 to 2 days of review and regeneration of awkward passages, and half a day for mastering and export. A human-narrated equivalent runs 4 to 8 weeks.
Will ACX reject my submission if the narration sounds AI-generated?
Not if the audio meets the technical requirements and you disclose the AI narration. ACX's QA focuses on audio quality (loudness, noise, consistency), not whether a voice is human or synthetic. A well-mastered AI narration passes the same QA as a well-mastered human one.
What sample rate and format should my ACX audiobook be in?
ACX requires MP3 at constant bitrate 192kbps or higher, 44.1kHz sample rate, mono or stereo (mono is standard for narration). Each file should be a separate chapter, with room tone at the head and tail.
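The format rules above can be sketched as a simple pre-submission check. `FileSpec` and `format_problems` are hypothetical names invented for this example; the sketch assumes you have already read each chapter file's metadata with a tool of your choice, and it does not inspect the audio itself.

```python
from dataclasses import dataclass


@dataclass
class FileSpec:
    """Hypothetical container for metadata read from one exported chapter file."""
    codec: str              # e.g. "mp3"
    bitrate_kbps: int       # e.g. 192
    constant_bitrate: bool  # True for CBR, False for VBR
    sample_rate_hz: int     # e.g. 44100
    channels: int           # 1 = mono, 2 = stereo


def format_problems(spec: FileSpec) -> list[str]:
    """Return a list of format problems; an empty list means the spec looks fine."""
    problems = []
    if spec.codec.lower() != "mp3":
        problems.append("codec must be MP3")
    if not spec.constant_bitrate:
        problems.append("bitrate must be constant (CBR)")
    if spec.bitrate_kbps < 192:
        problems.append("bitrate must be 192 kbps or higher")
    if spec.sample_rate_hz != 44100:
        problems.append("sample rate must be 44.1 kHz")
    if spec.channels not in (1, 2):
        problems.append("audio must be mono or stereo")
    return problems
```

For example, `format_problems(FileSpec("mp3", 128, False, 48000, 1))` reports the bitrate, CBR, and sample-rate issues in one pass, which is handy when checking a whole folder of chapter files before upload.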
Written by
Vois Team
Product Team
The team behind Vois, building the future of AI voice production.