Getting Started with Voice Blending

Vois Team
October 24, 2025
6 min read

TL;DR: Blend 2-4 voices with weight ratios to create unique voices. Start with a 60/40 or 70/30 blend of complementary voices, then adjust the weights based on which characteristics you want to emphasize.

Here's the thing: you've probably listened to plenty of AI voices, and they're good. But sometimes you want something more specific. A voice that's warm but professional. A narrator with just a hint of British charm without the full accent. A character that feels completely unique.

That's where voice blending comes in. It's genuinely one of the most fun parts of voice production, and it's way simpler than it sounds.

The Basic Idea

Instead of choosing one voice from our library, you blend two to four voices together. You decide how much of each voice to use, and Vois mixes them into a single unified voice. It's like making a custom paint color—add more red, less blue, a hint of yellow. Except for voices.

When you blend af_heart:0.6 + bf_emma:0.4, you're saying "take 60% of af_heart's warmth and combine it with 40% of bf_emma's clear British precision." The result? Something that sounds natural and completely yours.

The percentages—we call them weights—are what make this work. They tell the system which voice's personality should shine through more.
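
If you keep blends in notes or scripts, it can help to turn that "voice:weight + voice:weight" notation into data. Here's a minimal sketch in Python. The function name parse_blend is made up for illustration and is not part of any Vois API; it only assumes the notation used in this post.

```python
def parse_blend(spec: str) -> dict[str, float]:
    """Parse 'af_heart:0.6 + bf_emma:0.4' into a {voice_id: weight} mapping.

    Hypothetical helper: it reads the notation used in this post,
    not an official Vois format.
    """
    weights: dict[str, float] = {}
    for part in spec.split("+"):
        voice_id, _, weight = part.strip().partition(":")
        if not voice_id or not weight:
            raise ValueError(f"expected 'voice_id:weight', got {part.strip()!r}")
        if voice_id in weights:
            raise ValueError(f"voice {voice_id!r} is listed twice")
        weights[voice_id] = float(weight)
    return weights


blend = parse_blend("af_heart:0.6 + bf_emma:0.4")
print(blend)  # {'af_heart': 0.6, 'bf_emma': 0.4}
```

A plain dictionary keeps the voice IDs and weights together, which makes the blend easy to check or store later.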

Weight Ratios: The Foundation

Think of weights like a recipe. They've gotta add up to 1.0 (or 100%) total.

A 50/50 blend is perfectly balanced. Both voices contribute equally, so you get a genuinely hybrid result. A 70/30 split means one voice is the star, and the other adds subtle character. If you go really extreme—like 90/10—you're mostly getting one voice with just a whisper of the other.

The sweet spot for most people? Start with 60/40 or 70/30. One voice as your foundation, another to add what's missing.
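
The "weights add up to 1.0" rule is easy to check before you generate anything. The sketch below is one way to do it, assuming a blend is stored as a dictionary of voice IDs and weights. The names validate_blend and normalize are illustrative, not Vois functions.

```python
import math


def validate_blend(weights: dict[str, float]) -> None:
    """Raise ValueError if the blend breaks the basic rules from this post."""
    if not 2 <= len(weights) <= 4:
        raise ValueError("a blend uses two to four voices")
    if any(w <= 0 for w in weights.values()):
        raise ValueError("every weight must be positive")
    total = sum(weights.values())
    # Compare with a tolerance: decimal weights are not always exact as floats.
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError(f"weights sum to {total:.3f}, expected 1.0")


def normalize(weights: dict[str, float]) -> dict[str, float]:
    """Rescale rough proportions (for example 3 parts to 1 part) so they sum to 1.0."""
    total = sum(weights.values())
    return {voice_id: w / total for voice_id, w in weights.items()}


validate_blend({"af_heart": 0.6, "bf_emma": 0.4})  # passes silently
print(normalize({"af_heart": 3, "bf_emma": 1}))  # {'af_heart': 0.75, 'bf_emma': 0.25}
```

Normalizing is handy when you think in "parts" like a recipe and want the tool to work out the decimals.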

Finding Voices That Actually Work Together

Not every combination sounds great. (Sorry, but it's true.) The magic happens when you pick voices that complement each other instead of fighting.

Want warmth with authority? Pair a genuinely approachable voice with one that sounds more commanding. Think about it: you're not trying to make them sound the same. You're trying to combine their best qualities. Warmth from one, professional edge from the other.

Energy plus clarity works beautifully. One voice bouncy and enthusiastic, the other measured and easy to follow. Together? You get something dynamic but not chaotic.

You can even play with regional flavor. Mix American and British voices to create something that doesn't fit into either category. It's honestly weird (in a good way).

What doesn't work? Blending two very similar voices usually produces a result you can barely tell apart from either original. Voices with drastically different paces can sound unnatural. And conflicting accent patterns? They'll clash.

Your First Blend: A Simple Path

Here's how to actually do this without overthinking it:

Pick a voice you already like as your base—something close to what you want. Then be honest: what's missing? Is it warmer? More energy? A different accent flavor? Now find a second voice that has that thing.

Start at 70/30 with your favorite as the majority. Generate a sample with actual content from your project—not just "hello world," but real material. Listen to it. If you want more of the second voice's character, shift to 60/40. If you want less, go to 80/20.

Move in 10% steps. Seriously, don't overthink it. Your ears know when something sounds right.
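
The 10% stepping for a two-voice blend can be written down as a tiny helper. This is a sketch under stated assumptions: shift_blend is a hypothetical name, and the clamp that keeps each voice at 10% or more is a choice made here, not a Vois rule.

```python
def shift_blend(base_weight: float, toward_second: bool, step: float = 0.10) -> tuple[float, float]:
    """Return (base, second) weights after moving one step in a two-voice blend.

    toward_second=True gives the second voice more weight (70/30 -> 60/40);
    toward_second=False gives it less (70/30 -> 80/20).
    """
    delta = step if toward_second else -step
    new_base = round(base_weight - delta, 2)
    # Assumption: keep both voices at one step or more so neither disappears.
    new_base = min(max(new_base, step), round(1.0 - step, 2))
    return new_base, round(1.0 - new_base, 2)


print(shift_blend(0.7, toward_second=True))   # (0.6, 0.4)
print(shift_blend(0.7, toward_second=False))  # (0.8, 0.2)
```

Rounding to two decimals keeps the weights readable and avoids floating-point leftovers like 0.19999999999999996.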

Blends You Can Start With Today

These combinations work beautifully right now:

If you want a voice that's warm but still professional, try af_heart:0.6 + af_alloy:0.4. The first voice is gentle and approachable. The second adds polish. They're both American, so there's cohesion.

Want British with American friendliness? bf_emma:0.65 + af_nova:0.35 does exactly that. You get the precision of British speech patterns with slightly more warmth sneaking in.

For documentary work, bm_daniel:0.5 + am_adam:0.5 is a powerhouse 50/50. Daniel brings authority. Adam brings tutorial warmth. Together? Credible but accessible.

Three voices for something really distinctive? af_sky:0.4 + bf_lily:0.35 + af_heart:0.25. Sky contributes one flavor, Lily adds a different energy, Heart softens the whole thing. It's textured.

When You Want Multiple Characters

Here's something cool: you can use blending to create distinct characters for dialogue-heavy projects.

Character A might be am_adam:0.7 + bm_george:0.3. Character B could be bf_emma:0.6 + af_nova:0.4. They're different enough that listeners know they're different people, but they were built from the same 54-voice library. Consistency with variety.

The more voices you blend, the more control you get. But—and this is important—don't go overboard. Keep one voice as your anchor (40%+). Use the others for accent notes (15-30% each). Too many voices at equal weight sounds like someone's trying too hard.
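
Those anchor and accent ranges can be turned into a quick sanity check. The sketch below reads them as rules of thumb for blends of three or more voices, which is an interpretation of the advice above rather than a limit Vois enforces. The name blend_warnings is hypothetical.

```python
def blend_warnings(weights: dict[str, float]) -> list[str]:
    """Return rule-of-thumb warnings for blends of three or more voices."""
    if len(weights) < 3:
        return []  # assumption: the anchor/accent advice targets 3-4 voice blends
    warnings: list[str] = []
    anchor_id = max(weights, key=lambda voice_id: weights[voice_id])
    if weights[anchor_id] < 0.40:
        warnings.append(f"no anchor: largest weight is {anchor_id} at {weights[anchor_id]:.2f}")
    for voice_id, weight in weights.items():
        if voice_id != anchor_id and not 0.15 <= weight <= 0.30:
            warnings.append(f"{voice_id} at {weight:.2f} is outside the 0.15-0.30 accent range")
    return warnings


print(blend_warnings({"af_sky": 0.4, "bf_lily": 0.35, "af_heart": 0.25}))
# ['bf_lily at 0.35 is outside the 0.15-0.30 accent range']
```

The three-voice starter blend from earlier in this post puts bf_lily at 0.35, just above the accent range, so treat a warning as a nudge to listen again, not as an error.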

The Real Secret: Testing

This is where most people get it wrong. They create a blend and assume it's done. Wrong move.

Generate samples with your actual content. Not a sentence. A paragraph. Multiple paragraphs. Listen to how the voice handles rhythm, emotion, pacing. Does it work for dialogue? For narration? For technical content?

Compare the blend against the individual source voices. You might discover you actually prefer one of the originals. That's fine—at least you know.

Different content types sound different through the same voice. A podcast conversation needs different qualities than an audiobook. Test both if your project needs both.

The best blend is the one where listeners forget they're listening to an AI. They're just absorbed in the content.

Save It So You Can Use It Again

Once you nail a blend you love, save it as a preset. Write down the exact voice IDs and weights. Name it something you'll remember—"Documentary Host" or "Podcast Warmth" or "Character A: The Mentor."

Presets matter because consistency matters. When you come back to a project six months later and need to generate more audio in the same voice, you want it to sound exactly the same. A preset makes that instant. No guessing, no tweaking. Just perfect consistency.
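
If you also want your own record of a blend outside the app, a small JSON file does the job. This is a sketch only: the file layout and the save_preset and load_preset names are assumptions for illustration, not the format Vois uses for its presets.

```python
import json
import tempfile
from pathlib import Path


def save_preset(path: Path, name: str, weights: dict[str, float]) -> None:
    """Add or overwrite one named blend in a JSON presets file."""
    presets = json.loads(path.read_text()) if path.exists() else {}
    presets[name] = weights
    path.write_text(json.dumps(presets, indent=2))


def load_preset(path: Path, name: str) -> dict[str, float]:
    """Read one named blend back, exactly as it was saved."""
    return json.loads(path.read_text())[name]


# Demo in a temporary folder so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    preset_file = Path(tmp) / "presets.json"
    save_preset(preset_file, "Documentary Host", {"bm_daniel": 0.5, "am_adam": 0.5})
    restored = load_preset(preset_file, "Documentary Host")

print(restored)  # {'bm_daniel': 0.5, 'am_adam': 0.5}
```

Storing the exact voice IDs and weights is what makes the voice reproducible months later.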

Blending vs. Switching: What's the Difference?

Blending creates one unified voice: the source voices are mixed into a single voice that speaks every line. Voice switching is different. You're literally alternating between distinct voices for different speakers. One person sounds like Voice A, another like Voice B. Completely different use case.

Blending is for when you want one unique voice. Switching is for when you want multiple distinct speakers. Use each for what it's actually designed for.
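
The two ideas also combine: each speaker can have their own blend, and the script switches between them line by line. Here's a rough sketch of that as data, using the character blends from earlier in this post. The CAST and SCRIPT structures, the sample dialogue, and the plan_lines function are invented for illustration and don't reflect how Vois stores projects.

```python
# Blending: each character is one unified voice built from a weighted mix.
CAST = {
    "Character A": {"am_adam": 0.7, "bm_george": 0.3},
    "Character B": {"bf_emma": 0.6, "af_nova": 0.4},
}

# Switching: the script alternates between speakers, line by line.
SCRIPT = [
    ("Character A", "Did you hear that?"),
    ("Character B", "It came from upstairs."),
]


def format_blend(weights: dict[str, float]) -> str:
    """Write a blend in the 'voice:weight + voice:weight' notation."""
    return " + ".join(f"{voice_id}:{w}" for voice_id, w in weights.items())


def plan_lines(script, cast):
    """Pair every line of dialogue with the blend of whoever speaks it."""
    return [(format_blend(cast[speaker]), text) for speaker, text in script]


for voice, text in plan_lines(SCRIPT, CAST):
    print(f"[{voice}] {text}")
```

Keeping the cast separate from the script means you can retune one character's blend without touching the dialogue.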

Where To Go From Here

Start with two voices. Try a 60/40 or 70/30 split. Listen. Adjust. You've got 54 voices to work with, and the combinations? Theoretically endless.
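
To put a rough number on that, here's a quick count of how many different sets of voices you could pick from a 54-voice library, before weights even come into it. The 10% step count for two-voice blends assumes each voice keeps 10% or more, which is a simplification made for this example.

```python
import math

LIBRARY_SIZE = 54  # library size mentioned in this post

# Number of distinct voice sets for blends of 2, 3, and 4 voices.
voice_sets = {k: math.comb(LIBRARY_SIZE, k) for k in (2, 3, 4)}
print(voice_sets)  # {2: 1431, 3: 24804, 4: 316251}

# Two-voice blends in 10% steps: 10/90, 20/80, ... 90/10 gives 9 splits per pair.
two_voice_splits = 9
two_voice_blends = voice_sets[2] * two_voice_splits
print(two_voice_blends)  # 12879
```

Weights can be set more finely than 10%, so the real space is far larger. That's where "theoretically endless" comes from.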

Every blend you create is something that didn't exist before. That's the magic of it.

Frequently Asked Questions

What is voice blending?

Voice blending combines multiple AI voices into a single output. You assign each voice a weight (its share of the total), and the system generates speech that merges their characteristics.

How many voices can I blend together?

Vois supports blending two to four voices at once. Each voice receives a weight value, and the weights should sum to 1.0 (100%).

Tags: Voice Blending, Tutorials, Tips

Written by

Vois Team

Product Team

The team behind Vois, building the future of AI voice production.