Hume AI launches customizable synthetic voices with Voice Control




Hume AI, the startup specializing in emotionally intelligent voice interfaces, has launched Voice Control, an experimental feature that lets developers and users create custom AI voices through precise modulation of vocal characteristics, with no coding, prompt engineering, or sound design skills required.

This release builds on the foundation laid by the company’s earlier Empathic Voice Interface 2 (EVI 2), which brought advances in naturalness, emotional responsiveness and customization.

Both EVI 2 and Voice Control avoid the risks of voice cloning, a practice that Hume co-founder Alan Cowen says poses ethical and practical challenges.

Hume instead focuses on providing tools to create unique and expressive voices aligned with user needs, such as customer service chatbots, digital assistants, tutors, guides or accessibility features.

Going beyond preset AI voices to tailored custom solutions

Voice Control lets developers adjust voices along 10 distinct dimensions:

Masculine/Feminine: The perceived gender of the voice, ranging from more masculine to more feminine.

Assertiveness: The firmness of the voice, ranging from timid to bold.

Buoyancy: The density of the voice, ranging from deflated to buoyant.

Confidence: The assuredness of the voice, ranging from shy to confident.

Enthusiasm: The excitement within the voice, ranging from calm to enthusiastic.

Nasality: The openness of the voice, ranging from clear to nasal.

Relaxedness: The stress within the voice, ranging from tense to relaxed.

Smoothness: The texture of the voice, ranging from smooth to staccato.

Tepidity: The liveliness of the voice, ranging from tepid to vigorous.

Tightness: The containment of the voice, ranging from tight to breathy.

This no-code tool lets users fine-tune voice attributes in real time via on-screen virtual sliders. It is currently available in Hume’s virtual playground, which requires a free account to access.

The release addresses a critical pain point in the AI industry: reliance on preset voices, which often fail to meet the specific needs of brands or applications, or exposure to the risks associated with voice cloning.

This focus on personalization aligns with Hume’s broader goal of developing vocal AI rich in emotional nuance.

The company’s efforts to advance voice AI were highlighted in September 2024 with the launch of EVI 2, which the company described as a significant upgrade over its predecessor.

EVI 2 cut latency by 40 percent, reduced costs by 30 percent, and expanded voice modulation capabilities, giving developers a safer alternative to voice cloning.

Sliders > text prompts

Hume’s research-driven approach plays a central role in its product development. The company, co-founded by former Google DeepMind researcher Alan Cowen, uses a proprietary model built on cross-cultural voice recordings paired with emotional survey data.

This methodology, rooted in the science of emotions, forms the backbone of both EVI 2 and the new Voice Control.

Voice Control extends these principles by addressing the granular, often ineffable ways in which humans perceive voices.

The tool’s slider-based interface reflects common perceptual qualities of the voice, such as buoyancy or assertiveness, without trying to reduce these attributes to text-based prompts.

Voice Control is available immediately in beta and integrates with Hume’s Empathic Voice Interface (EVI), making it accessible for a wide range of applications.

Developers can select a base voice, adjust its characteristics, and preview the results in real time. This process ensures reproducibility and stability across sessions, key requirements for real-time applications such as customer service bots or virtual assistants.
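To make the idea of a base voice plus slider adjustments concrete, here is a minimal Python sketch. It is not Hume’s API: the `VoiceSpec` class, the `base_voice_a` identifier, and the -100 to 100 slider range are all assumptions made for illustration. It shows one plausible way a set of slider values could be clamped, saved, and restored so that the same voice settings come back in a later session.

```python
import json
from dataclasses import dataclass, field

# Hypothetical slider range; the article does not state the real one.
SLIDER_MIN, SLIDER_MAX = -100, 100


@dataclass(frozen=True)
class VoiceSpec:
    """A base voice plus per-dimension slider values (all names hypothetical)."""
    base_voice: str
    sliders: dict = field(default_factory=dict)

    def adjusted(self, **changes: int) -> "VoiceSpec":
        """Return a new spec with the given sliders set, clamped to the range."""
        merged = dict(self.sliders)
        for name, value in changes.items():
            merged[name] = max(SLIDER_MIN, min(SLIDER_MAX, value))
        return VoiceSpec(self.base_voice, merged)

    def to_json(self) -> str:
        """Serialize with sorted keys so equal specs produce identical text."""
        return json.dumps(
            {"base_voice": self.base_voice, "sliders": self.sliders},
            sort_keys=True,
        )

    @staticmethod
    def from_json(text: str) -> "VoiceSpec":
        """Rebuild a spec from text produced by to_json."""
        data = json.loads(text)
        return VoiceSpec(data["base_voice"], data["sliders"])


# Pick a base voice, move three sliders, then save and reload the settings.
spec = VoiceSpec("base_voice_a").adjusted(assertiveness=40, buoyancy=-20, nasality=250)
saved = spec.to_json()
restored = VoiceSpec.from_json(saved)
```

In this sketch the out-of-range `nasality` value is clamped to the assumed maximum, and the restored spec compares equal to the original, which is the kind of session-to-session stability the article describes.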

The influence of EVI 2 is evident in Voice Control’s capabilities. The earlier model introduced features such as in-conversation prompts and multilingual capabilities, which broadened the scope of voice AI applications.

For example, EVI 2 supports sub-second response times, allowing for natural, instant conversations. It also allows dynamic adjustments to conversational style during interactions, making it a versatile tool for businesses.

Differentiation in a competitive market

Hume’s focus on voice personalization and emotional intelligence positions it as a strong competitor in the voice AI space, even against well-funded rivals such as OpenAI, with its Advanced Voice Mode, and ElevenLabs, both of which offer libraries of preset voices.

Hume continues to develop its approach to voice AI. Plans for expanding Voice Control include introducing additional modifiable dimensions, refining voice quality under extreme modifications, and widening the range of base voices available.

With the launch of Voice Control, Hume strengthens its position as a leader in voice AI innovation, offering tools that prioritize personalization, emotional intelligence and real-time adaptability. Developers can access Voice Control today through the Hume platform, marking another step forward in the evolution of AI-powered voice solutions.


