Voice Integration

Voice integration components for building voice-enabled RecoAgent applications.

Overview

The voice integration system provides speech-to-text, text-to-speech, and voice command processing.

Core Features

  • Speech-to-Text: Transcribe spoken audio to text
  • Text-to-Speech: Synthesize spoken audio from text
  • Voice Commands: Recognize commands in audio and extract their parameters
  • Real-time Processing: Process voice input in real time
  • Multi-language Support: Select the language per request or in the provider configuration

Usage Examples

Basic Voice Integration

from recoagent.voice.integration import VoiceIntegration

# Create voice integration
voice_integration = VoiceIntegration(
    stt_provider="openai",
    tts_provider="elevenlabs"
)

# Process voice input
text_result = voice_integration.speech_to_text(
    audio_file="voice_input.wav",
    language="en-US"
)

print(f"Transcribed text: {text_result.text}")
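
The example above passes a region-qualified tag ("en-US"), while the advanced configuration later on this page uses a bare code ("en"). Whether RecoAgent normalizes between the two forms is not stated here, so the following is only a minimal sketch of a helper that reduces a tag to its primary subtag before it is handed to a provider. The name primary_language is hypothetical and not part of the RecoAgent API.

```python
def primary_language(tag: str) -> str:
    """Return the lowercase primary subtag of a language tag ('en-US' -> 'en').

    Hypothetical helper for illustration; it is not part of RecoAgent.
    """
    # Accept both "en-US" and the underscore form "en_US".
    cleaned = tag.strip().replace("_", "-")
    if not cleaned:
        raise ValueError("language tag must not be empty")
    # Everything before the first hyphen is the primary language subtag.
    return cleaned.split("-", 1)[0].lower()


print(primary_language("en-US"))  # en
print(primary_language("pt_BR"))  # pt
print(primary_language("en"))     # en
```

A tag that is already bare passes through unchanged, so the helper can be applied unconditionally.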

Advanced Voice Processing

from recoagent.voice.integration import AdvancedVoiceIntegration

# Create advanced voice integration
advanced_voice = AdvancedVoiceIntegration(
    stt_config={
        "provider": "openai",
        "model": "whisper-1",
        "language": "en"
    },
    tts_config={
        "provider": "elevenlabs",
        "voice": "alloy",
        "speed": 1.0
    }
)

# Process voice command
command_result = advanced_voice.process_voice_command(
    audio_file="command.wav",
    command_types=["search", "navigate", "execute"]
)

# Only read the command fields when a command was recognized
if command_result.recognized:
    print(f"Command: {command_result.command}")
    print(f"Parameters: {command_result.parameters}")
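
Once a command has been recognized, an application usually needs to route it to the code that handles it. The sketch below shows one possible dispatch pattern. CommandResult here is a hypothetical stand-in that mirrors only the three attributes used above (recognized, command, parameters), and the parameter keys "query" and "target" are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Optional


@dataclass
class CommandResult:
    """Hypothetical stand-in mirroring the attributes read in the example above."""
    recognized: bool
    command: Optional[str] = None
    parameters: Dict[str, Any] = field(default_factory=dict)


Handler = Callable[[Dict[str, Any]], str]


def dispatch_command(result: CommandResult, handlers: Dict[str, Handler]) -> str:
    """Route a recognized command to its handler, with explicit fallbacks."""
    if not result.recognized or result.command is None:
        return "Sorry, I did not catch that."
    handler = handlers.get(result.command)
    if handler is None:
        return f"Unsupported command: {result.command}"
    return handler(result.parameters)


# Illustrative handlers; the parameter keys are assumed, not documented.
handlers: Dict[str, Handler] = {
    "search": lambda p: f"Searching for {p.get('query', '')}",
    "navigate": lambda p: f"Navigating to {p.get('target', '')}",
}

demo = CommandResult(recognized=True, command="search",
                     parameters={"query": "running shoes"})
print(dispatch_command(demo, handlers))  # Searching for running shoes
```

Keeping the handlers in a dictionary means a new command type can be supported by adding one entry, and unrecognized audio or an unmapped command produces a clear message instead of an exception.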

API Reference

VoiceIntegration Methods

speech_to_text(audio_file: str, language: str = "en-US") -> STTResult

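Transcribe an audio file to text.

Parameters:

  • audio_file (str): Path to the audio file to transcribe
  • language (str): Language code of the audio (default: "en-US")

Returns: An STTResult; the transcription is available in its text attribute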

text_to_speech(text: str, voice: str = "default") -> TTSResult

Synthesize speech from text.

Parameters:

  • text (str): Text to convert to speech
  • voice (str): Name of the voice to use (default: "default")

Returns: A TTSResult for the synthesized speech
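
The two methods above can be combined into a simple listen-and-reply loop. The following is a minimal sketch that relies only on the documented signatures and on the text attribute of the speech-to-text result. VoiceBackend, voice_round_trip, and EchoBackend are hypothetical names, and EchoBackend is a stand-in that performs no real audio processing so the sketch runs without the library, audio files, or network access. The attributes on its fake text-to-speech result are invented for the demo and say nothing about the real TTSResult.

```python
from types import SimpleNamespace
from typing import Any, Callable, Protocol


class VoiceBackend(Protocol):
    """Structural type covering the two documented methods."""

    def speech_to_text(self, audio_file: str, language: str = "en-US") -> Any: ...

    def text_to_speech(self, text: str, voice: str = "default") -> Any: ...


def voice_round_trip(
    backend: VoiceBackend,
    audio_file: str,
    respond: Callable[[str], str],
    language: str = "en-US",
    voice: str = "default",
) -> Any:
    """Transcribe the audio, build a reply from the text, and synthesize it."""
    stt_result = backend.speech_to_text(audio_file=audio_file, language=language)
    reply_text = respond(stt_result.text)
    return backend.text_to_speech(text=reply_text, voice=voice)


class EchoBackend:
    """Stand-in backend for a dry run; a real VoiceIntegration would go here."""

    def speech_to_text(self, audio_file: str, language: str = "en-US") -> Any:
        return SimpleNamespace(text=f"transcript of {audio_file}")

    def text_to_speech(self, text: str, voice: str = "default") -> Any:
        # Invented attributes, used only so the demo has something to print.
        return SimpleNamespace(text=text, voice=voice)


reply = voice_round_trip(EchoBackend(), "voice_input.wav", respond=str.upper)
print(reply.text)  # TRANSCRIPT OF VOICE_INPUT.WAV
```

Because the helper depends only on the two method signatures, a VoiceIntegration instance should be usable in place of EchoBackend, provided its methods accept the keyword arguments listed in this reference.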

See Also