---
title: LFM2-Audio Real-time Speech-to-Speech
emoji: 🎙️
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
license: other
---
# LFM2-Audio Real-time Speech-to-Speech Chat
Real-time WebRTC streaming demo of LFM2-Audio-1.5B, Liquid AI's first end-to-end audio foundation model.
## ✨ Features
- **🔴 Real-time WebRTC streaming** - Audio streams continuously between the browser and the model, with no manual recording step
- **🎙️ Continuous listening** - Natural conversation flow with automatic pause detection
- **💬 Interleaved output** - Simultaneous text and audio generation
- **🔄 Multi-turn memory** - Context-aware conversations
- **⚡ Low latency** - Optimized for real-time interaction
## 🚀 How to Use
1. **Grant microphone access** when prompted by your browser
2. **Start speaking** - The model listens continuously
3. **Pause briefly** - The model detects pauses and responds automatically
4. **Continue conversation** - Build multi-turn dialogues naturally
## 🎛️ Parameters
### Temperature
- **0**: Greedy decoding (most deterministic)
- **1.0**: Default (balanced creativity and coherence)
- **2.0**: Maximum creativity (more diverse outputs)
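To illustrate how these settings change the output distribution, here is a minimal sketch of temperature scaling in plain Python. It is not the demo's actual sampling code; the function name `softmax_with_temperature` and the example logits are hypothetical, chosen only for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature == 0:
        # Greedy decoding: all probability mass goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

As the temperature rises, the probabilities move closer together, so less likely tokens are sampled more often.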
### Top-k
- **0**: No filtering (full vocabulary)
- **4**: Default (conservative, high quality)
- **Higher values**: More diverse but potentially less coherent
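Top-k filtering can be sketched in a few lines of plain Python. This is an illustrative stand-in, not the demo's implementation; `top_k_filter`, `sample`, and the example probabilities are hypothetical names and values.

```python
import random

def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize; k == 0 disables filtering."""
    if k <= 0 or k >= len(probs):
        return list(probs)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:k])
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]

def sample(probs, rng):
    """Draw one token index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

probs = [0.4, 0.25, 0.15, 0.1, 0.06, 0.04]  # hypothetical token probabilities
filtered = top_k_filter(probs, 4)           # the default top-k listed above
token = sample(filtered, random.Random(0))
print(filtered, token)
```

With `k=4`, the two least likely tokens can never be sampled, which is why small values give more consistent output.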
## 🏗️ Technical Details
- **Model**: LFM2-Audio-1.5B
- **Generation Mode**: Interleaved (optimized for real-time)
- **Audio Codec**: Mimi (24kHz)
- **Streaming**: WebRTC via fastrtc
- **Backend**: PyTorch with CUDA acceleration
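To make the 24 kHz figure concrete, the sketch below computes how many samples fall into one chunk of audio and splits a buffer accordingly. The 80 ms chunk duration and the helper names are assumptions made for illustration; the demo's real chunk size is not stated here.

```python
SAMPLE_RATE_HZ = 24_000  # sample rate of the Mimi codec, per the list above
CHUNK_MS = 80            # hypothetical chunk duration, chosen for illustration

def samples_per_chunk(sample_rate_hz, chunk_ms):
    """Number of PCM samples in one chunk of the given duration."""
    return sample_rate_hz * chunk_ms // 1000

def split_into_chunks(samples, chunk_size):
    """Yield consecutive fixed-size chunks; the last one may be shorter."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]

chunk_size = samples_per_chunk(SAMPLE_RATE_HZ, CHUNK_MS)
one_second = [0] * SAMPLE_RATE_HZ  # one second of silent mono PCM
chunks = list(split_into_chunks(one_second, chunk_size))
print(chunk_size, len(chunks), len(chunks[-1]))
```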
## 🔧 Differences from Standard Demo
This demo uses **fastrtc** for WebRTC streaming, enabling:
- Continuous audio streaming without manual recording
- Automatic voice activity detection (VAD)
- Lower latency through chunked processing
- More natural conversation flow
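The idea behind pause-triggered replies can be sketched with a simple energy threshold. This is a deliberately simplified stand-in and not fastrtc's VAD, which is not shown here; the function names, the threshold of 500, and the five-frame silence window are all hypothetical.

```python
import math

def rms(frame):
    """Root-mean-square amplitude of one frame of PCM samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0

def find_pause(frames, energy_threshold=500.0, min_silent_frames=5):
    """Return the index of the frame where a pause is confirmed, or None.

    A pause is confirmed once speech has been heard and is then followed by
    `min_silent_frames` consecutive frames below `energy_threshold`.
    """
    heard_speech = False
    silent_run = 0
    for index, frame in enumerate(frames):
        if rms(frame) >= energy_threshold:
            heard_speech = True
            silent_run = 0  # speech resumed, so restart the silence count
        elif heard_speech:
            silent_run += 1
            if silent_run >= min_silent_frames:
                return index
    return None

loud = [1000, -1000] * 240   # synthetic "speech" frame
quiet = [10, -10] * 240      # synthetic "silence" frame
frames = [quiet] * 3 + [loud] * 10 + [quiet] * 8
pause_at = find_pause(frames)
print(pause_at)
```

Leading silence is ignored because no speech has been heard yet; only silence that follows speech counts toward a pause, which is the point at which a reply would be triggered.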
## 📚 Resources
- [Liquid AI Website](https://www.liquid.ai/)
- [GitHub Repository](https://github.com/Liquid4All/liquid-audio/)
- [Model on Hugging Face](https://huggingface.co/LiquidAI/LFM2-Audio-1.5B)
- [fastrtc Documentation](https://github.com/freddyaboulton/fastrtc)
## 📝 License
Licensed under the LFM Open License v1.0
## 💡 Tips
- Speak clearly and pause briefly between thoughts
- Use a good quality microphone for best results
- Lower the temperature for more predictable responses; raise it for more varied ones
- Lower top-k values produce more consistent responses
- GPU acceleration is recommended for real-time performance