A powerful 3B-parameter, LLM-based reinforcement learning audio editing model that excels at editing emotion, speaking style, and paralinguistics, and features robust zero-shot text-to-speech
Updated Apr 9, 2026 - Python
Official source code for the paper: EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing.
MLX implementation of IndexTTS (v1.5 & v2.0) — high-quality text-to-speech with voice cloning and emotion control, optimized for Apple Silicon
🎙️ Beautiful FastAPI WebUI for IndexTTS2 on Apple Silicon — dark theme, 8 emotions, custom reference audio, one-click NPZ speaker export
Natural-language interface surpassing GPT limits: emotion control, parallel commands, and quantified inputs such as "remove 30% emotion".
Run IndexTTS2 on Apple Silicon with this FastAPI-powered web interface featuring emotion controls and custom audio support.