This directory contains the audio datasets for training custom RVC models.
Each subdirectory corresponds to a specific voice type:
- male_low/: Bass/Baritone male voices
- male_mid/: Tenor/Mid-range male voices
- female_low/: Alto/Contralto female voices
- female_high/: Soprano/High-range female voices
- anime_airy/: Breathy/Airy anime-style voices
- accent_non_native/: Voices with distinct non-native accents
- singing_male/: Male singing vocals
- singing_female/: Female singing vocals
- child/: Child voices
- elderly/: Elderly voices

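To see at a glance which categories already hold data, the layout above can be inspected with a short script. This is a minimal sketch, not part of the repository: the function name `count_wavs` is hypothetical, and it assumes the category folders live under a `datasets/` root and that processed clips carry a `.wav` extension.

```python
from pathlib import Path

# Category folder names, copied from the list in this README.
CATEGORIES = [
    "male_low", "male_mid", "female_low", "female_high", "anime_airy",
    "accent_non_native", "singing_male", "singing_female", "child", "elderly",
]


def count_wavs(root):
    """Map each category to its number of .wav files, or None if the folder is missing.

    Hypothetical helper: it only reads the directory tree and changes nothing.
    """
    root = Path(root)
    counts = {}
    for name in CATEGORIES:
        folder = root / name
        if not folder.is_dir():
            counts[name] = None
            continue
        counts[name] = sum(
            1 for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() == ".wav"
        )
    return counts


if __name__ == "__main__":
    # "datasets" is an assumed root; adjust to wherever this directory lives.
    for name, n in count_wavs("datasets").items():
        status = "missing" if n is None else f"{n} wav file(s)"
        print(f"{name:20s} {status}")
```

A folder reported as missing or empty has no training data yet.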
- Collect Audio: Gather 10-15 minutes of clean, single-speaker audio for the desired category.
- Place Files: Put the raw audio files (mp3, wav, etc.) into a temporary folder or directly here.
- Process: Use the provided tool, tools/audio_preprocessor.py, to normalize and split the audio.
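When a category has several raw recordings, the Process step can be repeated per file. The sketch below builds one preprocessor command per input file, using only the `-i` and `-o` flags that appear in this README's example command. The helper names, the `raw_audio/` folder, and the extension list are assumptions for illustration, and commands are only printed unless `dry_run` is turned off.

```python
import subprocess
import sys
from pathlib import Path

# Assumed set of raw-audio extensions; extend as needed.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a"}


def build_commands(raw_dir, category, tool="tools/audio_preprocessor.py"):
    """Return one command (as an argument list) per audio file in raw_dir."""
    out_dir = f"datasets/{category}"
    files = sorted(
        p for p in Path(raw_dir).iterdir()
        if p.is_file() and p.suffix.lower() in AUDIO_EXTENSIONS
    )
    return [[sys.executable, tool, "-i", str(p), "-o", out_dir] for p in files]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    raw = Path("raw_audio")  # assumed location of the raw recordings
    if raw.is_dir():
        run_all(build_commands(raw, "male_low"), dry_run=True)
```

Keeping the dry run as the default makes it easy to review the generated commands before any audio is written.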
# Example: Processing a raw file into the male_low dataset
python tools/audio_preprocessor.py -i raw_audio/my_voice.mp3 -o datasets/male_low

- Format: WAV (will be converted automatically)
- Sample Rate: 40 kHz or 48 kHz (will be converted automatically)
- Channels: Mono (will be converted automatically)
- Quality: No background noise, music, or reverb. Use UVR5 to clean if necessary.
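
The format, sample rate, and channel requirements above can be spot-checked on processed files with Python's standard `wave` module. This is a minimal sketch under the assumption that the processed clips are PCM WAV files; the function name `check_wav` is hypothetical, and the check cannot detect background noise, music, or reverb.

```python
import sys
import wave

# Sample rates listed in this README.
ALLOWED_RATES = {40000, 48000}


def check_wav(path):
    """Return a list of problems found; an empty list means the header looks fine."""
    problems = []
    try:
        with wave.open(str(path), "rb") as wf:
            channels = wf.getnchannels()
            rate = wf.getframerate()
    except (wave.Error, EOFError):
        # Raised for files that are not readable PCM WAV data.
        return ["not a readable PCM WAV file"]
    if channels != 1:
        problems.append(f"expected mono, found {channels} channels")
    if rate not in ALLOWED_RATES:
        problems.append(f"unexpected sample rate: {rate} Hz")
    return problems


if __name__ == "__main__":
    # Usage: python check_wav.py datasets/male_low/*.wav
    for name in sys.argv[1:]:
        issues = check_wav(name)
        print(name, "OK" if not issues else "; ".join(issues))
```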