🎙️ VyvoTTS Dataset Tokenizer
Process audio datasets for VyvoTTS training by tokenizing both audio and text.
Instructions:
- Enter your HuggingFace token (required for downloading and uploading datasets)
- Provide the original dataset path from HuggingFace Hub
- Specify the output dataset path where processed data will be uploaded
- Select the model type (Qwen3 or LFM2)
- Specify the text field name in your dataset
- Click "Process Dataset" to start
Note: This process requires a GPU and may take several minutes depending on dataset size.
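The form fields listed in the instructions can be checked before a GPU job starts. The sketch below is only an illustration, not the app's actual code: `TokenizerJob`, `validate_job`, and the simplified `namespace/name` pattern are hypothetical names and assumptions introduced here, and the Hub's real naming rules are stricter than this regular expression.

```python
import re
from dataclasses import dataclass

# Hypothetical names throughout; this is not the app's actual code.
SUPPORTED_MODEL_TYPES = ("qwen3", "lfm2")

# Simplified "namespace/name" check (assumption); the Hub's real rules are stricter.
REPO_ID_PATTERN = re.compile(r"^[\w.-]+/[\w.-]+$")


@dataclass
class TokenizerJob:
    hf_token: str
    original_dataset: str
    output_dataset: str
    model_type: str
    text_field: str = "text"


def validate_job(job: TokenizerJob) -> list[str]:
    """Return a list of readable problems; an empty list means the form looks usable."""
    problems = []
    if not job.hf_token.strip():
        problems.append("HuggingFace token is empty")
    for label, repo_id in (
        ("original", job.original_dataset),
        ("output", job.output_dataset),
    ):
        if not REPO_ID_PATTERN.match(repo_id):
            problems.append(f"{label} dataset must look like 'namespace/name'")
    if job.model_type not in SUPPORTED_MODEL_TYPES:
        problems.append(f"model type must be one of {SUPPORTED_MODEL_TYPES}")
    if not job.text_field.strip():
        problems.append("text field name is empty")
    return problems


if __name__ == "__main__":
    job = TokenizerJob(
        hf_token="hf_xxx",  # placeholder, not a real token
        original_dataset="MrDragonFox/Elise",
        output_dataset="username/elise-qwen3-processed",
        model_type="qwen3",
    )
    print(validate_job(job))
```

Returning a list of problems rather than raising on the first one lets a UI show every issue at once. The check covers only the shape of the inputs; it does not confirm that the dataset exists or that the token is accepted by the Hub.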
📝 Example Values:
For Qwen3:
- Original Dataset: MrDragonFox/Elise
- Output Dataset: username/elise-qwen3-processed
- Model Type: qwen3
- Text Field: text
For LFM2:
- Original Dataset: MrDragonFox/Elise
- Output Dataset: username/elise-lfm2-processed
- Model Type: lfm2
- Text Field: text
⚠️ Requirements:
- GPU with CUDA support
- HuggingFace account with write access to the output dataset's namespace
- Valid HuggingFace token
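When running the processing on your own machine, a rough preflight check along these lines may catch a missing requirement early. It is a sketch under stated assumptions: `preflight` is a hypothetical helper, the presence of `nvidia-smi` on the PATH is only a proxy for a working CUDA setup, and reading the token from the `HF_TOKEN` environment variable is an assumption (the app itself asks for the token in its form). It cannot confirm that the token is valid or has write access.

```python
import importlib.util
import os
import shutil
from typing import Mapping, Optional


def preflight(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper: report which requirements appear to be met."""
    if env is None:
        env = os.environ
    return {
        # Proxy only: the NVIDIA driver tools are on PATH (does not prove CUDA works).
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
        # PyTorch is importable; its CUDA support is not tested here.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        # Assumption: the token is supplied via the HF_TOKEN environment variable.
        "hf_token_set": bool(env.get("HF_TOKEN", "").strip()),
    }


if __name__ == "__main__":
    for name, passed in preflight().items():
        print(f"{name}: {'ok' if passed else 'missing'}")
```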