Here are two Finnish F5-TTS models. Listen to the speech samples for both models.
The first round was trained on the Common Voice and Vox Populi Finnish datasets.
20241206
Epochs: 200
Speakers: Multiple speakers from different corpora
Use these with "f5-tts_infer-gradio":
Model: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_common_voice_fi_vox_populi_fi_20241206.safetensors
Vocab: hf://AsmoKoskinen/F5-TTS_Finnish_Model/vocab.txt
The second round was trained on the Common Voice, LibriVox and Vox Populi Finnish datasets.
20241217
Epochs: 200
Speakers: Multiple speakers from different corpora
Use these with "f5-tts_infer-gradio":
Model: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_commonvoice_fi_librivox_fi_vox_populi_fi_20241217/model_last_20241217.safetensors
Vocab: hf://AsmoKoskinen/F5-TTS_Finnish_Model/model_commonvoice_fi_librivox_fi_vox_populi_fi_20241217/vocab.txt
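If you want to fetch the files yourself instead of pasting the hf:// paths into the Gradio app, the paths above can be split into a repository ID and a file path. The following is a minimal sketch, not part of F5-TTS: the helper names `split_hf_uri` and `download` are hypothetical, and `download` assumes the third-party `huggingface_hub` package is installed.

```python
from typing import Tuple


def split_hf_uri(uri: str) -> Tuple[str, str]:
    """Split 'hf://<user>/<repo>/<path>' into (repo_id, path inside the repo)."""
    prefix = "hf://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an hf:// URI: {uri!r}")
    # The first two components form the repo ID; the remainder is the file path.
    parts = uri[len(prefix):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected hf://<user>/<repo>/<file>, got {uri!r}")
    user, repo, filename = parts
    return f"{user}/{repo}", filename


def download(uri: str) -> str:
    """Fetch the file into the local Hugging Face cache and return its path.

    Hypothetical helper; needs network access and `pip install huggingface_hub`.
    """
    from huggingface_hub import hf_hub_download  # imported lazily on purpose

    repo_id, filename = split_hf_uri(uri)
    return hf_hub_download(repo_id=repo_id, filename=filename)


if __name__ == "__main__":
    vocab_uri = "hf://AsmoKoskinen/F5-TTS_Finnish_Model/vocab.txt"
    print(split_hf_uri(vocab_uri))
    # prints ('AsmoKoskinen/F5-TTS_Finnish_Model', 'vocab.txt')
```

Calling `download(...)` on a model path and on its matching vocab path gives two local files that belong together; do not mix the checkpoint of one round with the vocab of the other.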
Neither model can read numbers written as digits. Convert numbers to words before synthesis.
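One way to prepare input text is to spell out digit runs before passing the text to the model. The sketch below is an illustration only and is not shipped with these models: the function names are hypothetical, it covers 0 to 999999, and it always produces the nominative form, so inflected contexts (for example "12:ssa") still need manual editing.

```python
import re

ONES = ["nolla", "yksi", "kaksi", "kolme", "neljä", "viisi",
        "kuusi", "seitsemän", "kahdeksan", "yhdeksän"]


def _below_thousand(n: int) -> str:
    """Spell 1..999 as one nominative Finnish word."""
    words = ""
    hundreds, rest = divmod(n, 100)
    if hundreds == 1:
        words += "sata"
    elif hundreds > 1:
        words += ONES[hundreds] + "sataa"
    if rest == 10:
        words += "kymmenen"
    elif 10 < rest < 20:
        words += ONES[rest - 10] + "toista"
    else:
        tens, ones = divmod(rest, 10)
        if tens:  # tens is 2..9 here, because 10..19 were handled above
            words += ONES[tens] + "kymmentä"
        if ones:
            words += ONES[ones]
    return words


def finnish_number(n: int) -> str:
    """Spell 0..999999 in nominative Finnish."""
    if not 0 <= n < 1_000_000:
        raise ValueError("only 0..999999 is supported in this sketch")
    if n == 0:
        return ONES[0]
    thousands, rest = divmod(n, 1000)
    parts = []
    if thousands == 1:
        parts.append("tuhat")
    elif thousands > 1:
        parts.append(_below_thousand(thousands) + "tuhatta")
    if rest:
        parts.append(_below_thousand(rest))
    return " ".join(parts)


def digits_to_words(text: str) -> str:
    """Replace each run of ASCII digits with Finnish words; leave larger numbers as-is."""
    def repl(match: re.Match) -> str:
        value = int(match.group())
        return finnish_number(value) if value < 1_000_000 else match.group()
    return re.sub(r"[0-9]+", repl, text)


if __name__ == "__main__":
    print(digits_to_words("Talossa on 12 huonetta."))
    # prints: Talossa on kaksitoista huonetta.
```

Dates, decimals, and ordinals are not handled: "3.5" would become two separate numbers around the period, so such cases are best written out by hand.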
Base model: SWivid/F5-TTS