Fine-Tuned XTTS Model

This project fine-tunes a TTS (text-to-speech) model on an MP3 file extracted from a YouTube video. Training was run in a Hugging Face Space running locally via Docker. A GPU is recommended for faster training.

Training Data

  • Source Video: YouTube Video
  • Training Audio: The MP3 file used for training is included in the files directory.

Docker Image

Fine-tuned with this Docker image: FineTune Xtts Docker image

Notes

  • Ensure you have a GPU available for optimal performance during training.
  • The Docker image pulls the latest version each time it's run.
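Before starting a training run, it can help to confirm that a GPU is actually visible inside the container. The snippet below is a minimal sketch, not part of this project's code: it assumes PyTorch may be installed in the environment, and the helper name `pick_device` is hypothetical. If PyTorch is missing, it falls back to reporting the CPU.

```python
def pick_device() -> str:
    """Return "cuda" if a CUDA GPU is visible to PyTorch, otherwise "cpu".

    Assumption: PyTorch is the framework in use. If it is not installed,
    this helper (a hypothetical name) reports "cpu" instead of failing.
    """
    try:
        import torch  # imported lazily so the check works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    device = pick_device()
    if device == "cpu":
        print("No GPU detected: training will run on CPU and may be slow.")
    else:
        print("GPU detected: training can use CUDA.")
```

If this reports the CPU while the host has an NVIDIA GPU, the container was most likely started without GPU access.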

This model is based on XTTS v2, which cannot be used commercially under the XTTS license. The status of that license is currently in limbo.
