Sindhi-TTS
This model is a fine-tuned version of fahadqazi/Sindhi-TTS; the training dataset is not specified. It achieves the following results on the evaluation set:
- eval_loss: 0.4602
- eval_runtime: 47.8291
- eval_samples_per_second: 36.421
- eval_steps_per_second: 18.211
- epoch: 13.2653
- step: 6500
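The throughput figures above can be cross-checked with a little arithmetic. The sketch below is only an estimate derived from the rounded numbers in this card (plus the eval_batch_size of 2 listed under the training hyperparameters); the evaluation set size and steps per epoch are inferred, not stated by the author.

```python
# Values copied from the evaluation results listed above.
eval_runtime = 47.8291            # seconds
eval_samples_per_second = 36.421
eval_steps_per_second = 18.211
epoch = 13.2653
step = 6500

# Runtime multiplied by throughput gives the approximate totals.
eval_samples = round(eval_runtime * eval_samples_per_second)
eval_steps = round(eval_runtime * eval_steps_per_second)

# Optimizer steps per training epoch, inferred from the step/epoch ratio.
steps_per_epoch = round(step / epoch)

print(f"approx. evaluation samples: {eval_samples}")
print(f"approx. evaluation steps:   {eval_steps}")
print(f"approx. steps per epoch:    {steps_per_epoch}")
```

The two evaluation estimates agree with each other if each evaluation step processes two samples, which matches the listed eval_batch_size.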
How to use
from datasets import load_dataset
from transformers import SpeechT5ForTextToSpeech
from transformers import SpeechT5Processor
from transformers import AutoTokenizer
from transformers import SpeechT5HifiGan
import torch
from IPython.display import Audio as IPythonAudio
device = "cuda" if torch.cuda.is_available() else "cpu"
# Load the speech processor from another repository
processor = SpeechT5Processor.from_pretrained("Sana1207/Hindi_SpeechT5_finetuned")
# Load this model's tokenizer and attach it to the speech processor
tokenizer = AutoTokenizer.from_pretrained("fahadqazi/Sindhi-TTS")
processor.tokenizer = tokenizer
# Load the model and move it to the same device as the inputs and vocoder
model = SpeechT5ForTextToSpeech.from_pretrained("fahadqazi/Sindhi-TTS").to(device)
# Load the vocoder from Microsoft's repository
vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan").to(device)
# Load a speaker embedding (x-vector) that determines the voice
embeddings_dataset = load_dataset("Matthijs/cmu-arctic-xvectors", split="validation")
speaker_embeddings = embeddings_dataset[7306]["xvector"]
speaker_embeddings = torch.tensor(speaker_embeddings).to(device).unsqueeze(0)
# Generate speech
text = "ڪهڙا حال آهن"
inputs = processor(text=text, return_tensors="pt").to(device)
speech = model.generate_speech(inputs["input_ids"], speaker_embeddings, vocoder=vocoder)
# Play the 16 kHz waveform in a notebook
IPythonAudio(speech.cpu().numpy(), rate=16000, autoplay=True)
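Outside a notebook, the generated waveform can be written to disk instead of played inline. The following is a minimal sketch, assuming the `speech` tensor has been converted to a mono float array roughly in the range [-1, 1] and that the 16 kHz rate used above applies. The helper name `save_wav`, the output file name, and the sine-wave stand-in are hypothetical and only keep the example self-contained.

```python
import os
import tempfile
import wave

import numpy as np


def save_wav(waveform, path, sample_rate=16000):
    """Write a mono float waveform to a 16-bit PCM WAV file."""
    samples = np.asarray(waveform, dtype=np.float32)
    # Clip defensively in case a few samples fall outside [-1, 1].
    samples = np.clip(samples, -1.0, 1.0)
    pcm = (samples * 32767.0).astype("<i2")  # little-endian int16
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)       # mono
        wav_file.setsampwidth(2)       # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


# Stand-in for `speech.cpu().numpy()`: one second of a 440 Hz tone.
t = np.arange(16000) / 16000.0
demo_waveform = 0.5 * np.sin(2 * np.pi * 440.0 * t)

out_path = os.path.join(tempfile.gettempdir(), "sindhi_tts_demo.wav")
save_wav(demo_waveform, out_path)
```

With the real model output, the call would be `save_wav(speech.cpu().numpy(), "output.wav")`.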
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- training_steps: 10000
- mixed_precision_training: Native AMP
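To make the schedule concrete, the sketch below works through the effective batch size and the learning rate implied by the values above. It assumes a single training device (the card lists no device count) and the usual shape of a linear schedule with warmup: a linear ramp from 0 to the peak rate, then a linear decay to 0 at the final step. The function name `linear_lr` is hypothetical, and this is not necessarily the exact code the trainer ran.

```python
def linear_lr(step, base_lr=1e-4, warmup_steps=200, total_steps=10000):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Effective batch size: per-device batch x accumulation steps x devices.
num_devices = 1  # assumption: not stated in this card
effective_batch = 16 * 2 * num_devices

for s in (0, 100, 200, 6500, 10000):
    print(f"step {s:>5}: lr = {linear_lr(s):.3e}")
print(f"effective batch size: {effective_batch}")
```

Under these assumptions, the learning rate at step 6500 (the checkpoint reported in the evaluation results) is 1e-4 × 3500 / 9800, roughly 3.6e-5, and the effective batch size matches the listed total_train_batch_size of 32.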
Framework versions
- Transformers 4.46.2
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3