Swahili female voice text-to-speech model
This is an ongoing effort to develop a Swahili text-to-speech model with a female voice.
Please give it a try.
For inference, try the following:
# import all required libraries
from transformers import VitsModel, AutoTokenizer
import torch
import numpy as np
import scipy.io.wavfile
# Load model and tokenizer
model = VitsModel.from_pretrained("mussacharles60/swahili-tts-female-voice")
tokenizer = AutoTokenizer.from_pretrained("mussacharles60/swahili-tts-female-voice")
# Running the TTS
text = "Mambo vipi ?, Hii ni Myssa Tech sauti ya A.I, kujaribishwa na Mussa Charles"
inputs = tokenizer(text, return_tensors="pt")
# Generate waveform
with torch.no_grad():
    output = model(**inputs).waveform
# Convert PyTorch tensor to NumPy array
output_np = output.squeeze().cpu().numpy()
# Write to WAV file
scipy.io.wavfile.write("female_voice_test.wav", rate=model.config.sampling_rate, data=output_np)
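The snippet above writes the waveform as floating-point samples, which some audio players do not handle well. As an optional extra step, here is a minimal sketch of converting the samples to 16-bit PCM before writing. The helper name `to_int16_pcm` is hypothetical and not part of this model or of transformers, and the sketch assumes the waveform values lie roughly within [-1.0, 1.0]. A synthetic tone stands in for `output_np` so the block runs without downloading the model.

```python
import numpy as np


def to_int16_pcm(waveform: np.ndarray) -> np.ndarray:
    """Convert a float waveform in [-1.0, 1.0] to 16-bit PCM samples.

    Hypothetical helper for illustration; not part of the model's API.
    """
    # Clip first so out-of-range samples saturate instead of wrapping around.
    clipped = np.clip(waveform, -1.0, 1.0)
    # Scale to the int16 range and round to the nearest integer.
    return np.round(clipped * 32767.0).astype(np.int16)


# Stand-in for `output_np`: a 0.1 s, 440 Hz tone at an assumed 16 kHz rate.
sample_rate = 16000
t = np.arange(int(0.1 * sample_rate)) / sample_rate
demo = (0.5 * np.sin(2 * np.pi * 440.0 * t)).astype(np.float32)

pcm = to_int16_pcm(demo)
```

With the model output, the same helper could be applied just before saving, for example `scipy.io.wavfile.write("female_voice_test.wav", rate=model.config.sampling_rate, data=to_int16_pcm(output_np))`.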
You're all welcome to contribute.
Thanks 🤗
Base model: facebook/mms-tts