
Model Overview

This model is a fine-tuned version of Microsoft's SpeechT5 text-to-speech model, adapted to handle technical terminology, abbreviations, and domain-specific jargon. It was trained on a custom dataset of 100 text entries focused on terms used in technical interviews and professional communication. The fine-tuning is intended to improve the pronunciation of technical terms, and with it the quality of TTS output in scenarios that require domain expertise.

Model Details

  • Base Model: microsoft/speecht5_tts
  • Language: English (en)
  • License: MIT
  • Dataset: Custom English texts, primarily focused on technical terminology commonly encountered in fields such as computer science, engineering, and software development.

Dataset

  • Text Data: Contains 100 text entries, each including technical terms, abbreviations, and industry-specific vocabulary. Text length varies from short sentences to longer technical descriptions.
  • Audio Data: Corresponding synthesized audio generated for each text entry. Audio is encoded in WAV format, sampled at 16 kHz, and designed for TTS applications.
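
The 16 kHz WAV requirement above can be checked before training or evaluation. The following is a minimal sketch using only Python's standard `wave` module; the helper names `wav_info` and `is_expected_rate` are hypothetical and not part of this model's tooling, and the `wave` module reads PCM-encoded WAV files only.

```python
import wave

EXPECTED_RATE = 16_000  # sampling rate stated in the dataset description


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnchannels(), wav.getnframes() / rate


def is_expected_rate(path, expected=EXPECTED_RATE):
    """True if the file's sampling rate matches the expected rate."""
    return wav_info(path)[0] == expected
```

Files that fail the check would need to be resampled to 16 kHz before being paired with their text entries.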

Usage

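from transformers import SpeechT5Processor, SpeechT5ForTextToSpeech

# transformers has no AutoModelForSpeechT5 class; use the SpeechT5 classes directly.
# If this repository does not ship processor files, load the processor
# from the base model "microsoft/speecht5_tts" instead.
processor = SpeechT5Processor.from_pretrained("Arch10/SpeechT5_finetune_technical_terms")
model = SpeechT5ForTextToSpeech.from_pretrained("Arch10/SpeechT5_finetune_technical_terms")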
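
Loading the model is only the first step. Producing audio also needs a vocoder and a speaker embedding. The sketch below shows one way end-to-end synthesis might look with the SpeechT5 classes in transformers. It rests on several assumptions not stated in this card: that the `microsoft/speecht5_hifigan` vocoder is appropriate for this checkpoint, that the repository ships processor files (otherwise pass `processor_id="microsoft/speecht5_tts"`), and that an all-zero 512-dimensional speaker embedding is acceptable as a placeholder. The helper names are hypothetical.

```python
MODEL_ID = "Arch10/SpeechT5_finetune_technical_terms"
VOCODER_ID = "microsoft/speecht5_hifigan"  # assumption: standard SpeechT5 vocoder


def load_pipeline(model_id=MODEL_ID, processor_id=MODEL_ID, vocoder_id=VOCODER_ID):
    """Download and return (processor, model, vocoder)."""
    # Imported here so the helpers can be defined without loading transformers.
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    processor = SpeechT5Processor.from_pretrained(processor_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(vocoder_id)
    return processor, model, vocoder


def placeholder_speaker_embedding():
    """Return an all-zero 512-dim x-vector; replace with a real speaker embedding."""
    import torch

    return torch.zeros((1, 512))


def synthesize(text, processor, model, vocoder, speaker_embeddings):
    """Convert text to a 1-D waveform (SpeechT5 produces 16 kHz audio)."""
    inputs = processor(text=text, return_tensors="pt")
    return model.generate_speech(
        inputs["input_ids"], speaker_embeddings, vocoder=vocoder
    )
```

With the dependencies installed, `processor, model, vocoder = load_pipeline()` followed by `speech = synthesize("The API returns JSON over HTTPS.", processor, model, vocoder, placeholder_speaker_embedding())` should yield a waveform that can be written to disk, for example with `soundfile.write("out.wav", speech.numpy(), samplerate=16000)`. A real x-vector for the target voice will generally sound better than the zero placeholder.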