Whisper Small Fine-tuned with Uyghur Common Voice

This model is a fine-tuned version of openai/whisper-small on the Uyghur Common Voice dataset.
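
A minimal transcription sketch, assuming the `transformers` library is installed and that this checkpoint loads through the standard `automatic-speech-recognition` pipeline. The helper name `transcribe` and the file name `sample.wav` are hypothetical and do not come from the model card; decoding a file path is also assumed to require `ffmpeg` on the system.

```python
MODEL_ID = "ixxan/whisper-small-uyghur-common-voice"


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file and return the recognized text."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import pipeline

    # Build a speech recognition pipeline around the fine-tuned checkpoint.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


# Hypothetical usage (downloads the weights on first call):
# print(transcribe("sample.wav"))
```

Building the pipeline inside the function keeps the sketch short; for repeated calls it would be cheaper to create the pipeline once and reuse it.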

This model achieves the following results on the evaluation set:

  • Loss: 1.5920
  • WER Ortho: 42.9701
  • WER: 28.2995
  • CER: 10.8968
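
The card does not say how the error rates were computed. As a rough illustration only, word error rate and character error rate can be sketched as an edit distance divided by the reference length. The function names below are hypothetical, the scores are assumed to be percentages, and any text normalization that separates the orthographic WER from the plain WER is left out.

```python
def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            # deletion, insertion, substitution (or match)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, splitting on whitespace."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, counting every character."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)
```

For example, `wer("a b c d", "a x c d")` gives 25.0, because one of the four reference words is substituted.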

Training and evaluation data

The model was trained on the combined train and dev sets of common_voice_15_0 (11,215 recordings, about 20 hours of audio).

The model was tested on the test set of THUYG20, which serves as the standard benchmark for Uyghur speech models.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • training_steps: 4000
  • mixed_precision_training: Native AMP
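
The arithmetic behind these settings can be sketched in plain Python. This is an illustration under stated assumptions, not the training script: it assumes a single device (so the total batch size is the per-device batch size times the accumulation steps), a linear schedule that ramps from 0 to the peak rate during warmup and then decays linearly to 0 at the final step, and the 11,215 training recordings reported above. All function names are hypothetical.

```python
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 300
TRAINING_STEPS = 4000
NUM_RECORDINGS = 11215  # size of the combined train and dev sets


def effective_batch_size() -> int:
    """Samples consumed per optimizer step (single device assumed)."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


def approx_epochs(step: int) -> float:
    """Approximate number of passes over the training data after `step` steps."""
    return step * effective_batch_size() / NUM_RECORDINGS
```

Under these assumptions, `approx_epochs(4000)` is about 5.71, close to the final value in the Epoch column of the training results; the small difference is expected because the sketch ignores how the last partial batch of each epoch is handled.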

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER Ortho | WER       | CER       |
|---------------|--------|------|-----------------|-----------|-----------|-----------|
| 0.574400      | 0.7133 | 500  | 1.413890        | 59.765522 | 48.561550 | 17.639905 |
| 0.299600      | 1.4256 | 1000 | 1.283326        | 52.819004 | 41.377838 | 14.717958 |
| 0.130600      | 2.1398 | 1500 | 1.379338        | 52.265742 | 38.953389 | 16.260934 |
| 0.122500      | 2.8531 | 2000 | 1.313730        | 50.245894 | 36.494793 | 14.762585 |
| 0.060500      | 3.5663 | 2500 | 1.434626        | 47.589356 | 32.998976 | 12.185938 |
| 0.019500      | 4.2796 | 3000 | 1.526625        | 45.345570 | 30.975756 | 11.307346 |
| 0.015300      | 4.9929 | 3500 | 1.531676        | 44.120488 | 29.285470 | 11.690021 |
| 0.003300      | 5.7061 | 4000 | 1.592020        | 42.970054 | 28.299471 | 10.896778 |

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.5.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model size

  • Parameters: 242M
  • Tensor type: F32
  • Weights format: Safetensors

Model repository: ixxan/whisper-small-uyghur-common-voice