Whisper Tiny Taiwanese Simulated Android

This model is a fine-tuned version of openai/whisper-tiny on the TAT ASR Aligned dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7397
  • CER: 11.2806
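
For a quick check of the reported numbers, the sketch below runs inference with the Transformers pipeline API. It is a minimal example, not the authors' evaluation script: the repository id jethrowang/android_emb-whisper-tiny is the one this card accompanies, and sample.wav stands in for any 16 kHz mono recording.

```python
# Minimal inference sketch (assumptions: placeholder audio path;
# this is not the authors' evaluation code).
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="jethrowang/android_emb-whisper-tiny",
    device=0 if torch.cuda.is_available() else -1,  # GPU if available
)

print(asr("sample.wav")["text"])  # Whisper expects 16 kHz mono audio
```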

Model description

This is openai/whisper-tiny fine-tuned for Taiwanese automatic speech recognition on the TAT ASR Aligned dataset. The "Simulated Android" name suggests audio simulating Android-device recording conditions, but no further description is provided.

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned and evaluated on the TAT ASR Aligned dataset; details of the splits and preprocessing are not documented.
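
Whisper fine-tuning typically needs log-Mel input features and tokenized transcripts. The sketch below shows the standard Transformers preprocessing pattern; the local data path and the "sentence" column name are hypothetical, since the card does not describe how the TAT data was loaded.

```python
# Illustrative preprocessing sketch. The data_dir path and the "sentence"
# column are hypothetical; the card does not document the actual pipeline.
from datasets import Audio, load_dataset
from transformers import WhisperProcessor

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")

dataset = load_dataset("audiofolder", data_dir="./tat_asr_aligned")  # placeholder
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))  # Whisper's rate

def prepare(batch):
    audio = batch["audio"]
    # Log-Mel spectrogram features for the encoder.
    batch["input_features"] = processor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    # Token ids of the reference transcript for the decoder.
    batch["labels"] = processor.tokenizer(batch["sentence"]).input_ids
    return batch

dataset = dataset.map(prepare, remove_columns=dataset["train"].column_names)
```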

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1362
  • training_steps: 13620
  • mixed_precision_training: Native AMP
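
These settings map one-to-one onto Hugging Face Seq2SeqTrainingArguments fields, as sketched below; the output directory is a placeholder, and the surrounding training script (model loading, data collator) is not reproduced here.

```python
# The hyperparameters above expressed as Seq2SeqTrainingArguments;
# output_dir is a hypothetical path, not the authors' configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-tiny-taiwanese",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=32,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=1362,
    max_steps=13620,
    fp16=True,  # mixed_precision_training: Native AMP
)
```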

Training results

Training Loss   Epoch     Step    Validation Loss   CER
0.3641          0.9985     681    0.4668            19.0185
0.2569          1.9971    1362    0.4366            14.5059
0.1682          2.9956    2043    0.4342            13.5919
0.1095          3.9941    2724    0.4588            13.0167
0.0693          4.9927    3405    0.4854            12.6401
0.0455          5.9912    4086    0.5303            13.1776
0.0323          6.9897    4767    0.5626            12.8424
0.0228          7.9883    5448    0.5940            12.4495
0.0168          8.9868    6129    0.6214            12.4219
0.0124          9.9853    6810    0.6661            13.1648
0.0091         10.9839    7491    0.6534            12.1909
0.0067         11.9824    8172    0.6671            12.1441
0.0036         12.9809    8853    0.6948            12.0141
0.0016         13.9795    9534    0.6962            11.7995
0.0011         14.9780   10215    0.7180            11.6767
0.0008         15.9765   10896    0.7170            11.5896
0.0005         16.9751   11577    0.7260            11.5133
0.0002         17.9736   12258    0.7299            11.3793
0.0002         18.9721   12939    0.7373            11.2399
0.0001         19.9707   13620    0.7397            11.2806
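
The CER column is a character error rate reported as a percentage. As an illustration of how such values are computed (the strings below are placeholders, not TAT data), the evaluate library's "cer" metric can be used:

```python
# Hedged sketch of CER computation with the `evaluate` library; the strings
# are hypothetical, not examples from the TAT evaluation set.
import evaluate

cer_metric = evaluate.load("cer")
predictions = ["li ho"]    # hypothetical model output
references = ["li2 ho2"]   # hypothetical reference transcript
cer = cer_metric.compute(predictions=predictions, references=references)
print(f"CER: {100 * cer:.4f}%")  # reported as a percentage, e.g. 11.2806
```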

Framework versions

  • Transformers 4.42.3
  • PyTorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
