Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets. It achieves the following results on the evaluation set:

  • Loss: 1.1067
  • Bleu: 32.0
  • Chrf: 52.48
  • Wer: 66.7717
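
The card ships without a usage example, so the following is a minimal sketch (not part of the original card) of loading the checkpoint with the Transformers pipeline API. The audio file name is a placeholder, and the `task="translate"` hint assumes the fine-tune follows standard Whisper generation conventions:

```python
# Minimal usage sketch; assumes standard Whisper pipeline conventions apply.
from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-medium-ga2en-v6.3.1-8k-r",
)

# "sample_ga.wav" is a placeholder for an Irish-language audio file.
result = pipe("sample_ga.wav", generate_kwargs={"task": "translate"})
print(result["text"])  # English translation of the spoken Irish input
```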

Model description

A Whisper Medium checkpoint fine-tuned for speech translation from Irish (GA) to English (EN). No further description is documented.

Intended uses & limitations

More information needed

Training and evaluation data

The model was trained and evaluated on Irish-English data from the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets listed above; no further details are documented.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 8000
  • mixed_precision_training: Native AMP
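
For reproduction, the sketch below shows how these values map onto `transformers.Seq2SeqTrainingArguments`. This is not the author's training script: the output directory is a placeholder, and the Adam betas and epsilon listed above are the Trainer defaults, so they need no explicit arguments.

```python
# Sketch only: maps the listed hyperparameters onto Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-medium-ga2en",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    max_steps=8000,
    fp16=True,  # "Native AMP" mixed-precision training
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
)
```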

Training results

Training Loss Epoch Step Bleu Chrf Validation Loss Wer
2.5219 0.0138 100 0.44 10.48 2.1106 107.2490
2.4608 0.0276 200 3.3 20.43 2.1816 179.1535
2.3008 0.0414 300 3.66 21.59 2.0587 206.4836
2.2095 0.0552 400 8.79 27.66 1.9459 100.3602
2.0454 0.0690 500 8.14 27.36 1.8681 122.1522
1.9937 0.0828 600 11.05 30.26 1.8717 97.2535
1.868 0.0966 700 9.14 29.03 1.7917 129.0410
1.9924 0.1103 800 12.62 33.2 1.7170 89.6443
1.8646 0.1241 900 11.98 30.77 1.7252 97.8838
1.7644 0.1379 1000 10.87 31.0 1.6832 109.1851
1.692 0.1517 1100 13.05 34.46 1.6837 93.3814
1.7044 0.1655 1200 20.95 37.42 1.5527 75.2364
1.6824 0.1793 1300 14.91 35.56 1.5611 92.6159
1.6557 0.1931 1400 14.0 36.54 1.5554 99.8199
1.5456 0.2069 1500 19.72 39.81 1.5058 83.5660
1.3755 0.2207 1600 18.04 37.95 1.5039 82.9806
1.3959 0.2345 1700 17.01 39.5 1.4374 85.2319
1.5012 0.2483 1800 14.93 39.24 1.4242 114.4079
1.4278 0.2621 1900 23.85 42.69 1.3904 73.0302
1.3285 0.2759 2000 17.7 37.23 1.4493 83.8811
1.2655 0.2897 2100 20.1 40.32 1.3661 79.7839
1.2074 0.3034 2200 24.45 43.79 1.3387 72.9851
1.1893 0.3172 2300 21.45 42.61 1.3308 82.3953
1.1236 0.3310 2400 22.77 44.17 1.3050 77.3075
1.0934 0.3448 2500 25.54 46.32 1.2793 72.2647
1.06 0.3586 2600 28.27 47.32 1.2396 65.6911
1.0327 0.3724 2700 28.45 47.01 1.2577 67.3570
1.1623 0.3862 2800 24.54 47.43 1.2194 73.6155
1.0215 0.4 2900 27.4 49.6 1.2039 69.2481
0.9185 0.4138 3000 27.04 49.24 1.1724 67.8973
0.9003 0.4276 3100 31.08 50.11 1.1674 63.8001
0.9839 0.4414 3200 30.24 50.63 1.1580 64.5655
0.9396 0.4552 3300 30.79 51.72 1.1202 64.9257
0.9051 0.4690 3400 30.34 53.08 1.1180 66.4566
0.8621 0.4828 3500 33.3 53.86 1.1042 60.7834
0.8236 0.4966 3600 32.77 53.21 1.1070 62.0441
0.829 0.5103 3700 32.49 54.21 1.0771 62.5844
0.8375 0.5241 3800 32.27 53.98 1.0780 63.0797
0.8206 0.5379 3900 33.26 55.07 1.0615 61.6389
0.8059 0.5517 4000 33.24 55.16 1.0552 61.5038
0.9133 0.5655 4100 29.38 49.22 1.2218 66.0964
1.051 0.5793 4200 25.12 46.01 1.2304 71.8145
0.954 0.5931 4300 25.47 45.88 1.2501 75.3715
0.939 0.6069 4400 29.19 47.63 1.2204 66.9068
0.9887 0.6207 4500 27.99 47.01 1.2099 67.7172
1.0044 0.6345 4600 23.77 45.33 1.2080 73.3904
0.9881 0.6483 4700 26.46 47.36 1.2188 68.5277
0.9674 0.6621 4800 26.11 45.92 1.2296 68.3026
0.8845 0.6759 4900 27.3 46.08 1.2347 68.0324
0.8297 0.6897 5000 29.48 48.96 1.2108 64.6105
0.9065 0.7034 5100 29.81 49.94 1.1873 64.2503
0.8096 0.7172 5200 28.5 46.93 1.2122 66.2314
0.8077 0.7310 5300 29.26 48.21 1.1945 64.4755
0.8227 0.7448 5400 26.82 48.43 1.2310 71.4093
0.7587 0.7586 5500 29.45 49.03 1.2067 65.3309
0.7206 0.7724 5600 29.89 49.33 1.2114 65.5561
0.8088 0.7862 5700 31.88 51.4 1.1689 64.2954
0.693 0.8 5800 27.23 48.11 1.1644 68.7078
0.7099 0.8138 5900 31.01 49.42 1.1852 63.3949
0.7564 0.8276 6000 28.3 50.34 1.1554 71.0941
0.584 0.8414 6100 34.79 51.69 1.1566 59.0725
0.6817 0.8552 6200 34.08 51.95 1.1245 59.8829
0.5968 0.8690 6300 32.4 51.59 1.1475 62.9896
0.6092 0.8828 6400 32.83 52.82 1.1250 62.5844
0.6325 0.8966 6500 29.29 51.68 1.1108 69.1130
0.6002 0.9103 6600 27.64 52.7 1.0993 71.0941
0.6247 0.9241 6700 28.39 52.4 1.0898 68.3026
0.6257 0.9379 6800 28.54 52.33 1.0863 70.9140
0.6719 0.9517 6900 31.43 53.53 1.0891 66.1414
0.4994 0.9655 7000 33.81 52.77 1.1066 61.0986
0.5469 0.9793 7100 30.52 53.13 1.0891 67.3570
0.6031 0.9931 7200 33.16 54.03 1.0933 62.1792
0.2469 1.0069 7300 33.76 52.38 1.1426 62.8546
0.2572 1.0207 7400 33.16 51.71 1.1292 64.8807
0.2762 1.0345 7500 34.76 54.28 1.1090 60.7384
0.2332 1.0483 7600 30.95 52.28 1.1073 66.1864
0.2069 1.0621 7700 32.39 53.08 1.0999 65.5561
0.2417 1.0759 7800 31.3 53.87 1.1008 65.1058
0.2403 1.0897 7900 32.18 53.3 1.1053 66.4566
0.208 1.1034 8000 32.0 52.48 1.1067 66.7717
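
The Bleu, Chrf, and Wer columns can be recomputed with the Hugging Face `evaluate` library. The snippet below is a sketch with placeholder strings, assuming the sacreBLEU and chrF implementations that `evaluate` wraps (the card does not state the exact metric backends):

```python
# Sketch: recomputing BLEU, ChrF, and WER on placeholder data.
import evaluate

predictions = ["the weather is fine today"]   # model outputs (placeholders)
references = [["the weather is nice today"]]  # gold translations (placeholders)

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
wer = evaluate.load("wer")

print("BLEU:", bleu.compute(predictions=predictions, references=references)["score"])
print("ChrF:", chrf.compute(predictions=predictions, references=references)["score"])
# WER expects one flat reference string per prediction; scale to a percentage
# to match the table above.
print("WER:", 100 * wer.compute(predictions=predictions,
                                references=[r[0] for r in references]))
```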

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

