Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. It achieves the following results on the evaluation set:

  • Loss: 1.3291
  • Bleu: 33.46
  • Chrf: 52.93
  • Wer: 61.7740
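
For reference, all three metrics can be computed with Hugging Face's `evaluate` library. The snippet below is a minimal sketch, not taken from the training code; the prediction and reference strings are placeholders.

```python
import evaluate

# Load the three metrics reported above.
sacrebleu = evaluate.load("sacrebleu")  # corpus-level BLEU
chrf = evaluate.load("chrf")            # character n-gram F-score
wer = evaluate.load("wer")              # word error rate

predictions = ["the weather is fine today"]   # hypothetical model outputs
references = [["the weather is nice today"]]  # hypothetical gold translations

print(sacrebleu.compute(predictions=predictions, references=references)["score"])
print(chrf.compute(predictions=predictions, references=references)["score"])
# WER expects one flat reference string per prediction.
print(wer.compute(predictions=predictions, references=[r[0] for r in references]))
```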

Model description

This model translates Irish (GA) speech into English (EN) text. It is a fine-tuned version of openai/whisper-medium, OpenAI's multilingual encoder-decoder speech model.

Intended uses & limitations

The model is intended for translating Irish (GA) speech into English (EN) text; see the usage sketch below. Limitations beyond the evaluation metrics above have not been documented.
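
The following is a minimal inference sketch, assuming the standard Transformers ASR pipeline; the audio filename is a placeholder.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint with the speech recognition pipeline.
translator = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-medium-ga2en-v7.3.1-9k-r",
)

# Whisper performs speech translation when decoding with the "translate" task.
result = translator(
    "irish_speech.wav",  # placeholder path to an Irish-language audio file
    generate_kwargs={"task": "translate"},
)
print(result["text"])
```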

Training and evaluation data

The model was trained and evaluated on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets listed above; per-dataset splits and sizes have not been documented.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 9000
  • mixed_precision_training: Native AMP
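
As a rough guide, these settings map onto a Seq2SeqTrainingArguments configuration along the following lines. This is a sketch, not the original training script; output_dir is a placeholder, and the Adam betas and epsilon listed above are the Transformers optimizer defaults.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-medium-ga2en",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,                  # ~270 warmup steps out of 9000
    max_steps=9000,
    fp16=True,                          # native AMP mixed-precision training
)
```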

Training results

| Training Loss | Epoch | Step | Bleu | Chrf | Validation Loss | Wer |
|:-------------:|:-----:|:----:|:----:|:----:|:---------------:|:---:|
| 2.4998 | 0.0236 | 100 | 4.24 | 19.77 | 2.0245 | 123.5029 |
| 2.5999 | 0.0472 | 200 | 5.55 | 23.63 | 2.0729 | 130.1666 |
| 2.4062 | 0.0708 | 300 | 5.92 | 24.15 | 1.9928 | 157.4966 |
| 2.1866 | 0.0944 | 400 | 12.74 | 30.47 | 1.8337 | 93.4714 |
| 2.2485 | 0.1180 | 500 | 10.32 | 30.65 | 1.8209 | 116.4791 |
| 2.1521 | 0.1416 | 600 | 9.84 | 30.97 | 1.7512 | 130.1666 |
| 1.9324 | 0.1653 | 700 | 17.24 | 34.37 | 1.7362 | 85.4570 |
| 1.9703 | 0.1889 | 800 | 13.05 | 32.27 | 1.6784 | 105.7632 |
| 1.7299 | 0.2125 | 900 | 9.81 | 31.71 | 1.6530 | 131.6974 |
| 1.7822 | 0.2361 | 1000 | 11.72 | 32.5 | 1.5541 | 125.7091 |
| 1.5493 | 0.2597 | 1100 | 15.04 | 36.72 | 1.5773 | 92.4358 |
| 1.4813 | 0.2833 | 1200 | 22.08 | 40.11 | 1.5341 | 75.8667 |
| 1.5285 | 0.3069 | 1300 | 18.88 | 40.93 | 1.4834 | 95.4975 |
| 1.3255 | 0.3305 | 1400 | 20.11 | 40.82 | 1.4956 | 85.2319 |
| 1.3931 | 0.3541 | 1500 | 22.81 | 41.51 | 1.4718 | 72.2197 |
| 1.3962 | 0.3777 | 1600 | 25.43 | 43.53 | 1.3794 | 71.1842 |
| 1.1412 | 0.4013 | 1700 | 22.13 | 43.19 | 1.4172 | 86.9428 |
| 1.1132 | 0.4249 | 1800 | 21.27 | 42.45 | 1.3989 | 81.0896 |
| 0.9261 | 0.4485 | 1900 | 26.39 | 45.4 | 1.4147 | 70.6889 |
| 0.994 | 0.4721 | 2000 | 24.38 | 42.87 | 1.4365 | 77.5326 |
| 0.8598 | 0.4958 | 2100 | 19.36 | 41.49 | 1.3559 | 96.6231 |
| 0.7784 | 0.5194 | 2200 | 26.54 | 45.57 | 1.3550 | 69.5633 |
| 0.7858 | 0.5430 | 2300 | 27.52 | 47.58 | 1.3156 | 68.8879 |
| 0.7715 | 0.5666 | 2400 | 26.12 | 46.47 | 1.2985 | 72.5349 |
| 0.7079 | 0.5902 | 2500 | 25.62 | 47.61 | 1.3134 | 68.6177 |
| 0.6704 | 0.6138 | 2600 | 28.2 | 47.37 | 1.3047 | 69.1130 |
| 0.6579 | 0.6374 | 2700 | 29.52 | 49.39 | 1.2486 | 68.2125 |
| 0.502 | 0.6610 | 2800 | 28.08 | 48.99 | 1.2511 | 68.6177 |
| 0.4442 | 0.6846 | 2900 | 32.57 | 50.66 | 1.2800 | 63.3498 |
| 0.5175 | 0.7082 | 3000 | 29.69 | 48.77 | 1.2650 | 66.2314 |
| 0.4416 | 0.7318 | 3100 | 32.36 | 50.29 | 1.2554 | 61.9090 |
| 0.4529 | 0.7554 | 3200 | 32.6 | 50.94 | 1.2050 | 61.5489 |
| 0.4435 | 0.7790 | 3300 | 33.2 | 52.17 | 1.2103 | 61.3688 |
| 0.3724 | 0.8026 | 3400 | 33.89 | 52.88 | 1.1756 | 59.8379 |
| 0.3883 | 0.8263 | 3500 | 32.21 | 51.86 | 1.1979 | 62.0891 |
| 0.3534 | 0.8499 | 3600 | 32.75 | 51.85 | 1.1943 | 61.2337 |
| 0.326 | 0.8735 | 3700 | 32.43 | 51.5 | 1.1891 | 62.1342 |
| 0.305 | 0.8971 | 3800 | 33.43 | 51.45 | 1.1858 | 59.4327 |
| 0.2258 | 0.9207 | 3900 | 32.53 | 51.42 | 1.1827 | 61.1887 |
| 0.3104 | 0.9443 | 4000 | 32.1 | 51.33 | 1.1857 | 61.2337 |
| 0.3847 | 0.9679 | 4100 | 29.91 | 48.63 | 1.3506 | 66.5466 |
| 0.426 | 0.9915 | 4200 | 25.68 | 45.27 | 1.3458 | 70.1036 |
| 0.2622 | 1.0151 | 4300 | 27.52 | 48.0 | 1.3544 | 66.4115 |
| 0.2429 | 1.0387 | 4400 | 22.57 | 45.45 | 1.4330 | 79.9190 |
| 0.269 | 1.0623 | 4500 | 24.7 | 45.73 | 1.4399 | 74.7411 |
| 0.3171 | 1.0859 | 4600 | 29.55 | 47.78 | 1.3711 | 68.4827 |
| 0.2321 | 1.1095 | 4700 | 24.73 | 45.52 | 1.4350 | 77.1724 |
| 0.2595 | 1.1331 | 4800 | 30.54 | 47.85 | 1.3851 | 65.1508 |
| 0.2426 | 1.1568 | 4900 | 28.87 | 47.5 | 1.4109 | 68.3926 |
| 0.2496 | 1.1804 | 5000 | 29.97 | 48.74 | 1.3717 | 68.6628 |
| 0.2551 | 1.2040 | 5100 | 29.92 | 47.59 | 1.4157 | 66.3215 |
| 0.231 | 1.2276 | 5200 | 28.97 | 47.9 | 1.3908 | 66.0063 |
| 0.245 | 1.2512 | 5300 | 30.22 | 47.71 | 1.4082 | 63.7100 |
| 0.284 | 1.2748 | 5400 | 27.47 | 48.31 | 1.3696 | 70.7789 |
| 0.2284 | 1.2984 | 5500 | 27.63 | 47.37 | 1.4044 | 68.2575 |
| 0.2457 | 1.3220 | 5600 | 31.38 | 48.8 | 1.3722 | 64.7906 |
| 0.2346 | 1.3456 | 5700 | 33.61 | 50.14 | 1.3397 | 60.3332 |
| 0.2088 | 1.3692 | 5800 | 30.84 | 48.51 | 1.3920 | 65.4660 |
| 0.1832 | 1.3928 | 5900 | 31.47 | 49.56 | 1.3892 | 64.5205 |
| 0.2171 | 1.4164 | 6000 | 32.51 | 49.8 | 1.3606 | 63.1697 |
| 0.1799 | 1.4400 | 6100 | 30.8 | 50.05 | 1.4130 | 63.3949 |
| 0.1756 | 1.4636 | 6200 | 30.25 | 50.16 | 1.3458 | 66.1864 |
| 0.1617 | 1.4873 | 6300 | 32.27 | 50.74 | 1.3971 | 63.4849 |
| 0.1909 | 1.5109 | 6400 | 27.41 | 47.04 | 1.4275 | 72.0396 |
| 0.1516 | 1.5345 | 6500 | 30.1 | 49.05 | 1.3591 | 66.0513 |
| 0.1892 | 1.5581 | 6600 | 31.72 | 48.17 | 1.3646 | 62.6294 |
| 0.2086 | 1.5817 | 6700 | 28.85 | 49.68 | 1.3314 | 67.3120 |
| 0.1253 | 1.6053 | 6800 | 29.84 | 49.13 | 1.3461 | 66.5466 |
| 0.1307 | 1.6289 | 6900 | 29.39 | 48.77 | 1.3671 | 67.7172 |
| 0.1376 | 1.6525 | 7000 | 31.27 | 47.97 | 1.3769 | 66.5916 |
| 0.1593 | 1.6761 | 7100 | 30.53 | 49.33 | 1.3699 | 65.4660 |
| 0.1604 | 1.6997 | 7200 | 31.99 | 48.93 | 1.3540 | 63.8001 |
| 0.118 | 1.7233 | 7300 | 29.52 | 49.26 | 1.3523 | 67.5822 |
| 0.1148 | 1.7469 | 7400 | 31.49 | 49.49 | 1.3130 | 62.8996 |
| 0.0946 | 1.7705 | 7500 | 32.6 | 49.76 | 1.3468 | 63.1697 |
| 0.0891 | 1.7941 | 7600 | 31.84 | 50.41 | 1.3268 | 63.5750 |
| 0.103 | 1.8178 | 7700 | 32.81 | 50.61 | 1.3243 | 60.3782 |
| 0.1016 | 1.8414 | 7800 | 33.07 | 53.14 | 1.2945 | 61.0086 |
| 0.1014 | 1.8650 | 7900 | 32.35 | 51.28 | 1.3163 | 63.3498 |
| 0.1257 | 1.8886 | 8000 | 31.65 | 51.86 | 1.3246 | 61.7740 |
| 0.0859 | 1.9122 | 8100 | 30.69 | 51.47 | 1.3247 | 64.4304 |
| 0.0943 | 1.9358 | 8200 | 33.06 | 52.31 | 1.3030 | 61.6389 |
| 0.11 | 1.9594 | 8300 | 33.32 | 52.83 | 1.2866 | 60.1081 |
| 0.0723 | 1.9830 | 8400 | 32.96 | 51.64 | 1.3071 | 61.7740 |
| 0.0312 | 2.0066 | 8500 | 33.2 | 52.78 | 1.3202 | 62.0891 |
| 0.0303 | 2.0302 | 8600 | 33.24 | 52.75 | 1.3348 | 62.4043 |
| 0.02 | 2.0538 | 8700 | 33.32 | 52.6 | 1.3447 | 62.0891 |
| 0.0329 | 2.0774 | 8800 | 34.04 | 52.93 | 1.3328 | 60.7384 |
| 0.0216 | 2.1010 | 8900 | 33.47 | 52.75 | 1.3266 | 61.3237 |
| 0.0224 | 2.1246 | 9000 | 33.46 | 52.93 | 1.3291 | 61.7740 |

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
