mms-1b-toigen-balanced-model

This model is a fine-tuned version of facebook/mms-1b-all on the TOIGEN - TOI dataset. It achieves the following results on the evaluation set:

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 4
eval_batch_size: 4
seed: 42
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 30.0
mixed_precision_training: Native AMP
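
The linear scheduler with 100 warmup steps listed above can be sketched as a plain function of the optimizer step. This is a minimal sketch, not the actual training code. It assumes the usual shape of a linear warmup/decay schedule: ramp from 0 to the peak rate over the warmup steps, then decay linearly to 0 at the last step. It also assumes 224 optimizer steps per epoch, a figure inferred from the training log (step 100 falls at epoch 0.4464), which gives 6720 steps for 30 epochs. The function name `linear_schedule_lr` and the constants are hypothetical.

```python
# Values taken from the hyperparameter list.
BASE_LR = 3e-4
WARMUP_STEPS = 100
NUM_EPOCHS = 30

# Assumption: steps per epoch is inferred from the log row
# "step 100 at epoch 0.4464", not stated in the model card.
STEPS_PER_EPOCH = round(100 / 0.4464)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_schedule_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
                       total_steps=TOTAL_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 50, 100, 2700, TOTAL_STEPS):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate reaches its peak of 0.0003 at step 100 and is roughly 60% of the peak by step 2700.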

Training results

Training Loss	Epoch	Step	Validation Loss	WER
7.7726	0.4464	100	3.8109	0.9938
2.5726	0.8929	200	0.8106	0.6167
0.7986	1.3393	300	0.5409	0.5258
0.6324	1.7857	400	0.5256	0.5054
0.603	2.2321	500	0.4854	0.4832
0.59	2.6786	600	0.4733	0.4846
0.5489	3.125	700	0.4440	0.4657
0.5173	3.5714	800	0.4322	0.4576
0.5315	4.0179	900	0.4286	0.4453
0.4912	4.4643	1000	0.4254	0.4458
0.4728	4.9107	1100	0.4346	0.4430
0.4989	5.3571	1200	0.4050	0.4292
0.4661	5.8036	1300	0.4019	0.4255
0.4755	6.25	1400	0.4129	0.4449
0.4603	6.6964	1500	0.4046	0.4255
0.4229	7.1429	1600	0.3939	0.4150
0.455	7.5893	1700	0.4133	0.4155
0.4501	8.0357	1800	0.3978	0.4065
0.45	8.4821	1900	0.3925	0.4231
0.4226	8.9286	2000	0.3901	0.4098
0.3973	9.375	2100	0.3810	0.4056
0.4038	9.8214	2200	0.4178	0.4117
0.4559	10.2679	2300	0.3875	0.4075
0.4399	10.7143	2400	0.3742	0.3990
0.3545	11.1607	2500	0.3818	0.4013
0.4452	11.6071	2600	0.3906	0.3980
0.4014	12.0536	2700	0.3752	0.3999
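
The last column of the table is read here as the standard word error rate: the number of word substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The following is a minimal sketch of that definition, not the metric code used for this model. The function name `word_error_rate` and the example sentences are hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one deleted word out of six reference words.
print(f"{word_error_rate('the cat sat on the mat', 'the cat sat on mat'):.4f}")
```

On this reading, the 0.3999 logged at step 2700 corresponds to about 40 word errors per 100 reference words. Evaluation scripts typically compute the rate over the whole evaluation set (total errors over total reference words) rather than averaging per-utterance scores.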