ul2-large-dutch-finetuned-oba-book-search

This model is a fine-tuned version of yhavinga/ul2-large-dutch. The training dataset is not specified (the auto-generated card records it as None). It achieves the following results on the evaluation set:

Loss: 3.8688
Top-5-accuracy: 4.1194
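
The card does not say how Top-5-accuracy was computed. As a point of reference, one common definition counts an example as correct when its target appears among the five highest-scoring candidates. The sketch below implements that definition on made-up toy scores; the function name `top_k_accuracy` and the data are hypothetical and are not taken from the training code behind this model.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    targets: Sequence[int],
    k: int = 5,
) -> float:
    """Fraction of examples whose target index is among the k highest scores."""
    if len(scores) != len(targets):
        raise ValueError("scores and targets must have the same length")
    if not targets:
        raise ValueError("at least one example is required")

    hits = 0
    for row, target in zip(scores, targets):
        # Indices of the k largest scores for this example.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += target in top_k
    return hits / len(targets)


# Hypothetical toy data: three examples, three candidates each.
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
toy_targets = [1, 1, 0]

print(top_k_accuracy(toy_scores, toy_targets, k=1))  # 1 of 3 examples correct
print(top_k_accuracy(toy_scores, toy_targets, k=2))  # 2 of 3 examples correct
```

This function returns a fraction between 0 and 1. Whether the figure reported above is a fraction or a percentage is not stated in the card.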

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.6
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 1000
num_epochs: 3
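
To make the scheduler settings concrete, here is a minimal sketch of a cosine schedule with linear warmup, assuming the usual shape: a linear ramp from zero to the peak rate over the warmup steps, then a half-cosine decay to zero. The `peak_lr` and `warmup_steps` defaults come from the list above. `total_steps` is an estimate derived from the Step and Epoch columns of the training results (500 steps is about 0.0424 epochs), not a value stated in this card, and the function name is hypothetical.

```python
import math


def cosine_lr_with_warmup(
    step: int,
    peak_lr: float = 0.6,
    warmup_steps: int = 1000,
    total_steps: int = 35_380,  # estimated, not stated in the card
) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Start, mid-warmup, end of warmup, midpoint of the decay, and final step.
for step in (0, 500, 1000, 18_190, 35_380):
    print(step, round(cosine_lr_with_warmup(step), 4))
```

Under these assumptions the rate peaks at step 1000 and falls to half the peak about midway through the remaining steps.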

Training results

Training Loss	Epoch	Step	Validation Loss	Top-5-accuracy
6.4431	0.0424	500	4.7239	0.0796
6.4068	0.0848	1000	5.1338	0.0398
5.7971	0.1272	1500	4.6127	0.0199
5.452	0.1696	2000	4.5181	0.1194
5.3971	0.2120	2500	4.5498	0.1393
5.2693	0.2544	3000	4.3622	0.1393
5.2788	0.2968	3500	4.3456	0.1990
5.2129	0.3392	4000	4.3400	0.2388
5.133	0.3815	4500	4.3021	0.2786
5.0346	0.4239	5000	4.2458	0.9751
5.113	0.4663	5500	4.2746	0.7363
5.1276	0.5087	6000	4.2369	0.9552
5.0586	0.5511	6500	4.1962	1.8706
4.9369	0.5935	7000	4.1843	2.9254
4.9152	0.6359	7500	4.1641	3.0846
4.9369	0.6783	8000	4.1089	3.7413
4.9185	0.7207	8500	4.1150	3.6418
4.8469	0.7631	9000	4.0996	3.6418
4.8854	0.8055	9500	4.0817	3.5821
4.8362	0.8479	10000	4.0456	4.2587
4.7867	0.8903	10500	4.0699	3.9204
4.7926	0.9327	11000	4.0692	3.3831
4.7933	0.9751	11500	4.0356	3.1642
4.793	1.0175	12000	4.0607	2.6667
4.7664	1.0599	12500	4.0430	3.5622
4.7409	1.1023	13000	4.0239	3.8806
4.7558	1.1446	13500	4.0134	3.7413
4.7642	1.1870	14000	3.9884	3.9403
4.7298	1.2294	14500	4.0087	3.6219
4.7433	1.2718	15000	3.9809	4.0995
4.6858	1.3142	15500	3.9984	4.2985
4.7023	1.3566	16000	3.9655	4.0199
4.6963	1.3990	16500	3.9798	4.1791
4.7239	1.4414	17000	4.0001	4.0597
4.7312	1.4838	17500	3.9532	4.0796
4.6408	1.5262	18000	3.9487	4.2388
4.669	1.5686	18500	3.9303	4.1990
4.6589	1.6110	19000	3.9346	4.1393
4.6887	1.6534	19500	3.9563	3.9403
4.5856	1.6958	20000	3.9374	4.2786
4.6744	1.7382	20500	3.9157	4.0995
4.6395	1.7806	21000	3.9279	4.1393
4.6191	1.8230	21500	3.9259	3.8408
4.6256	1.8654	22000	3.9215	3.9005
4.5945	1.9077	22500	3.9214	4.0796
4.6325	1.9501	23000	3.9076	3.8607
4.6476	1.9925	23500	3.8955	4.0199
4.6362	2.0349	24000	3.8923	4.0398
4.5991	2.0773	24500	3.8923	4.3383
4.6189	2.1197	25000	3.8800	4.0
4.5933	2.1621	25500	3.8869	3.8806
4.6165	2.2045	26000	3.8918	4.0398
4.5998	2.2469	26500	3.8819	3.9602
4.5827	2.2893	27000	3.8848	3.9204
4.528	2.3317	27500	3.8847	3.9005
4.5685	2.3741	28000	3.8879	3.9204
4.5698	2.4165	28500	3.8739	3.9801
4.5472	2.4589	29000	3.8761	4.0398
4.5605	2.5013	29500	3.8753	4.0398
4.5329	2.5437	30000	3.8791	4.0796
4.5687	2.5861	30500	3.8698	4.0
4.5716	2.6285	31000	3.8659	4.0995
4.547	2.6708	31500	3.8713	4.0597
4.6466	2.7132	32000	3.8729	4.0995
4.5963	2.7556	32500	3.8698	4.1194
4.629	2.7980	33000	3.8703	4.1194
4.5859	2.8404	33500	3.8699	4.1194
4.6239	2.8828	34000	3.8688	4.1393
4.5052	2.9252	34500	3.8688	4.1393
4.5933	2.9676	35000	3.8688	4.1194
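
Two quantities the card does not state can be estimated from the table. The sketch below is an inference from the reported numbers, not information from the author. It assumes one optimizer step per batch of 16 (no gradient accumulation, single device) to estimate the training set size, and it tests the hypothesis that Top-5-accuracy is a percentage over 5,025 evaluation examples. That count is inferred from the fact that the smallest reported value, 0.0199, is close to 100 / 5025; a multiple of 5,025 would fit equally well.

```python
TRAIN_BATCH_SIZE = 16  # from the hyperparameter list

# (step, epoch) copied from the last row of the table.
last_step, last_epoch = 35_000, 2.9676

steps_per_epoch = round(last_step / last_epoch)
approx_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE
print("steps per epoch:", steps_per_epoch)
print("approx. training examples:", approx_train_examples)

# Hypothesis: each Top-5-accuracy value equals 100 * hits / N_EVAL
# for some whole number of hits.
N_EVAL = 5025  # inferred from the table, not stated in the card

# A sample of values copied from the Top-5-accuracy column.
reported = [0.0199, 0.0398, 0.0796, 0.1194, 0.9751, 2.6667, 4.0, 4.1194, 4.3383]


def implied_hits(value: float, n: int = N_EVAL) -> int:
    """Whole number of correct examples implied by a percentage value."""
    return round(value * n / 100)


for value in reported:
    hits = implied_hits(value)
    reconstructed = round(100 * hits / N_EVAL, 4)
    print(value, hits, reconstructed)
```

If the hypothesis holds, the final value of 4.1194 corresponds to 207 of 5,025 evaluation examples with the target in the top five.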

Framework versions

PEFT 0.11.0
Transformers 4.44.2
PyTorch 1.13.0+cu116
Datasets 3.0.0
Tokenizers 0.19.1
