collapse_gemma-2-2b_hs2_accumulatesubsample_iter17_sftsd0

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Model description

More information needed

More information needed

More information needed

The following hyperparameters were used during training:

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.2726	0.0535	5	1.2797	261768
1.1191	0.1070	10	1.2274	527656
0.9603	0.1605	15	1.2346	792592
0.7861	0.2140	20	1.2535	1060392
0.7055	0.2676	25	1.2497	1331816
0.6513	0.3211	30	1.2599	1600048
0.6785	0.3746	35	1.2513	1862592
0.5816	0.4281	40	1.2579	2132648
0.5033	0.4816	45	1.2418	2397080
0.4926	0.5351	50	1.2292	2665584
0.5115	0.5886	55	1.2360	2939440
0.395	0.6421	60	1.2264	3206336
0.4836	0.6957	65	1.2312	3475784
0.4008	0.7492	70	1.2145	3740448
0.4104	0.8027	75	1.2251	4008264
0.4466	0.8562	80	1.2196	4277008
0.3173	0.9097	85	1.2176	4540200
0.4054	0.9632	90	1.2160	4799696