Commit ac28a29 by Angelectronic (parent: ed88b85): Update README.md

README.md CHANGED
```diff
@@ -29,9 +29,10 @@ This gemma model was trained 2x faster with [Unsloth](https://github.com/unsloth
 The following hyperparameters were used during training:
 - learning_rate: 0.0002
 - train_batch_size: 16
-- eval_batch_size:
+- eval_batch_size: 8
 - seed: 3407
 - gradient_accumulation_steps: 4
+- eval_accumulation_steps: 4
 - total_train_batch_size: 64
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
```
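The `total_train_batch_size` in the list above is derived from the other values rather than set directly. A minimal sketch of that arithmetic, assuming a single-device run (the device count is not stated in the diff):

```python
# Effective (total) train batch size as reported in the model card:
# per-device batch size x gradient accumulation steps x number of devices.
train_batch_size = 16            # per-device batch size from the card
gradient_accumulation_steps = 4  # from the card
num_devices = 1                  # assumption: single-GPU run, not stated in the diff

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # -> 64, matching the card's total_train_batch_size
```

Gradient accumulation lets a 16-sample per-step batch behave like a 64-sample batch by accumulating gradients over 4 forward/backward passes before each optimizer update; the newly added `eval_accumulation_steps: 4` applies the analogous chunking during evaluation to bound memory use.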