gemma-2-9b-it-lora-commonsense

This model is a fine-tuned version of google/gemma-2-9b-it on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8229

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 3
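
As a rough illustration of how these values fit together, the sketch below recomputes the total train batch size and evaluates a cosine schedule with linear warmup. It assumes a single device and about 3990 optimizer steps in total (3 epochs at roughly 1330 steps each, inferred from the step and epoch values logged during training); neither figure is stated in this card. The helper names `effective_batch_size` and `cosine_lr` are hypothetical and only mirror the usual warmup-plus-cosine shape, not necessarily the exact scheduler code used for this run.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 16
WARMUP_STEPS = 100

# Assumptions (not stated in the card): one device, and roughly 3990
# optimizer steps in total (3 epochs at about 1330 steps per epoch).
NUM_DEVICES = 1
TOTAL_STEPS = 3990


def effective_batch_size(per_device, accum_steps, num_devices):
    """Number of examples that contribute to one optimizer step."""
    return per_device * accum_steps * num_devices


def cosine_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then cosine decay towards 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # 8 * 16 * 1 matches the reported total_train_batch_size of 128.
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 50, 100, 2000, 3990):
        print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 0.0001 at step 100 and decays to about zero by the final step.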

Training results

Training Loss   Epoch    Step   Validation Loss
1.0188          0.1503   200    0.9641
0.9971          0.3007   400    0.9404
0.9827          0.4510   600    0.9288
0.9748          0.6013   800    0.9194
0.971           0.7516   1000   0.9055
0.957           0.9020   1200   0.8970
0.9005          1.0523   1400   0.8874
0.8876          1.2026   1600   0.8748
0.8782          1.3529   1800   0.8640
0.8896          1.5033   2000   0.8489
0.8814          1.6536   2200   0.8417
0.8666          1.8039   2400   0.8325
0.8674          1.9542   2600   0.8307
0.8116          2.1046   2800   0.8366
0.8032          2.2549   3000   0.8291
0.8103          2.4052   3200   0.8265
0.8165          2.5556   3400   0.8245
0.8085          2.7059   3600   0.8242
0.8121          2.8562   3800   0.8229
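
The logged values above can be checked with a short script. The sketch below copies the step, epoch, and validation loss columns and reports the lowest validation loss, any evaluation where the loss rose relative to the previous one, and an estimate of optimizer steps per epoch. The function names are hypothetical, and the steps-per-epoch figure assumes the epoch column is simply the step count divided by the number of steps in one epoch.

```python
# (step, epoch, validation_loss) rows copied from the training results above.
ROWS = [
    (200, 0.1503, 0.9641), (400, 0.3007, 0.9404), (600, 0.4510, 0.9288),
    (800, 0.6013, 0.9194), (1000, 0.7516, 0.9055), (1200, 0.9020, 0.8970),
    (1400, 1.0523, 0.8874), (1600, 1.2026, 0.8748), (1800, 1.3529, 0.8640),
    (2000, 1.5033, 0.8489), (2200, 1.6536, 0.8417), (2400, 1.8039, 0.8325),
    (2600, 1.9542, 0.8307), (2800, 2.1046, 0.8366), (3000, 2.2549, 0.8291),
    (3200, 2.4052, 0.8265), (3400, 2.5556, 0.8245), (3600, 2.7059, 0.8242),
    (3800, 2.8562, 0.8229),
]


def best_checkpoint(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def regressions(rows):
    """Steps where validation loss rose relative to the previous evaluation."""
    return [cur[0] for prev, cur in zip(rows, rows[1:]) if cur[2] > prev[2]]


def steps_per_epoch(rows):
    """Estimate from the last row, where rounding of the epoch value matters least."""
    step, epoch, _ = rows[-1]
    return step / epoch


if __name__ == "__main__":
    print("best:", best_checkpoint(ROWS))
    print("loss increased at steps:", regressions(ROWS))
    print("approx. steps per epoch:", round(steps_per_epoch(ROWS), 1))
```

The lowest validation loss is the final evaluation at step 3800, which matches the reported loss of 0.8229, and the only increase occurs at step 2800, early in the third epoch. With a total train batch size of 128, the estimate of about 1330 steps per epoch would correspond to roughly 170,000 training examples, though the dataset size is not stated in this card.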

Framework versions

  • Transformers 4.42.3
  • PyTorch 2.3.1+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Safetensors model size: 108M params (tensor type F32)