git-base-pokemon

This model is a fine-tuned version of microsoft/git-base on an unknown dataset. It achieves the following results on the evaluation set at the final logged checkpoint (step 700):

  • Loss: 0.0420
  • Wer Score (word error rate): 3.8081
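
A Wer Score above 1 can look wrong at first glance. The card does not say how the metric was computed, so the sketch below assumes the standard word error rate: (substitutions + deletions + insertions) divided by the number of reference words. Under that assumption, a generated caption that is much longer than its reference pushes the value past 1. The function name `wer` and the example captions are hypothetical and are not taken from the evaluation set.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical captions: the hypothesis repeats the reference four times,
# so 9 inserted words / 3 reference words = 3.0.
reference = "a red bird"
hypothesis = "a red bird a red bird a red bird a red bird"
print(wer(reference, hypothesis))
```

If the reported score follows this definition, a value near 3.8 would suggest that generated captions are several times longer than the references, for example because generation does not stop early enough. That reading is an inference, not something the card states.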

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
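
The relationship between some of these values can be checked with a little arithmetic. The sketch below recomputes total_train_batch_size from the per-device batch size and the gradient accumulation steps, and illustrates what a linear schedule does to the learning rate. It assumes a single device, zero warmup steps (no warmup setting is listed), and a total of 700 optimizer steps, which is the last step logged under Training results. The helper `linear_lr` is a hypothetical name, not a library function.

```python
# Values copied from the hyperparameter list.
learning_rate = 5e-05
train_batch_size = 4
gradient_accumulation_steps = 8

# Number of examples contributing to one optimizer update
# (assumes training on a single device).
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step: int, total_steps: int, base_lr: float = learning_rate) -> float:
    """Linear decay from base_lr to 0, assuming zero warmup steps."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


print(total_train_batch_size)      # 4 * 8 = 32
print(linear_lr(0, 700))           # learning rate at the start of training
print(linear_lr(350, 700))         # roughly half the base rate at the midpoint
print(linear_lr(700, 700))         # decayed to zero at the final step
```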

Training results

Training Loss   Epoch     Step   Validation Loss   Wer Score
5.2468          3.5398    50     4.5809            17.1528
0.7758          7.0796    100    0.4672            7.7810
0.0337          10.6195   150    0.0424            2.3531
0.0089          14.1593   200    0.0401            3.3039
0.0022          17.6991   250    0.0388            5.8557
0.0008          21.2389   300    0.0411            4.6740
0.0004          24.7788   350    0.0410            3.8676
0.0003          28.3186   400    0.0409            4.1766
0.0002          31.8584   450    0.0414            4.0136
0.0002          35.3982   500    0.0414            3.9779
0.0002          38.9381   550    0.0417            3.9542
0.0002          42.4779   600    0.0418            3.8913
0.0002          46.0177   650    0.0420            3.8183
0.0002          49.5575   700    0.0420            3.8081
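
The logged rows can be inspected programmatically. The sketch below copies the values from the training results, finds the checkpoints with the lowest validation loss and the lowest Wer Score, and derives the number of optimizer steps per epoch from the Epoch and Step columns. The variable names are illustrative, and the training-set size estimate is an inference from the logged numbers, not a figure stated by the card.

```python
# Rows copied from the training results:
# (training loss, epoch, step, validation loss, Wer Score)
rows = [
    (5.2468, 3.5398, 50, 4.5809, 17.1528),
    (0.7758, 7.0796, 100, 0.4672, 7.7810),
    (0.0337, 10.6195, 150, 0.0424, 2.3531),
    (0.0089, 14.1593, 200, 0.0401, 3.3039),
    (0.0022, 17.6991, 250, 0.0388, 5.8557),
    (0.0008, 21.2389, 300, 0.0411, 4.6740),
    (0.0004, 24.7788, 350, 0.0410, 3.8676),
    (0.0003, 28.3186, 400, 0.0409, 4.1766),
    (0.0002, 31.8584, 450, 0.0414, 4.0136),
    (0.0002, 35.3982, 500, 0.0414, 3.9779),
    (0.0002, 38.9381, 550, 0.0417, 3.9542),
    (0.0002, 42.4779, 600, 0.0418, 3.8913),
    (0.0002, 46.0177, 650, 0.0420, 3.8183),
    (0.0002, 49.5575, 700, 0.0420, 3.8081),
]

best_by_val_loss = min(rows, key=lambda r: r[3])
best_by_wer = min(rows, key=lambda r: r[4])
print("lowest validation loss at step", best_by_val_loss[2])
print("lowest Wer Score at step", best_by_wer[2])

# Optimizer steps per epoch implied by the last logged row.
_, last_epoch, last_step, _, _ = rows[-1]
steps_per_epoch = last_step / last_epoch

# With gradient_accumulation_steps = 8 and train_batch_size = 4, this
# implies about 113 micro-batches per epoch, i.e. at most 452 training
# examples (an estimate that assumes a single device).
micro_batches_per_epoch = round(steps_per_epoch * 8)
max_train_examples = micro_batches_per_epoch * 4
print(round(steps_per_epoch, 3), micro_batches_per_epoch, max_train_examples)
```

Neither metric is at its best in the final row: validation loss is lowest at step 250 and Wer Score is lowest at step 150, while training loss keeps falling toward zero. That pattern is consistent with overfitting on a small dataset, so an earlier checkpoint may be preferable if one was saved.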

Framework versions

  • Transformers 4.46.3
  • PyTorch 2.5.1+cu121
  • Datasets 3.2.0
  • Tokenizers 0.20.3

Model size

  • 177M params (Safetensors, tensor type F32)

Model tree for osmanh/git-base-pokemon

  • Base model: microsoft/git-base
  • This model: fine-tuned from the base model (one of 106 listed fine-tunes)