mon_nllb_3B_r32

This model is a fine-tuned version of facebook/nllb-200-distilled-1.3B on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 7.2132
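
This repository ships a PEFT adapter on top of the base checkpoint (see the framework versions below), so inference requires loading the base model first and then attaching the adapter. The snippet below is a minimal, untested sketch: the repository id Billyyy/mon_nllb_3B_r32 is taken from this page, and the target-language code "khk_Cyrl" (Halh Mongolian) is only a guess based on the "mon" prefix in the model name.

```python
# Minimal loading sketch (assumptions: adapter repo id Billyyy/mon_nllb_3B_r32,
# target language khk_Cyrl; adjust both to your actual setup).
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_id = "facebook/nllb-200-distilled-1.3B"
adapter_id = "Billyyy/mon_nllb_3B_r32"

tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.src_lang = "eng_Latn"  # source-language code, example only

base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA adapter

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
generated = model.generate(
    **inputs,
    # NLLB language codes are vocabulary tokens, so look up the target code directly.
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("khk_Cyrl"),
    max_new_tokens=64,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```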

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 40
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 160
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 2
  • mixed_precision_training: Native AMP
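
For reference, these settings roughly correspond to the Seq2SeqTrainingArguments below. This is a hedged reconstruction, not the author's actual training script; the dataset, adapter configuration, and output directory are unknown.

```python
# Approximate reconstruction of the listed hyperparameters (assumptions noted inline).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mon_nllb_3B_r32",    # assumed output directory
    learning_rate=1e-4,
    per_device_train_batch_size=40,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,   # effective batch size 40 * 4 = 160
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=2,
    fp16=True,                       # "Native AMP"; could equally have been bf16
    eval_strategy="steps",
    eval_steps=500,                  # matches the 500-step evaluation interval below
)
```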

Training results

Training Loss | Epoch  | Step  | Validation Loss
7.4511        | 0.0761 |   500 | 7.2785
7.3373        | 0.1522 |  1000 | 7.2305
7.2568        | 0.2283 |  1500 | 7.2138
7.2365        | 0.3044 |  2000 | 7.2126
7.2619        | 0.3805 |  2500 | 7.2130
7.2272        | 0.4567 |  3000 | 7.2117
7.2336        | 0.5328 |  3500 | 7.2137
7.2263        | 0.6089 |  4000 | 7.2139
7.2321        | 0.6850 |  4500 | 7.2129
7.2257        | 0.7611 |  5000 | 7.2124
7.2248        | 0.8372 |  5500 | 7.2121
7.2289        | 0.9133 |  6000 | 7.2121
7.2144        | 0.9894 |  6500 | 7.2131
7.2155        | 1.0656 |  7000 | 7.2133
7.2150        | 1.1417 |  7500 | 7.2130
7.2146        | 1.2178 |  8000 | 7.2122
7.1995        | 1.2939 |  8500 | 7.2126
7.2025        | 1.3700 |  9000 | 7.2136
7.2302        | 1.4462 |  9500 | 7.2128
7.2078        | 1.5223 | 10000 | 7.2133
7.2063        | 1.5984 | 10500 | 7.2133
7.2160        | 1.6745 | 11000 | 7.2128
7.1949        | 1.7506 | 11500 | 7.2132
7.2213        | 1.8267 | 12000 | 7.2131
7.2236        | 1.9028 | 12500 | 7.2132
7.2244        | 1.9789 | 13000 | 7.2132

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
