mon_nllb_1.3B

This model is a LoRA fine-tuned version of facebook/nllb-200-distilled-1.3B for Mongolian→English translation. It achieves the following results on the evaluation set (FLORES-200):

  • BLEU: 44.06
  • chrF: 44.43
  • METEOR: 0.537

Example Usage

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "Billyyy/mon_nllb_1.3B"

# NLLB tokenizers take FLORES-200 language codes; Halh Mongolian (Cyrillic) is "khk_Cyrl".
tokenizer = AutoTokenizer.from_pretrained(model_name, src_lang="khk_Cyrl")
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

text = "Сайн байна уу?"
inputs = tokenizer(text, return_tensors="pt")

# Force the decoder to start with the English language token ("eng_Latn").
output_tokens = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),
)
translated_text = tokenizer.decode(output_tokens[0], skip_special_tokens=True)

print(translated_text)
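
Since this repository ships a PEFT (LoRA) adapter rather than a full checkpoint (see "Framework versions" below), the adapter can also be attached to the base model explicitly. A minimal sketch:

from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the frozen base model, then apply the LoRA adapter weights on top.
base_model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-1.3B")
model = PeftModel.from_pretrained(base_model, "Billyyy/mon_nllb_1.3B")
tokenizer = AutoTokenizer.from_pretrained("Billyyy/mon_nllb_1.3B", src_lang="khk_Cyrl")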

Model description

This model was fine-tuned on a Mongolian→English parallel dataset using LoRA.
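
The exact adapter configuration is not listed on this card. As an illustrative sketch only, a comparable PEFT LoRA setup might look like the following; the rank, alpha, dropout, and target modules are assumptions, not the values actually used:

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSeq2SeqLM

base_model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-1.3B")

# Hypothetical adapter settings; the card does not state the actual ones.
lora_config = LoraConfig(
    task_type="SEQ_2_SEQ_LM",
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable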

Training and evaluation data

Training data:

Evaluation data:

  • FLORES-200
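
The scoring script is not included on the card. As a sketch, metrics like those reported above can be computed with Hugging Face's evaluate library; the placeholder sentences stand in for real model outputs and FLORES-200 English references:

import evaluate

# The three metrics reported on this card.
bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
meteor = evaluate.load("meteor")

predictions = ["Hello, how are you?"]  # placeholder model outputs
references = ["Hello, how are you?"]   # placeholder reference translations

print(bleu.compute(predictions=predictions, references=[[r] for r in references])["score"])
print(chrf.compute(predictions=predictions, references=[[r] for r in references])["score"])
print(meteor.compute(predictions=predictions, references=references)["meteor"])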

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 40
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 160
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 2
  • mixed_precision_training: FP16
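
As a sketch, these settings map onto transformers' Seq2SeqTrainingArguments as follows; output_dir is a placeholder, and any argument not listed above is left at its default:

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mon_nllb_1.3B",        # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=40,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,     # 40 x 4 = 160 effective train batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=2,
    fp16=True,                         # mixed-precision training
)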

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|---------------|--------|-------|-----------------|
| 7.3708        | 0.1522 | 1000  | 7.2420          |
| 7.25          | 0.3044 | 2000  | 7.2126          |
| 7.237         | 0.4567 | 3000  | 7.2120          |
| 7.2344        | 0.6089 | 4000  | 7.2137          |
| 7.2323        | 0.7611 | 5000  | 7.2130          |
| 7.2351        | 0.9133 | 6000  | 7.2121          |
| 7.222         | 1.0656 | 7000  | 7.2131          |
| 7.22          | 1.2178 | 8000  | 7.2122          |
| 7.2077        | 1.3700 | 9000  | 7.2131          |
| 7.2132        | 1.5223 | 10000 | 7.2132          |
| 7.2211        | 1.6745 | 11000 | 7.2128          |
| 7.2269        | 1.8267 | 12000 | 7.2131          |
| 7.2296        | 1.9789 | 13000 | 7.2132          |

Framework versions

  • PEFT 0.14.0
  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0