enko_mbartLarge_36p_exp1

This model is a fine-tuned version of facebook/mbart-large-50-many-to-many-mmt on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.2181
  • Bleu: 15.4063
  • Gen Len: 14.7808
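
A minimal inference sketch follows, assuming the checkpoint is hosted as yesj1234/enko_mbartLarge_36p_exp1 and uses the standard mBART-50 language codes (en_XX for English, ko_KR for Korean, as suggested by the "enko" name); it is not taken from the training repository.

```python
# Minimal usage sketch (assumptions: Hub repo id and en->ko direction).
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model_id = "yesj1234/enko_mbartLarge_36p_exp1"  # assumed repo id
tokenizer = MBart50TokenizerFast.from_pretrained(model_id)
model = MBartForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "en_XX"  # source language: English
inputs = tokenizer("The weather is nice today.", return_tensors="pt")

# Force the decoder to start generation with the Korean language token.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["ko_KR"],
    max_length=128,
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```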

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 15
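
For illustration, the list above corresponds roughly to the following Seq2SeqTrainingArguments (Transformers 4.34 argument names). This is a reconstruction, not the exact training script; settings such as fp16 and the eval cadence are assumptions inferred from the results table below.

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative reconstruction of the hyperparameters listed above.
# per-device batch size 4 x 4 GPUs x 2 accumulation steps = effective 32.
training_args = Seq2SeqTrainingArguments(
    output_dir="enko_mbartLarge_36p_exp1",
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=2,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=15,
    predict_with_generate=True,   # needed to report Bleu / Gen Len
    evaluation_strategy="steps",
    eval_steps=5000,              # matches the 5000-step cadence below
    fp16=True,                    # assumption; not stated in the card
)
```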

Training results

Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len
1.4235        | 0.46  | 5000  | 1.3893          | 12.3168 | 14.6634
1.3281        | 0.93  | 10000 | 1.2917          | 14.3522 | 14.9186
1.2506        | 1.39  | 15000 | 1.2669          | 14.3525 | 14.9494
1.1603        | 1.86  | 20000 | 1.2283          | 15.248  | 15.0062
1.0765        | 2.32  | 25000 | 1.2181          | 15.4063 | 14.7808
1.1019        | 2.79  | 30000 | 1.2753          | 14.3608 | 14.9014
1.0504        | 3.25  | 35000 | 1.2334          | 15.3253 | 14.7948
0.9431        | 3.72  | 40000 | 1.2512          | 15.2534 | 14.7293
0.8394        | 4.18  | 45000 | 1.2971          | 14.9999 | 14.7993
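
The Bleu and Gen Len columns are presumably the sacrebleu score and the mean generated-sequence length reported at evaluation time. A minimal sketch of how such metrics can be computed with the evaluate library is shown below; it is illustrative and not the exact metric function used for this model.

```python
import numpy as np
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds, tokenizer):
    """Decode predictions/labels and report BLEU and mean generation length."""
    preds, labels = eval_preds
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # Replace label padding (-100) with the pad token id before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    bleu = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    gen_len = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    )
    return {"bleu": bleu["score"], "gen_len": gen_len}
```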

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.6
  • Tokenizers 0.14.1