
longt5_xl_gov_memsum_bp_10

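This model is a fine-tuned version of the local checkpoint /exports/eddie/scratch/s1970716/models/summarization/longt5_xl_gov_memsum_bp_5/checkpoint-1360 on the learn3r/gov_report_memsum_bp dataset. It achieves the following results on the evaluation set; these match the epoch 1.0 (step 272) row under Training results, which has the lowest validation loss: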

  • Loss: 1.1440
  • Rouge1: 67.1508
  • Rouge2: 38.9594
  • Rougel: 36.6571
  • Rougelsum: 64.7944
  • Gen Len: 712.0165

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 64
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 5.0
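
The relationship between train_batch_size, gradient_accumulation_steps, and total_train_batch_size in the list above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a single training device (the card does not state the device count); `num_devices` and `effective_batch_size` are hypothetical names introduced only for illustration.

```python
# Hyperparameters copied from the list above.
hparams = {
    "learning_rate": 1e-3,
    "train_batch_size": 1,
    "eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 64,
    "num_epochs": 5.0,
}

# ASSUMPTION: one training device. The card does not report a device count.
num_devices = 1


def effective_batch_size(per_device_batch, accumulation_steps, devices):
    """Number of examples that contribute to each optimizer update."""
    return per_device_batch * accumulation_steps * devices


total_train_batch_size = effective_batch_size(
    hparams["train_batch_size"],
    hparams["gradient_accumulation_steps"],
    num_devices,
)
print(total_train_batch_size)
```

Under the single-device assumption the result equals the reported total_train_batch_size of 64, so each optimizer update accumulates gradients over 64 single-example forward and backward passes.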

Training results

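| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len   |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|-----------|
| 0.4356        | 1.0   | 272  | 1.1440          | 67.1508 | 38.9594 | 36.6571 | 64.7944   | 712.0165  |
| 0.3485        | 2.0   | 545  | 1.2622          | 66.9296 | 38.7595 | 36.4964 | 64.6309   | 808.4393  |
| 0.2933        | 3.0   | 818  | 1.3804          | 65.3911 | 38.0875 | 35.5935 | 63.1902   | 976.0340  |
| 0.2443        | 4.0   | 1091 | 1.4998          | 67.0929 | 38.0763 | 35.8704 | 64.9022   | 816.7253  |
| 0.2033        | 4.99  | 1360 | 1.5508          | 64.1251 | 36.9601 | 34.8068 | 61.9673   | 1052.8817 |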
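
The rows above can be inspected programmatically to see which checkpoint the headline evaluation numbers correspond to. The sketch below copies the table values verbatim and selects the row with the lowest validation loss; the `Row` type and its field names are hypothetical and exist only for this illustration.

```python
from collections import namedtuple

# Field names are illustrative; values are copied from the training results.
Row = namedtuple(
    "Row",
    "train_loss epoch step val_loss rouge1 rouge2 rougel rougelsum gen_len",
)

rows = [
    Row(0.4356, 1.0, 272, 1.1440, 67.1508, 38.9594, 36.6571, 64.7944, 712.0165),
    Row(0.3485, 2.0, 545, 1.2622, 66.9296, 38.7595, 36.4964, 64.6309, 808.4393),
    Row(0.2933, 3.0, 818, 1.3804, 65.3911, 38.0875, 35.5935, 63.1902, 976.0340),
    Row(0.2443, 4.0, 1091, 1.4998, 67.0929, 38.0763, 35.8704, 64.9022, 816.7253),
    Row(0.2033, 4.99, 1360, 1.5508, 64.1251, 36.9601, 34.8068, 61.9673, 1052.8817),
]

# Checkpoint with the lowest validation loss.
best = min(rows, key=lambda r: r.val_loss)

# Does training loss fall at every step while validation loss rises?
pairs = list(zip(rows, rows[1:]))
train_loss_falls = all(a.train_loss > b.train_loss for a, b in pairs)
val_loss_rises = all(a.val_loss < b.val_loss for a, b in pairs)

print(best.step, best.val_loss, best.rouge1)
print(train_loss_falls, val_loss_rises)
```

The selected row is step 272, whose metrics match the evaluation results reported at the top of the card. Training loss decreases at every logged epoch while validation loss increases, which is consistent with overfitting after the first epoch, although the card itself does not draw that conclusion.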

Framework versions

  • Transformers 4.34.1
  • PyTorch 2.1.0+cu121
  • Datasets 2.14.5
  • Tokenizers 0.14.1