cnn_dailymail-summarization-t5-small-2022-09-05

This model is a fine-tuned version of t5-small on the cnn_dailymail 3.0.0 dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6455
  • Rouge1: 41.4235
  • Rouge2: 19.0263
  • Rougel: 29.2892
  • Rougelsum: 38.6338
  • Gen Len: 73.7273
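
For readers unfamiliar with the metrics above, the following is a simplified sketch of how a ROUGE-N score can be computed. It assumes the common convention of reporting the F-measure scaled by 100 and uses plain whitespace tokenization; the scores in this card were presumably produced by a standard ROUGE package (with its own tokenization and stemming), so this sketch will not reproduce them exactly. The function names `ngrams` and `rouge_n` are illustrative only. Rougel and Rougelsum rely on the longest common subsequence and are not covered here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F-measure on a 0-100 scale.

    Assumption: lowercase whitespace tokenization, no stemming.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
rouge1 = rouge_n(reference, candidate, n=1)
rouge2 = rouge_n(reference, candidate, n=2)
```

In this toy pair, five of six unigrams and three of five bigrams overlap, which is why the bigram score is lower than the unigram score, the same pattern seen between Rouge1 and Rouge2 above.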

Model description

More information needed

Intended uses & limitations

More information needed
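
Until this section is filled in, the following minimal sketch shows one way the checkpoint might be used for summarization with the Transformers `pipeline` API. The helper names `load_summarizer` and `summarize` and the `max_length`/`min_length` values are assumptions chosen for illustration, not settings documented by the author; the model identifier is the repository name given on this page.

```python
MODEL_ID = "farleyknight/cnn_dailymail-summarization-t5-small-2022-09-05"


def load_summarizer(model_id=MODEL_ID):
    """Build a summarization pipeline (downloads the checkpoint on first use)."""
    # Imported lazily so the rest of this sketch works without transformers.
    from transformers import pipeline

    return pipeline("summarization", model=model_id)


def summarize(article, summarizer, max_length=128, min_length=16):
    """Return one summary string for one article.

    Assumption: the length limits are illustrative defaults, not the
    values used to produce the scores reported in this card.
    """
    result = summarizer(
        article,
        max_length=max_length,
        min_length=min_length,
        truncation=True,  # cut inputs longer than the model's limit
    )
    return result[0]["summary_text"].strip()
```

A typical call would be `summarize(article_text, load_summarizer())`. Passing the pipeline in as an argument lets it be loaded once and reused across many articles.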

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3.0
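
To make the `lr_scheduler_type: linear` setting concrete, here is a small sketch of a linear decay schedule starting from the listed learning rate of 5e-05. It assumes no warmup (none is listed above) and takes the total number of training steps as a parameter, since that number is not stated in the hyperparameters; `linear_lr` is a hypothetical helper, not a function from the training script.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay.

    Assumptions: warmup_steps defaults to 0 because the card lists no
    warmup; total_steps must be supplied by the caller.
    """
    if step < warmup_steps:
        # Optional linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Example with a hypothetical run of 100,000 steps.
start_lr = linear_lr(0, 100_000)
halfway_lr = linear_lr(50_000, 100_000)
final_lr = linear_lr(100_000, 100_000)
```

Under these assumptions the rate falls by the same amount at every step, so halfway through training it is half the initial value.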

Training results

Training Loss Epoch Step Gen Len Validation Loss Rouge1 Rouge2 Rougel Rougelsum
1.9623 0.03 1000 18.9996 1.7500 24.1039 11.368 19.813 22.671
1.8827 0.06 2000 18.9993 1.7382 24.1445 11.4497 19.8683 22.7376
1.8988 0.08 3000 18.9999 1.7310 24.329 11.5899 20.0409 22.9104
1.8778 0.11 4000 19.0 1.7177 24.3886 11.6472 20.1048 22.988
1.9173 0.14 5000 18.9996 1.7140 24.3508 11.5594 20.075 22.932
1.9009 0.17 6000 18.9995 1.7134 24.28 11.6075 20.0581 22.8833
1.8975 0.2 7000 18.9994 1.7081 24.3203 11.6175 20.035 22.9167
1.8835 0.22 8000 18.9996 1.7061 24.2729 11.6324 20.0728 22.8747
1.8725 0.25 9000 19.0 1.6995 24.2542 11.5763 20.0241 22.8713
1.837 0.28 10000 18.9997 1.6998 24.3321 11.599 20.1028 22.9562
1.8629 0.31 11000 19.0 1.6944 24.4161 11.6208 20.1374 23.024
1.85 0.33 12000 18.9990 1.7002 24.3514 11.6883 20.134 22.9515
1.8506 0.36 13000 18.9987 1.6894 24.3812 11.6592 20.1641 23.0108
1.8869 0.39 14000 18.9995 1.6881 24.3956 11.6817 20.1654 23.0284
1.8327 0.42 15000 18.9993 1.6903 24.3707 11.6446 20.1353 22.9801
1.8204 0.45 16000 18.9993 1.6896 24.3663 11.6963 20.1357 22.9898
1.8764 0.47 17000 18.9978 1.6846 24.4212 11.652 20.1455 23.0326
1.8213 0.5 18000 18.9992 1.6817 24.452 11.7014 20.1898 23.0668
1.8424 0.53 19000 18.9990 1.6844 24.4206 11.7049 20.1931 23.0358
1.8721 0.56 20000 18.9996 1.6814 24.4483 11.6789 20.1798 23.0508
1.87 0.59 21000 18.9996 1.6796 24.4799 11.6789 20.1919 23.0831
1.844 0.61 22000 18.9996 1.6770 24.4741 11.7433 20.2031 23.0535
1.8611 0.64 23000 18.9986 1.6785 24.4837 11.7572 20.219 23.088
1.8201 0.67 24000 18.9993 1.6796 24.3955 11.6978 20.173 23.0302
1.8506 0.7 25000 18.9995 1.6770 24.4084 11.711 20.1851 23.0266
1.846 0.72 26000 18.9990 1.6765 24.4272 11.6779 20.1785 23.0352
1.8431 0.75 27000 18.9998 1.6757 24.4484 11.7154 20.2156 23.0646
1.8208 0.78 28000 18.9993 1.6764 24.412 11.6887 20.1752 23.0151
1.8108 0.81 29000 18.9997 1.6733 24.4051 11.7155 20.1773 23.0215
1.847 0.84 30000 18.9994 1.6738 24.5531 11.7949 20.2834 23.1588
1.8386 0.86 31000 18.9991 1.6674 24.5155 11.7333 20.2529 23.145
1.82 0.89 32000 18.9988 1.6693 24.4498 11.7118 20.2183 23.0767
1.8475 0.92 33000 18.9993 1.6676 24.442 11.676 20.168 23.0409
1.7948 0.95 34000 18.9990 1.6689 24.4561 11.7865 20.2446 23.0707
1.8357 0.98 35000 18.9994 1.6757 24.4005 11.7299 20.1999 23.0093
1.8624 1.0 36000 18.9988 1.6745 24.3371 11.6749 20.1257 22.9428
1.8309 1.03 37000 18.9995 1.6675 24.5108 11.8038 20.2691 23.117
1.8237 1.06 38000 18.9996 1.6654 24.482 11.7485 20.2225 23.0917
1.7743 1.09 39000 18.9993 1.6681 24.5106 11.7511 20.2583 23.123
1.7811 1.11 40000 18.9991 1.6636 24.6194 11.843 20.3375 23.2259
1.7973 1.14 41000 19.0 1.6666 24.5434 11.8133 20.3033 23.165
1.8156 1.17 42000 18.9993 1.6660 24.4857 11.7526 20.2406 23.1081
1.8403 1.2 43000 18.9998 1.6621 24.4632 11.7525 20.2459 23.0692
1.8129 1.23 44000 18.9999 1.6643 24.6032 11.8251 20.3368 23.1806
1.7896 1.25 45000 18.9993 1.6622 24.4619 11.7769 20.2516 23.0647
1.7948 1.28 46000 18.9992 1.6608 24.5468 11.8041 20.2941 23.1551
1.8043 1.31 47000 18.9993 1.6614 24.5774 11.8246 20.3189 23.1836
1.7884 1.34 48000 18.9993 1.6581 24.5688 11.843 20.2993 23.1756
1.8041 1.37 49000 18.9996 1.6614 24.5454 11.8346 20.3179 23.1605
1.8192 1.39 50000 18.9998 1.6597 24.5017 11.7755 20.2439 23.1148
1.8679 1.42 51000 18.9995 1.6555 24.5302 11.7638 20.2592 23.1395
1.82 1.45 52000 18.9998 1.6571 24.546 11.7798 20.265 23.1408
1.8267 1.48 53000 18.9996 1.6552 24.5214 11.7368 20.2276 23.1504
1.8063 1.5 54000 18.9992 1.6588 24.5222 11.8209 20.2941 23.1551
1.8171 1.53 55000 18.9996 1.6569 24.5845 11.8182 20.3147 23.1812
1.7884 1.56 56000 18.9998 1.6597 24.532 11.8057 20.2622 23.1459
1.7588 1.59 57000 18.9994 1.6572 24.6532 11.8958 20.3877 23.2776
1.7847 1.62 58000 18.9996 1.6561 24.5483 11.856 20.3188 23.1852
1.8523 1.64 59000 18.9996 1.6584 24.5501 11.8666 20.3197 23.1683
1.7955 1.67 60000 18.9999 1.6546 24.5126 11.8043 20.2603 23.1175
1.8215 1.7 61000 18.9996 1.6541 24.5884 11.8003 20.2887 23.1866
1.7917 1.73 62000 18.9997 1.6568 24.619 11.8868 20.3496 23.2304
1.7543 1.76 63000 18.9996 1.6570 24.5378 11.8192 20.2681 23.1454
1.7978 1.78 64000 18.9999 1.6541 24.5719 11.8446 20.2873 23.1855
1.8228 1.81 65000 18.9998 1.6561 24.5193 11.8527 20.3185 23.1395
1.8163 1.84 66000 18.9998 1.6537 24.4385 11.7625 20.2042 23.0671
1.7868 1.87 67000 18.9998 1.6532 24.4985 11.8187 20.2775 23.1426
1.8345 1.89 68000 18.9999 1.6522 24.5375 11.8398 20.285 23.1643
1.7773 1.92 69000 18.9999 1.6529 24.4722 11.7979 20.2636 23.106
1.8409 1.95 70000 18.9999 1.6521 24.4845 11.8136 20.2557 23.1089
1.8146 1.98 71000 18.9999 1.6515 24.4923 11.7965 20.2521 23.1247
1.7466 2.01 72000 19.0 1.6526 24.4913 11.8254 20.2562 23.1266
1.8009 2.03 73000 19.0 1.6505 24.5231 11.8414 20.2842 23.1654
1.7768 2.06 74000 19.0 1.6516 24.5192 11.8206 20.2884 23.1493
1.7569 2.09 75000 19.0 1.6541 24.6135 11.9135 20.3513 23.2279
1.7893 2.12 76000 18.9997 1.6507 24.5934 11.8727 20.3305 23.2106
1.763 2.15 77000 18.9999 1.6512 24.5829 11.8543 20.3142 23.2049
1.7552 2.17 78000 18.9998 1.6506 24.5332 11.8309 20.2795 23.1654
1.7632 2.2 79000 18.9995 1.6498 24.5569 11.8313 20.3158 23.1808
1.8056 2.23 80000 18.9996 1.6488 24.6217 11.8877 20.3555 23.2514
1.8066 2.26 81000 18.9996 1.6494 24.5799 11.8515 20.3307 23.2059
1.7903 2.28 82000 18.9998 1.6487 24.6151 11.889 20.3739 23.2226
1.805 2.31 83000 18.9996 1.6493 24.5739 11.8659 20.3354 23.1884
1.7843 2.34 84000 18.9996 1.6487 24.6125 11.8879 20.3648 23.2274
1.8153 2.37 85000 18.9996 1.6493 24.5638 11.8392 20.3084 23.165
1.7581 2.4 86000 18.9996 1.6490 24.6121 11.8876 20.36 23.2163
1.6925 2.42 87000 18.9998 1.6502 24.6192 11.8992 20.3786 23.2421
1.7535 2.45 88000 18.9996 1.6473 24.6134 11.8877 20.3663 23.2262
1.751 2.48 89000 18.9996 1.6496 24.5728 11.8886 20.3411 23.1906
1.7577 2.51 90000 18.9996 1.6477 24.5616 11.8489 20.3021 23.1754
1.8 2.54 91000 18.9996 1.6473 24.5614 11.8663 20.3282 23.1868
1.7859 2.56 92000 18.9998 1.6483 24.5594 11.8426 20.3197 23.191
1.7984 2.59 93000 18.9998 1.6469 24.5732 11.8258 20.3204 23.1958
1.7943 2.62 94000 18.9996 1.6477 24.5888 11.8602 20.3352 23.2181
1.7888 2.65 95000 18.9996 1.6472 24.5781 11.844 20.3272 23.216
1.7803 2.67 96000 18.9996 1.6483 24.5454 11.8245 20.2917 23.1727
1.8106 2.7 97000 18.9996 1.6461 24.5694 11.8344 20.3123 23.1934
1.8713 2.73 98000 18.9996 1.6454 24.5906 11.8573 20.3447 23.2181
1.7655 2.76 99000 18.9996 1.6468 24.5709 11.8573 20.3139 23.1994
1.7616 2.79 100000 18.9998 1.6464 24.5852 11.8531 20.3172 23.2089
1.7581 2.81 101000 18.9997 1.6468 24.5748 11.8452 20.3043 23.1849
1.7743 2.84 102000 18.9996 1.6462 24.5665 11.8328 20.2992 23.1896
1.78 2.87 103000 18.9996 1.6458 24.5716 11.8399 20.31 23.1943
1.8162 2.9 104000 18.9996 1.6456 24.5719 11.8358 20.3132 23.1921
1.7862 2.93 105000 18.9996 1.6462 24.5938 11.8624 20.337 23.2131
1.7995 2.95 106000 18.9996 1.6459 24.5885 11.8606 20.3325 23.2137
1.7559 2.98 107000 18.9996 1.6454 24.593 11.861 20.3401 23.2188
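
The Epoch and Step columns in the table are related through the batch size. The sketch below reproduces that arithmetic under two assumptions not stated in this card: that the cnn_dailymail 3.0.0 training split contains 287,113 examples, and that training ran on a single device without gradient accumulation, so each step consumes exactly `train_batch_size` examples. The helper names are illustrative.

```python
import math

TRAIN_EXAMPLES = 287_113  # assumption: size of the cnn_dailymail 3.0.0 train split
BATCH_SIZE = 8            # train_batch_size from the hyperparameters


def steps_per_epoch(num_examples=TRAIN_EXAMPLES, batch_size=BATCH_SIZE):
    """Optimizer steps needed to see every training example once."""
    return math.ceil(num_examples / batch_size)


def epoch_at_step(step, num_examples=TRAIN_EXAMPLES, batch_size=BATCH_SIZE):
    """Fractional epoch reached after a given number of steps."""
    return step / steps_per_epoch(num_examples, batch_size)


# Compare against a few (step, epoch) pairs from the table.
checks = {step: round(epoch_at_step(step), 2) for step in (1000, 36000, 72000, 107000)}
```

With these assumptions the rounded values agree with the table (for example, step 36000 lands at epoch 1.0 and step 107000 at 2.98), which suggests one pass over the training data takes roughly 35,890 steps.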

Framework versions

  • Transformers 4.22.0.dev0
  • PyTorch 1.12.1+cu102
  • Datasets 2.4.0
  • Tokenizers 0.12.1

Dataset used to train farleyknight/cnn_dailymail-summarization-t5-small-2022-09-05