
my_awesome_opus_books_model

This model is a fine-tuned version of google-t5/t5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0129
  • Bleu: 100.0
  • Gen Len: 13.0
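
The card does not include usage code, so here is a minimal inference sketch. The repo id daemonkiller/my_awesome_opus_books_model comes from this page; the "translate English to French" task prefix is an assumption, since the training dataset is listed as unknown:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "daemonkiller/my_awesome_opus_books_model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# T5 checkpoints expect a task prefix. "translate English to French: "
# matches the usual opus_books fine-tuning recipe, but the language pair
# is an assumption; the card does not document the prompt format.
text = "translate English to French: The book is on the table."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```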

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 200
  • mixed_precision_training: Native AMP
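
Expressed as transformers Seq2SeqTrainingArguments, the configuration above looks roughly like the sketch below. The output_dir, evaluation strategy, and predict_with_generate values are assumptions not stated on the card; the Adam betas and epsilon listed above are the Trainer defaults, so they need no explicit arguments:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="my_awesome_opus_books_model",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    fp16=True,                    # mixed_precision_training: Native AMP
    evaluation_strategy="epoch",  # assumed; matches the per-epoch results below
    predict_with_generate=True,   # assumed; required to compute Bleu / Gen Len
)
```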

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:----:|:-------:|
| No log | 1.0 | 2 | 6.1851 | 0.0 | 12.0 |
| No log | 2.0 | 4 | 6.0627 | 0.0 | 12.0 |
| No log | 3.0 | 6 | 6.0627 | 0.0 | 12.0 |
| No log | 4.0 | 8 | 5.8390 | 0.0 | 12.0 |
| No log | 5.0 | 10 | 5.6596 | 0.0 | 12.0 |
| No log | 6.0 | 12 | 5.4424 | 0.0 | 12.0 |
| No log | 7.0 | 14 | 5.2815 | 0.0 | 12.0 |
| No log | 8.0 | 16 | 5.1388 | 0.0 | 12.0 |
| No log | 9.0 | 18 | 4.9987 | 0.0 | 12.0 |
| No log | 10.0 | 20 | 4.8170 | 0.0 | 12.0 |
| No log | 11.0 | 22 | 4.6353 | 0.0 | 12.0 |
| No log | 12.0 | 24 | 4.4854 | 0.0 | 12.0 |
| No log | 13.0 | 26 | 4.3039 | 0.0 | 12.0 |
| No log | 14.0 | 28 | 4.1131 | 0.0 | 12.0 |
| No log | 15.0 | 30 | 3.9509 | 0.0 | 12.0 |
| No log | 16.0 | 32 | 3.7972 | 0.0 | 12.0 |
| No log | 17.0 | 34 | 3.6481 | 5.5224 | 15.0 |
| No log | 18.0 | 36 | 3.5111 | 5.5224 | 15.0 |
| No log | 19.0 | 38 | 3.3768 | 5.5224 | 15.0 |
| No log | 20.0 | 40 | 3.2433 | 5.5224 | 15.0 |
| No log | 21.0 | 42 | 3.1126 | 5.5224 | 15.0 |
| No log | 22.0 | 44 | 3.0030 | 5.5224 | 15.0 |
| No log | 23.0 | 46 | 2.8871 | 5.5224 | 15.0 |
| No log | 24.0 | 48 | 2.7639 | 5.5224 | 15.0 |
| No log | 25.0 | 50 | 2.6478 | 5.5224 | 15.0 |
| No log | 26.0 | 52 | 2.5302 | 5.5224 | 15.0 |
| No log | 27.0 | 54 | 2.4243 | 5.5224 | 15.0 |
| No log | 28.0 | 56 | 2.3275 | 5.5224 | 15.0 |
| No log | 29.0 | 58 | 2.2400 | 5.5224 | 15.0 |
| No log | 30.0 | 60 | 2.1625 | 5.5224 | 15.0 |
| No log | 31.0 | 62 | 2.0853 | 5.5224 | 15.0 |
| No log | 32.0 | 64 | 2.0021 | 5.5224 | 14.0 |
| No log | 33.0 | 66 | 1.9144 | 5.5224 | 14.0 |
| No log | 34.0 | 68 | 1.8281 | 5.5224 | 14.0 |
| No log | 35.0 | 70 | 1.7493 | 5.5224 | 14.0 |
| No log | 36.0 | 72 | 1.6698 | 5.5224 | 14.0 |
| No log | 37.0 | 74 | 1.5966 | 5.5224 | 14.0 |
| No log | 38.0 | 76 | 1.5277 | 5.5224 | 14.0 |
| No log | 39.0 | 78 | 1.4569 | 5.5224 | 14.0 |
| No log | 40.0 | 80 | 1.3870 | 5.5224 | 14.0 |
| No log | 41.0 | 82 | 1.3169 | 6.5673 | 12.0 |
| No log | 42.0 | 84 | 1.2468 | 6.5673 | 12.0 |
| No log | 43.0 | 86 | 1.1823 | 6.5673 | 12.0 |
| No log | 44.0 | 88 | 1.1232 | 6.5673 | 12.0 |
| No log | 45.0 | 90 | 1.0667 | 6.5673 | 12.0 |
| No log | 46.0 | 92 | 1.0127 | 6.5673 | 12.0 |
| No log | 47.0 | 94 | 0.9854 | 6.5673 | 12.0 |
| No log | 48.0 | 96 | 0.9303 | 6.5673 | 12.0 |
| No log | 49.0 | 98 | 0.8819 | 0.0 | 19.0 |
| No log | 50.0 | 100 | 0.8386 | 0.0 | 19.0 |
| No log | 51.0 | 102 | 0.7923 | 0.0 | 19.0 |
| No log | 52.0 | 104 | 0.7454 | 0.0 | 19.0 |
| No log | 53.0 | 106 | 0.7012 | 100.0 | 13.0 |
| No log | 54.0 | 108 | 0.6630 | 100.0 | 13.0 |
| No log | 55.0 | 110 | 0.6287 | 100.0 | 13.0 |
| No log | 56.0 | 112 | 0.5939 | 100.0 | 13.0 |
| No log | 57.0 | 114 | 0.5608 | 100.0 | 13.0 |
| No log | 58.0 | 116 | 0.5308 | 100.0 | 13.0 |
| No log | 59.0 | 118 | 0.5019 | 100.0 | 13.0 |
| No log | 60.0 | 120 | 0.4757 | 100.0 | 13.0 |
| No log | 61.0 | 122 | 0.4503 | 100.0 | 13.0 |
| No log | 62.0 | 124 | 0.4254 | 100.0 | 13.0 |
| No log | 63.0 | 126 | 0.4007 | 100.0 | 13.0 |
| No log | 64.0 | 128 | 0.3801 | 100.0 | 13.0 |
| No log | 65.0 | 130 | 0.3607 | 100.0 | 13.0 |
| No log | 66.0 | 132 | 0.3438 | 100.0 | 13.0 |
| No log | 67.0 | 134 | 0.3276 | 100.0 | 13.0 |
| No log | 68.0 | 136 | 0.3132 | 100.0 | 13.0 |
| No log | 69.0 | 138 | 0.3000 | 100.0 | 13.0 |
| No log | 70.0 | 140 | 0.2872 | 100.0 | 13.0 |
| No log | 71.0 | 142 | 0.2747 | 100.0 | 13.0 |
| No log | 72.0 | 144 | 0.2633 | 100.0 | 13.0 |
| No log | 73.0 | 146 | 0.2537 | 100.0 | 13.0 |
| No log | 74.0 | 148 | 0.2453 | 100.0 | 13.0 |
| No log | 75.0 | 150 | 0.2377 | 100.0 | 13.0 |
| No log | 76.0 | 152 | 0.2303 | 100.0 | 13.0 |
| No log | 77.0 | 154 | 0.2222 | 100.0 | 13.0 |
| No log | 78.0 | 156 | 0.2141 | 100.0 | 13.0 |
| No log | 79.0 | 158 | 0.2066 | 100.0 | 13.0 |
| No log | 80.0 | 160 | 0.1987 | 100.0 | 13.0 |
| No log | 81.0 | 162 | 0.1919 | 100.0 | 13.0 |
| No log | 82.0 | 164 | 0.1857 | 100.0 | 13.0 |
| No log | 83.0 | 166 | 0.1798 | 100.0 | 13.0 |
| No log | 84.0 | 168 | 0.1742 | 100.0 | 13.0 |
| No log | 85.0 | 170 | 0.1687 | 100.0 | 13.0 |
| No log | 86.0 | 172 | 0.1633 | 100.0 | 13.0 |
| No log | 87.0 | 174 | 0.1577 | 100.0 | 13.0 |
| No log | 88.0 | 176 | 0.1526 | 100.0 | 13.0 |
| No log | 89.0 | 178 | 0.1477 | 100.0 | 13.0 |
| No log | 90.0 | 180 | 0.1429 | 100.0 | 13.0 |
| No log | 91.0 | 182 | 0.1380 | 100.0 | 13.0 |
| No log | 92.0 | 184 | 0.1334 | 100.0 | 13.0 |
| No log | 93.0 | 186 | 0.1281 | 100.0 | 13.0 |
| No log | 94.0 | 188 | 0.1230 | 100.0 | 13.0 |
| No log | 95.0 | 190 | 0.1180 | 100.0 | 13.0 |
| No log | 96.0 | 192 | 0.1136 | 100.0 | 13.0 |
| No log | 97.0 | 194 | 0.1093 | 100.0 | 13.0 |
| No log | 98.0 | 196 | 0.1050 | 100.0 | 13.0 |
| No log | 99.0 | 198 | 0.1013 | 100.0 | 13.0 |
| No log | 100.0 | 200 | 0.0979 | 100.0 | 13.0 |
| No log | 101.0 | 202 | 0.0953 | 100.0 | 13.0 |
| No log | 102.0 | 204 | 0.0931 | 100.0 | 13.0 |
| No log | 103.0 | 206 | 0.0907 | 100.0 | 13.0 |
| No log | 104.0 | 208 | 0.0887 | 100.0 | 13.0 |
| No log | 105.0 | 210 | 0.0866 | 100.0 | 13.0 |
| No log | 106.0 | 212 | 0.0844 | 100.0 | 13.0 |
| No log | 107.0 | 214 | 0.0822 | 100.0 | 13.0 |
| No log | 108.0 | 216 | 0.0795 | 100.0 | 13.0 |
| No log | 109.0 | 218 | 0.0768 | 100.0 | 13.0 |
| No log | 110.0 | 220 | 0.0743 | 100.0 | 13.0 |
| No log | 111.0 | 222 | 0.0715 | 100.0 | 13.0 |
| No log | 112.0 | 224 | 0.0687 | 100.0 | 13.0 |
| No log | 113.0 | 226 | 0.0663 | 100.0 | 13.0 |
| No log | 114.0 | 228 | 0.0641 | 100.0 | 13.0 |
| No log | 115.0 | 230 | 0.0620 | 100.0 | 13.0 |
| No log | 116.0 | 232 | 0.0598 | 100.0 | 13.0 |
| No log | 117.0 | 234 | 0.0577 | 100.0 | 13.0 |
| No log | 118.0 | 236 | 0.0557 | 100.0 | 13.0 |
| No log | 119.0 | 238 | 0.0541 | 100.0 | 13.0 |
| No log | 120.0 | 240 | 0.0523 | 100.0 | 13.0 |
| No log | 121.0 | 242 | 0.0506 | 100.0 | 13.0 |
| No log | 122.0 | 244 | 0.0489 | 100.0 | 13.0 |
| No log | 123.0 | 246 | 0.0472 | 100.0 | 13.0 |
| No log | 124.0 | 248 | 0.0456 | 100.0 | 13.0 |
| No log | 125.0 | 250 | 0.0441 | 100.0 | 13.0 |
| No log | 126.0 | 252 | 0.0429 | 100.0 | 13.0 |
| No log | 127.0 | 254 | 0.0416 | 100.0 | 13.0 |
| No log | 128.0 | 256 | 0.0405 | 100.0 | 13.0 |
| No log | 129.0 | 258 | 0.0393 | 100.0 | 13.0 |
| No log | 130.0 | 260 | 0.0382 | 100.0 | 13.0 |
| No log | 131.0 | 262 | 0.0370 | 100.0 | 13.0 |
| No log | 132.0 | 264 | 0.0357 | 100.0 | 13.0 |
| No log | 133.0 | 266 | 0.0345 | 100.0 | 13.0 |
| No log | 134.0 | 268 | 0.0332 | 100.0 | 13.0 |
| No log | 135.0 | 270 | 0.0322 | 100.0 | 13.0 |
| No log | 136.0 | 272 | 0.0311 | 100.0 | 13.0 |
| No log | 137.0 | 274 | 0.0303 | 100.0 | 13.0 |
| No log | 138.0 | 276 | 0.0295 | 100.0 | 13.0 |
| No log | 139.0 | 278 | 0.0288 | 100.0 | 13.0 |
| No log | 140.0 | 280 | 0.0282 | 100.0 | 13.0 |
| No log | 141.0 | 282 | 0.0275 | 100.0 | 13.0 |
| No log | 142.0 | 284 | 0.0267 | 100.0 | 13.0 |
| No log | 143.0 | 286 | 0.0261 | 100.0 | 13.0 |
| No log | 144.0 | 288 | 0.0254 | 100.0 | 13.0 |
| No log | 145.0 | 290 | 0.0249 | 100.0 | 13.0 |
| No log | 146.0 | 292 | 0.0243 | 100.0 | 13.0 |
| No log | 147.0 | 294 | 0.0238 | 100.0 | 13.0 |
| No log | 148.0 | 296 | 0.0233 | 100.0 | 13.0 |
| No log | 149.0 | 298 | 0.0229 | 100.0 | 13.0 |
| No log | 150.0 | 300 | 0.0225 | 100.0 | 13.0 |
| No log | 151.0 | 302 | 0.0222 | 100.0 | 13.0 |
| No log | 152.0 | 304 | 0.0218 | 100.0 | 13.0 |
| No log | 153.0 | 306 | 0.0215 | 100.0 | 13.0 |
| No log | 154.0 | 308 | 0.0212 | 100.0 | 13.0 |
| No log | 155.0 | 310 | 0.0210 | 100.0 | 13.0 |
| No log | 156.0 | 312 | 0.0208 | 100.0 | 13.0 |
| No log | 157.0 | 314 | 0.0205 | 100.0 | 13.0 |
| No log | 158.0 | 316 | 0.0202 | 100.0 | 13.0 |
| No log | 159.0 | 318 | 0.0200 | 100.0 | 13.0 |
| No log | 160.0 | 320 | 0.0197 | 100.0 | 13.0 |
| No log | 161.0 | 322 | 0.0194 | 100.0 | 13.0 |
| No log | 162.0 | 324 | 0.0191 | 100.0 | 13.0 |
| No log | 163.0 | 326 | 0.0188 | 100.0 | 13.0 |
| No log | 164.0 | 328 | 0.0185 | 100.0 | 13.0 |
| No log | 165.0 | 330 | 0.0181 | 100.0 | 13.0 |
| No log | 166.0 | 332 | 0.0178 | 100.0 | 13.0 |
| No log | 167.0 | 334 | 0.0175 | 100.0 | 13.0 |
| No log | 168.0 | 336 | 0.0171 | 100.0 | 13.0 |
| No log | 169.0 | 338 | 0.0168 | 100.0 | 13.0 |
| No log | 170.0 | 340 | 0.0166 | 100.0 | 13.0 |
| No log | 171.0 | 342 | 0.0162 | 100.0 | 13.0 |
| No log | 172.0 | 344 | 0.0162 | 100.0 | 13.0 |
| No log | 173.0 | 346 | 0.0159 | 100.0 | 13.0 |
| No log | 174.0 | 348 | 0.0158 | 100.0 | 13.0 |
| No log | 175.0 | 350 | 0.0156 | 100.0 | 13.0 |
| No log | 176.0 | 352 | 0.0154 | 100.0 | 13.0 |
| No log | 177.0 | 354 | 0.0152 | 100.0 | 13.0 |
| No log | 178.0 | 356 | 0.0151 | 100.0 | 13.0 |
| No log | 179.0 | 358 | 0.0149 | 100.0 | 13.0 |
| No log | 180.0 | 360 | 0.0147 | 100.0 | 13.0 |
| No log | 181.0 | 362 | 0.0146 | 100.0 | 13.0 |
| No log | 182.0 | 364 | 0.0144 | 100.0 | 13.0 |
| No log | 183.0 | 366 | 0.0144 | 100.0 | 13.0 |
| No log | 184.0 | 368 | 0.0142 | 100.0 | 13.0 |
| No log | 185.0 | 370 | 0.0141 | 100.0 | 13.0 |
| No log | 186.0 | 372 | 0.0140 | 100.0 | 13.0 |
| No log | 187.0 | 374 | 0.0139 | 100.0 | 13.0 |
| No log | 188.0 | 376 | 0.0137 | 100.0 | 13.0 |
| No log | 189.0 | 378 | 0.0136 | 100.0 | 13.0 |
| No log | 190.0 | 380 | 0.0135 | 100.0 | 13.0 |
| No log | 191.0 | 382 | 0.0134 | 100.0 | 13.0 |
| No log | 192.0 | 384 | 0.0134 | 100.0 | 13.0 |
| No log | 193.0 | 386 | 0.0133 | 100.0 | 13.0 |
| No log | 194.0 | 388 | 0.0132 | 100.0 | 13.0 |
| No log | 195.0 | 390 | 0.0132 | 100.0 | 13.0 |
| No log | 196.0 | 392 | 0.0131 | 100.0 | 13.0 |
| No log | 197.0 | 394 | 0.0131 | 100.0 | 13.0 |
| No log | 198.0 | 396 | 0.0130 | 100.0 | 13.0 |
| No log | 199.0 | 398 | 0.0130 | 100.0 | 13.0 |
| No log | 200.0 | 400 | 0.0129 | 100.0 | 13.0 |
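
The Bleu and Gen Len columns come from a compute_metrics hook passed to the trainer. The card does not include that code; the following is a hedged sketch of the common sacrebleu recipe for T5 translation fine-tunes (the use of the evaluate library and the base-model tokenizer are assumptions):

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

# Assumptions: sacrebleu via the evaluate library and the base-model
# tokenizer; the card does not document its metric code.
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-small")
metric = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    # -100 marks ignored label positions; swap in the pad token before decoding
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = metric.compute(predictions=decoded_preds,
                            references=[[label] for label in decoded_labels])
    # gen_len is the mean count of non-pad tokens in the generated ids
    pred_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
    return {"bleu": result["score"], "gen_len": np.mean(pred_lens)}
```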

Framework versions

  • Transformers 4.36.1
  • Pytorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0