mt5-base-finetuned-novel-chinese-to-spanish
This model is a fine-tuned version of quickman/mt5-base-finetuned-chinese-to-spanish-finetuned-chinese-to-spanish. The training dataset is not specified (the card's metadata records it as None). It achieves the following results on the evaluation set:
- Loss: 1.3193
- Score: 0.0000
- Counts: [545, 246, 135, 80]
- Totals: [777, 713, 649, 585]
- Precisions: [70.14157014157014, 34.50210378681627, 20.801232665639446, 13.675213675213675]
- Bp: 0.0000
- Sys Len: 777
- Ref Len: 17012
- Bleu: 0.0000
- Gen Len: 19.0
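The zero BLEU score above appears to come from the brevity penalty rather than from the n-gram precisions. The following is a minimal sketch, assuming the standard BLEU definition (a brevity penalty of exp(1 - ref_len / sys_len) when the system output is shorter than the reference, multiplied by the geometric mean of the four n-gram precisions). The variable names are illustrative and not taken from the evaluation script, which the card does not include.

```python
import math

# Values copied from the evaluation results listed above.
precisions = [70.14157014157014, 34.50210378681627,
              20.801232665639446, 13.675213675213675]
sys_len = 777     # total length of the generated translations
ref_len = 17012   # total length of the reference translations

# Standard BLEU brevity penalty (assumed): 1.0 when the output is at least
# as long as the reference, otherwise exp(1 - ref_len / sys_len).
brevity_penalty = 1.0 if sys_len >= ref_len else math.exp(1 - ref_len / sys_len)

# Geometric mean of the 1- to 4-gram precisions (already on a 0-100 scale).
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))

bleu = brevity_penalty * geo_mean

print(f"brevity penalty: {brevity_penalty:.3e}")
print(f"geometric mean of precisions: {geo_mean:.2f}")
print(f"BLEU: {bleu:.4f}")
```

Under these assumptions, the generated text is roughly 22 times shorter than the reference (777 versus 17012 tokens), so the brevity penalty drives the score to 0.0000 even though the precisions themselves are non-trivial. A Gen Len of exactly 19.0 at every checkpoint may indicate that generation was capped at a short maximum length during evaluation, but the card does not confirm this.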
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 40
- training_steps: 10000
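To make the scheduler settings above concrete, here is a small sketch of the learning rate at a given step. It assumes that the `linear` scheduler type means linear warmup from 0 to the base rate over the warmup steps, followed by linear decay to 0 at the final training step, as in the Transformers linear schedule with warmup. The function name `linear_schedule_lr` is hypothetical and only for illustration.

```python
def linear_schedule_lr(step, base_lr=2e-05, warmup_steps=40, total_steps=10000):
    """Learning rate at `step` under linear warmup followed by linear decay.

    Defaults mirror the hyperparameters listed above; the function itself
    is an illustrative assumption, not code from the training run.
    """
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 20, 40, 5000, 10000):
    print(step, linear_schedule_lr(step))
```

With 40 warmup steps out of 10000, the warmup phase covers only the first 0.4% of training, so the run spends almost all of its steps in the decay phase.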
Training results
Training Loss | Epoch | Step | Validation Loss | Score | Counts | Totals | Precisions | Bp | Sys Len | Ref Len | Bleu | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2.7861 | 0.6 | 500 | 1.9548 | 0.0000 | [465, 147, 51, 23] | [754, 690, 626, 562] | [61.6710875331565, 21.304347826086957, 8.146964856230031, 4.092526690391459] | 0.0000 | 754 | 17012 | 0.0000 | 19.0 |
2.5103 | 1.19 | 1000 | 1.7626 | 0.0000 | [491, 174, 62, 24] | [770, 706, 642, 578] | [63.76623376623377, 24.64589235127479, 9.657320872274143, 4.1522491349480966] | 0.0000 | 770 | 17012 | 0.0000 | 19.0 |
2.3148 | 1.79 | 1500 | 1.6428 | 0.0000 | [499, 181, 73, 35] | [781, 717, 653, 589] | [63.892445582586426, 25.24407252440725, 11.179173047473201, 5.942275042444821] | 0.0000 | 781 | 17012 | 0.0000 | 19.0 |
2.17 | 2.39 | 2000 | 1.5580 | 0.0000 | [524, 201, 90, 44] | [784, 720, 656, 592] | [66.83673469387755, 27.916666666666668, 13.71951219512195, 7.4324324324324325] | 0.0000 | 784 | 17012 | 0.0000 | 19.0 |
2.0889 | 2.99 | 2500 | 1.5197 | 0.0000 | [529, 214, 102, 55] | [781, 717, 653, 589] | [67.73367477592829, 29.846582984658298, 15.620214395099541, 9.33786078098472] | 0.0000 | 781 | 17012 | 0.0000 | 19.0 |
2.009 | 3.58 | 3000 | 1.4945 | 0.0000 | [527, 217, 103, 59] | [789, 725, 661, 597] | [66.7934093789607, 29.93103448275862, 15.582450832072617, 9.882747068676716] | 0.0000 | 789 | 17012 | 0.0000 | 19.0 |
1.9494 | 4.18 | 3500 | 1.4647 | 0.0000 | [518, 214, 105, 60] | [774, 710, 646, 582] | [66.9250645994832, 30.140845070422536, 16.25386996904025, 10.309278350515465] | 0.0000 | 774 | 17012 | 0.0000 | 19.0 |
1.9289 | 4.78 | 4000 | 1.4282 | 0.0000 | [539, 234, 116, 66] | [781, 717, 653, 589] | [69.01408450704226, 32.63598326359833, 17.76416539050536, 11.205432937181664] | 0.0000 | 781 | 17012 | 0.0000 | 19.0 |
1.8661 | 5.38 | 4500 | 1.4049 | 0.0000 | [520, 217, 117, 74] | [763, 699, 635, 571] | [68.15203145478375, 31.044349070100143, 18.4251968503937, 12.959719789842381] | 0.0000 | 763 | 17012 | 0.0000 | 19.0 |
1.8417 | 5.97 | 5000 | 1.3815 | 0.0000 | [536, 235, 119, 71] | [774, 710, 646, 582] | [69.25064599483204, 33.098591549295776, 18.42105263157895, 12.199312714776632] | 0.0000 | 774 | 17012 | 0.0000 | 19.0 |
1.8094 | 6.57 | 5500 | 1.3651 | 0.0000 | [528, 226, 117, 68] | [765, 701, 637, 573] | [69.01960784313725, 32.23965763195435, 18.367346938775512, 11.8673647469459] | 0.0000 | 765 | 17012 | 0.0000 | 19.0 |
1.811 | 7.17 | 6000 | 1.3629 | 0.0000 | [526, 225, 119, 69] | [768, 704, 640, 576] | [68.48958333333333, 31.960227272727273, 18.59375, 11.979166666666666] | 0.0000 | 768 | 17012 | 0.0000 | 19.0 |
1.7635 | 7.77 | 6500 | 1.3451 | 0.0000 | [529, 230, 124, 72] | [765, 701, 637, 573] | [69.15032679738562, 32.810271041369475, 19.46624803767661, 12.565445026178011] | 0.0000 | 765 | 17012 | 0.0000 | 19.0 |
1.7782 | 8.36 | 7000 | 1.3376 | 0.0000 | [530, 240, 132, 79] | [771, 707, 643, 579] | [68.74189364461738, 33.946251768033946, 20.52877138413686, 13.644214162348877] | 0.0000 | 771 | 17012 | 0.0000 | 19.0 |
1.7528 | 8.96 | 7500 | 1.3305 | 0.0000 | [543, 242, 129, 78] | [779, 715, 651, 587] | [69.70474967907573, 33.84615384615385, 19.81566820276498, 13.287904599659285] | 0.0000 | 779 | 17012 | 0.0000 | 19.0 |
1.7365 | 9.56 | 8000 | 1.3273 | 0.0000 | [532, 232, 123, 73] | [770, 706, 642, 578] | [69.0909090909091, 32.861189801699716, 19.1588785046729, 12.629757785467127] | 0.0000 | 770 | 17012 | 0.0000 | 19.0 |
1.7212 | 10.16 | 8500 | 1.3247 | 0.0000 | [544, 245, 136, 80] | [777, 713, 649, 585] | [70.01287001287001, 34.36185133239832, 20.955315870570107, 13.675213675213675] | 0.0000 | 777 | 17012 | 0.0000 | 19.0 |
1.7027 | 10.75 | 9000 | 1.3229 | 0.0000 | [548, 244, 131, 77] | [776, 712, 648, 584] | [70.61855670103093, 34.26966292134831, 20.21604938271605, 13.184931506849315] | 0.0000 | 776 | 17012 | 0.0000 | 19.0 |
1.702 | 11.35 | 9500 | 1.3198 | 0.0000 | [544, 247, 137, 82] | [774, 710, 646, 582] | [70.2842377260982, 34.7887323943662, 21.207430340557277, 14.0893470790378] | 0.0000 | 774 | 17012 | 0.0000 | 19.0 |
1.7258 | 11.95 | 10000 | 1.3193 | 0.0000 | [545, 246, 135, 80] | [777, 713, 649, 585] | [70.14157014157014, 34.50210378681627, 20.801232665639446, 13.675213675213675] | 0.0000 | 777 | 17012 | 0.0000 | 19.0 |
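The Epoch and Step columns also allow a rough estimate of the training set size, which the card otherwise leaves unspecified. The arithmetic below is a sketch under stated assumptions: a single device, no gradient accumulation, and an epoch value rounded to two decimals. The result is therefore an approximation, not a reported figure.

```python
# Final row of the training results table.
final_step = 10000
final_epoch = 11.95

# From the hyperparameter list.
train_batch_size = 4

# Optimizer steps needed to pass over the training set once.
steps_per_epoch = final_step / final_epoch

# Assumes one device and no gradient accumulation, so each step
# consumes exactly train_batch_size examples.
approx_train_examples = steps_per_epoch * train_batch_size

print(f"steps per epoch: {steps_per_epoch:.1f}")
print(f"approximate training examples: {approx_train_examples:.0f}")
```

If training used several devices or gradient accumulation, the estimate scales up by the corresponding factor.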
Framework versions
- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.11.0
- Tokenizers 0.13.3