# mt5-base-finetuned-novel-chinese-to-spanish-v1
This model is a fine-tuned version of quickman/mt5-base-finetuned-chinese-to-spanish on an unspecified dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the metrics):
- Loss: 1.2288
- Score: 0.0063
- Counts: [609, 331, 205, 120]
- Totals: [838, 774, 710, 646]
- Precisions: [72.67303102625299, 42.76485788113695, 28.87323943661972, 18.575851393188856]
- Bp: 0.0002
- Sys Len: 838
- Ref Len: 8089
- Bleu: 0.0063
- Gen Len: 19.0

Note that the near-zero BLEU is driven almost entirely by the brevity penalty: outputs average only 19 generated tokens (Sys Len 838 vs. Ref Len 8089), so Bp = exp(1 - 8089/838) ≈ 0.0002 even though the n-gram precisions themselves are reasonable. This suggests the evaluation generation length was capped well below the reference length.
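A minimal inference sketch with 🤗 Transformers is shown below. The repository id, the example sentence, and the generation settings are assumptions for illustration; the card does not state whether a task prefix is required.

```python
# Minimal inference sketch. The repo id and generation settings are
# assumptions for illustration, not confirmed by this card.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "quickman/mt5-base-finetuned-novel-chinese-to-spanish-v1"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "他慢慢地走进了那座古老的图书馆。"  # "He slowly walked into the old library."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```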
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the sketch after this list for an approximate equivalent configuration):
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 40
- training_steps: 20000
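As a rough guide, the listed values map onto `Seq2SeqTrainingArguments` as sketched below. The output directory, evaluation cadence, and `predict_with_generate` flag are assumptions inferred from the results table, not stated in the card.

```python
# Approximate training configuration matching the hyperparameters above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-base-finetuned-novel-chinese-to-spanish-v1",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=40,
    max_steps=20000,
    evaluation_strategy="steps",  # assumed: the table reports metrics every 500 steps
    eval_steps=500,               # assumed from the training-results table
    predict_with_generate=True,   # assumed, needed to compute BLEU during eval
)
```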
### Training results
Training Loss | Epoch | Step | Validation Loss | Score | Counts | Totals | Precisions | Bp | Sys Len | Ref Len | Bleu | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|---|
2.7093 | 0.28 | 500 | 1.9080 | 0.0035 | [510, 185, 91, 37] | [848, 784, 720, 656] | [60.14150943396226, 23.596938775510203, 12.63888888888889, 5.640243902439025] | 0.0002 | 848 | 8089 | 0.0035 | 19.0 |
2.4994 | 0.55 | 1000 | 1.7520 | 0.0036 | [524, 199, 100, 46] | [842, 778, 714, 650] | [62.23277909738717, 25.57840616966581, 14.005602240896359, 7.076923076923077] | 0.0002 | 842 | 8089 | 0.0036 | 19.0 |
2.3427 | 0.83 | 1500 | 1.6632 | 0.0040 | [530, 212, 109, 53] | [844, 780, 716, 652] | [62.796208530805686, 27.17948717948718, 15.223463687150838, 8.128834355828221] | 0.0002 | 844 | 8089 | 0.0040 | 19.0 |
2.211 | 1.1 | 2000 | 1.5980 | 0.0050 | [548, 230, 123, 66] | [855, 791, 727, 663] | [64.09356725146199, 29.077117572692792, 16.91884456671252, 9.95475113122172] | 0.0002 | 855 | 8089 | 0.0050 | 19.0 |
2.1536 | 1.38 | 2500 | 1.5442 | 0.0053 | [552, 239, 137, 77] | [852, 788, 724, 660] | [64.78873239436619, 30.32994923857868, 18.92265193370166, 11.666666666666666] | 0.0002 | 852 | 8089 | 0.0053 | 19.0 |
2.079 | 1.66 | 3000 | 1.5088 | 0.0055 | [551, 244, 142, 84] | [854, 790, 726, 662] | [64.51990632318501, 30.88607594936709, 19.55922865013774, 12.688821752265861] | 0.0002 | 854 | 8089 | 0.0055 | 19.0 |
2.0374 | 1.93 | 3500 | 1.4768 | 0.0054 | [557, 259, 149, 83] | [849, 785, 721, 657] | [65.60659599528857, 32.99363057324841, 20.665742024965326, 12.633181126331811] | 0.0002 | 849 | 8089 | 0.0054 | 19.0 |
2.0064 | 2.21 | 4000 | 1.4418 | 0.0054 | [559, 266, 157, 91] | [844, 780, 716, 652] | [66.23222748815166, 34.1025641025641, 21.92737430167598, 13.957055214723926] | 0.0002 | 844 | 8089 | 0.0054 | 19.0 |
1.9536 | 2.48 | 4500 | 1.4194 | 0.0056 | [557, 260, 157, 87] | [849, 785, 721, 657] | [65.60659599528857, 33.12101910828026, 21.7753120665742, 13.242009132420092] | 0.0002 | 849 | 8089 | 0.0056 | 19.0 |
1.9436 | 2.76 | 5000 | 1.4030 | 0.0051 | [561, 262, 151, 85] | [841, 777, 713, 649] | [66.70630202140309, 33.71943371943372, 21.1781206171108, 13.097072419106317] | 0.0002 | 841 | 8089 | 0.0051 | 19.0 |
1.8939 | 3.04 | 5500 | 1.3826 | 0.0059 | [568, 277, 169, 99] | [848, 784, 720, 656] | [66.98113207547169, 35.33163265306123, 23.47222222222222, 15.091463414634147] | 0.0002 | 848 | 8089 | 0.0059 | 19.0 |
1.8497 | 3.31 | 6000 | 1.3649 | 0.0059 | [576, 288, 180, 107] | [843, 779, 715, 651] | [68.32740213523131, 36.97047496790757, 25.174825174825173, 16.43625192012289] | 0.0002 | 843 | 8089 | 0.0059 | 19.0 |
1.8177 | 3.59 | 6500 | 1.3575 | 0.0060 | [585, 285, 173, 98] | [847, 783, 719, 655] | [69.06729634002362, 36.39846743295019, 24.061196105702365, 14.961832061068701] | 0.0002 | 847 | 8089 | 0.0060 | 19.0 |
1.8368 | 3.86 | 7000 | 1.3428 | 0.0061 | [583, 285, 171, 95] | [851, 787, 723, 659] | [68.50763807285547, 36.213468869123254, 23.651452282157678, 14.41578148710167] | 0.0002 | 851 | 8089 | 0.0061 | 19.0 |
1.7906 | 4.14 | 7500 | 1.3295 | 0.0059 | [581, 284, 167, 88] | [850, 786, 722, 658] | [68.3529411764706, 36.1323155216285, 23.130193905817176, 13.373860182370821] | 0.0002 | 850 | 8089 | 0.0059 | 19.0 |
1.766 | 4.42 | 8000 | 1.3204 | 0.0057 | [575, 279, 161, 89] | [848, 784, 720, 656] | [67.80660377358491, 35.58673469387755, 22.36111111111111, 13.567073170731707] | 0.0002 | 848 | 8089 | 0.0057 | 19.0 |
1.7615 | 4.69 | 8500 | 1.3124 | 0.0061 | [590, 293, 176, 100] | [848, 784, 720, 656] | [69.5754716981132, 37.37244897959184, 24.444444444444443, 15.24390243902439] | 0.0002 | 848 | 8089 | 0.0061 | 19.0 |
1.7741 | 4.97 | 9000 | 1.3057 | 0.0062 | [590, 298, 180, 105] | [846, 782, 718, 654] | [69.73995271867612, 38.107416879795394, 25.069637883008358, 16.05504587155963] | 0.0002 | 846 | 8089 | 0.0062 | 19.0 |
1.7266 | 5.24 | 9500 | 1.2969 | 0.0062 | [592, 304, 182, 104] | [846, 782, 718, 654] | [69.97635933806147, 38.87468030690537, 25.348189415041784, 15.902140672782874] | 0.0002 | 846 | 8089 | 0.0062 | 19.0 |
1.7309 | 5.52 | 10000 | 1.2904 | 0.0054 | [580, 287, 166, 88] | [840, 776, 712, 648] | [69.04761904761905, 36.98453608247423, 23.314606741573034, 13.580246913580247] | 0.0002 | 840 | 8089 | 0.0054 | 19.0 |
1.6973 | 5.79 | 10500 | 1.2818 | 0.0059 | [591, 302, 179, 100] | [842, 778, 714, 650] | [70.19002375296913, 38.81748071979435, 25.07002801120448, 15.384615384615385] | 0.0002 | 842 | 8089 | 0.0059 | 19.0 |
1.6613 | 6.07 | 11000 | 1.2757 | 0.0058 | [596, 302, 185, 102] | [840, 776, 712, 648] | [70.95238095238095, 38.91752577319588, 25.98314606741573, 15.74074074074074] | 0.0002 | 840 | 8089 | 0.0058 | 19.0 |
1.6699 | 6.35 | 11500 | 1.2689 | 0.0063 | [600, 316, 197, 113] | [842, 778, 714, 650] | [71.25890736342043, 40.616966580976865, 27.591036414565828, 17.384615384615383] | 0.0002 | 842 | 8089 | 0.0063 | 19.0 |
1.6566 | 6.62 | 12000 | 1.2630 | 0.0064 | [610, 320, 194, 109] | [844, 780, 716, 652] | [72.27488151658768, 41.02564102564103, 27.094972067039105, 16.717791411042946] | 0.0002 | 844 | 8089 | 0.0064 | 19.0 |
1.6417 | 6.9 | 12500 | 1.2592 | 0.0065 | [606, 325, 201, 116] | [843, 779, 715, 651] | [71.88612099644128, 41.7201540436457, 28.111888111888113, 17.81874039938556] | 0.0002 | 843 | 8089 | 0.0065 | 19.0 |
1.6703 | 7.17 | 13000 | 1.2531 | 0.0072 | [616, 325, 198, 113] | [855, 791, 727, 663] | [72.046783625731, 41.08723135271808, 27.235213204951858, 17.043740573152338] | 0.0002 | 855 | 8089 | 0.0072 | 19.0 |
1.6283 | 7.45 | 13500 | 1.2508 | 0.0069 | [614, 334, 209, 122] | [846, 782, 718, 654] | [72.57683215130024, 42.710997442455245, 29.108635097493035, 18.654434250764528] | 0.0002 | 846 | 8089 | 0.0069 | 19.0 |
1.6139 | 7.73 | 14000 | 1.2485 | 0.0056 | [595, 315, 192, 111] | [833, 769, 705, 641] | [71.42857142857143, 40.96228868660598, 27.23404255319149, 17.316692667706707] | 0.0002 | 833 | 8089 | 0.0056 | 19.0 |
1.6203 | 8.0 | 14500 | 1.2425 | 0.0067 | [613, 329, 203, 119] | [845, 781, 717, 653] | [72.54437869822485, 42.12548015364917, 28.312412831241282, 18.223583460949463] | 0.0002 | 845 | 8089 | 0.0067 | 19.0 |
1.6289 | 8.28 | 15000 | 1.2414 | 0.0061 | [603, 322, 200, 119] | [837, 773, 709, 645] | [72.04301075268818, 41.65588615782665, 28.208744710860366, 18.449612403100776] | 0.0002 | 837 | 8089 | 0.0061 | 19.0 |
1.6301 | 8.55 | 15500 | 1.2386 | 0.0063 | [610, 328, 205, 123] | [838, 774, 710, 646] | [72.79236276849642, 42.377260981912144, 28.87323943661972, 19.040247678018577] | 0.0002 | 838 | 8089 | 0.0063 | 19.0 |
1.5992 | 8.83 | 16000 | 1.2379 | 0.0061 | [603, 323, 200, 119] | [837, 773, 709, 645] | [72.04301075268818, 41.785252263906855, 28.208744710860366, 18.449612403100776] | 0.0002 | 837 | 8089 | 0.0061 | 19.0 |
1.5984 | 9.11 | 16500 | 1.2367 | 0.0060 | [597, 317, 195, 116] | [837, 773, 709, 645] | [71.32616487455198, 41.00905562742562, 27.50352609308886, 17.984496124031008] | 0.0002 | 837 | 8089 | 0.0060 | 19.0 |
1.6026 | 9.38 | 17000 | 1.2336 | 0.0063 | [606, 326, 204, 124] | [838, 774, 710, 646] | [72.31503579952268, 42.11886304909561, 28.732394366197184, 19.195046439628484] | 0.0002 | 838 | 8089 | 0.0063 | 19.0 |
1.6059 | 9.66 | 17500 | 1.2319 | 0.0061 | [606, 330, 206, 123] | [835, 771, 707, 643] | [72.57485029940119, 42.80155642023346, 29.13719943422914, 19.12908242612753] | 0.0002 | 835 | 8089 | 0.0061 | 19.0 |
1.6227 | 9.93 | 18000 | 1.2294 | 0.0063 | [609, 334, 209, 122] | [837, 773, 709, 645] | [72.75985663082437, 43.20827943078913, 29.478138222849083, 18.914728682170544] | 0.0002 | 837 | 8089 | 0.0063 | 19.0 |
1.6031 | 10.21 | 18500 | 1.2300 | 0.0060 | [605, 328, 203, 120] | [835, 771, 707, 643] | [72.45508982035928, 42.54215304798962, 28.712871287128714, 18.662519440124417] | 0.0002 | 835 | 8089 | 0.0060 | 19.0 |
1.5746 | 10.49 | 19000 | 1.2301 | 0.0064 | [612, 335, 209, 123] | [838, 774, 710, 646] | [73.0310262529833, 43.281653746770026, 29.43661971830986, 19.040247678018577] | 0.0002 | 838 | 8089 | 0.0064 | 19.0 |
1.5689 | 10.76 | 19500 | 1.2288 | 0.0063 | [609, 331, 205, 120] | [838, 774, 710, 646] | [72.67303102625299, 42.76485788113695, 28.87323943661972, 18.575851393188856] | 0.0002 | 838 | 8089 | 0.0063 | 19.0 |
1.5928 | 11.04 | 20000 | 1.2288 | 0.0063 | [609, 331, 205, 120] | [838, 774, 710, 646] | [72.67303102625299, 42.76485788113695, 28.87323943661972, 18.575851393188856] | 0.0002 | 838 | 8089 | 0.0063 | 19.0 |
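The final rows match the evaluation results reported at the top of this card. As a sanity check, the reported BLEU can be recomputed from the logged sacrebleu components, which makes explicit that the brevity penalty is what pushes the score toward zero (an illustrative sketch using only the numbers above):

```python
# Recompute BLEU from the logged components of the final checkpoint
# (counts, totals, and system/reference lengths taken from the table above).
import math

counts = [609, 331, 205, 120]   # matching n-grams (1- to 4-gram)
totals = [838, 774, 710, 646]   # candidate n-grams (1- to 4-gram)
sys_len, ref_len = 838, 8089    # system vs. reference length

precisions = [100.0 * c / t for c, t in zip(counts, totals)]
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
bp = math.exp(1 - ref_len / sys_len) if sys_len < ref_len else 1.0
print(round(bp, 4), round(bp * geo_mean, 4))  # ~0.0002 and ~0.0063
```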
### Framework versions
- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.11.0
- Tokenizers 0.13.3
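A quick way to confirm that a local environment matches these versions (a convenience sketch, not part of the original card):

```python
# Print installed library versions to compare against the list above.
import datasets, tokenizers, torch, transformers

print(transformers.__version__)  # expected: 4.28.1
print(torch.__version__)         # expected: 2.0.0+cu118
print(datasets.__version__)      # expected: 2.11.0
print(tokenizers.__version__)    # expected: 0.13.3
```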