mt5-base-finetuned-novel-chinese-to-spanish-v1

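This model is a fine-tuned version of quickman/mt5-base-finetuned-chinese-to-spanish. The training dataset is not named (the card records it as None). It achieves the following results on the evaluation set, which match the final row (step 20000) of the training results table: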

  • Loss: 1.2288
  • Score: 0.0063
  • Counts: [609, 331, 205, 120]
  • Totals: [838, 774, 710, 646]
  • Precisions: [72.67303102625299, 42.76485788113695, 28.87323943661972, 18.575851393188856]
  • Bp: 0.0002
  • Sys Len: 838
  • Ref Len: 8089
  • Bleu: 0.0063
  • Gen Len: 19.0
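
The BLEU score above is very low even though the n-gram precisions are reasonable, because the brevity penalty (Bp) is tiny: the system output has 838 tokens against 8089 in the reference. The sketch below recomputes Bp and BLEU from the reported counts, totals, and lengths. It assumes the standard corpus-level BLEU formula on a 0-100 scale, which the field names suggest but the card does not state; the function name `bleu_from_stats` is hypothetical.

```python
import math

# Values copied from the evaluation results above.
counts = [609, 331, 205, 120]   # matched n-grams, n = 1..4
totals = [838, 774, 710, 646]   # candidate n-grams, n = 1..4
sys_len, ref_len = 838, 8089    # system and reference lengths in tokens


def bleu_from_stats(counts, totals, sys_len, ref_len):
    """Return (precisions in percent, brevity penalty, BLEU on a 0-100 scale).

    Assumes standard corpus BLEU with uniform weights and no smoothing.
    """
    precisions = [100.0 * c / t for c, t in zip(counts, totals)]
    # Brevity penalty is 1 when the output is at least as long as the reference.
    bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)
    # Geometric mean of the n-gram precisions (requires all of them to be > 0).
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return precisions, bp, bp * math.exp(log_mean)


precisions, bp, bleu = bleu_from_stats(counts, totals, sys_len, ref_len)
print(f"Bp = {bp:.4f}, BLEU = {bleu:.4f}")
print(f"Geometric mean of precisions without Bp: {bleu / bp:.1f}")
```

Under these assumptions the result should agree with the reported Bp and Bleu up to rounding. The geometric mean of the precisions alone is about 36, so nearly all of the gap comes from output that is much shorter than the reference.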

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40
  • training_steps: 20000
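
To make the learning-rate settings concrete, here is a minimal sketch of a linear schedule with warmup using the values listed above. It assumes the usual behaviour for this scheduler type (linear ramp from 0 to the peak rate over the warmup steps, then linear decay to 0 at the last training step); the function name `linear_lr` is hypothetical and is not part of any library.

```python
def linear_lr(step, base_lr=2e-05, warmup_steps=40, total_steps=20000):
    """Learning rate at a given optimizer step for a linear warmup/decay schedule."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 20, 40, 10020, 20000):
    print(step, linear_lr(step))
```

With 40 warmup steps out of 20000, the warmup phase covers only 0.2% of training, so the schedule is effectively a straight linear decay from 2e-05.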

Training results

Training Loss | Epoch | Step | Validation Loss | Score | Counts | Totals | Precisions | Bp | Sys Len | Ref Len | Bleu | Gen Len
2.7093 | 0.28 | 500 | 1.9080 | 0.0035 | [510, 185, 91, 37] | [848, 784, 720, 656] | [60.14150943396226, 23.596938775510203, 12.63888888888889, 5.640243902439025] | 0.0002 | 848 | 8089 | 0.0035 | 19.0
2.4994 | 0.55 | 1000 | 1.7520 | 0.0036 | [524, 199, 100, 46] | [842, 778, 714, 650] | [62.23277909738717, 25.57840616966581, 14.005602240896359, 7.076923076923077] | 0.0002 | 842 | 8089 | 0.0036 | 19.0
2.3427 | 0.83 | 1500 | 1.6632 | 0.0040 | [530, 212, 109, 53] | [844, 780, 716, 652] | [62.796208530805686, 27.17948717948718, 15.223463687150838, 8.128834355828221] | 0.0002 | 844 | 8089 | 0.0040 | 19.0
2.211 | 1.1 | 2000 | 1.5980 | 0.0050 | [548, 230, 123, 66] | [855, 791, 727, 663] | [64.09356725146199, 29.077117572692792, 16.91884456671252, 9.95475113122172] | 0.0002 | 855 | 8089 | 0.0050 | 19.0
2.1536 | 1.38 | 2500 | 1.5442 | 0.0053 | [552, 239, 137, 77] | [852, 788, 724, 660] | [64.78873239436619, 30.32994923857868, 18.92265193370166, 11.666666666666666] | 0.0002 | 852 | 8089 | 0.0053 | 19.0
2.079 | 1.66 | 3000 | 1.5088 | 0.0055 | [551, 244, 142, 84] | [854, 790, 726, 662] | [64.51990632318501, 30.88607594936709, 19.55922865013774, 12.688821752265861] | 0.0002 | 854 | 8089 | 0.0055 | 19.0
2.0374 | 1.93 | 3500 | 1.4768 | 0.0054 | [557, 259, 149, 83] | [849, 785, 721, 657] | [65.60659599528857, 32.99363057324841, 20.665742024965326, 12.633181126331811] | 0.0002 | 849 | 8089 | 0.0054 | 19.0
2.0064 | 2.21 | 4000 | 1.4418 | 0.0054 | [559, 266, 157, 91] | [844, 780, 716, 652] | [66.23222748815166, 34.1025641025641, 21.92737430167598, 13.957055214723926] | 0.0002 | 844 | 8089 | 0.0054 | 19.0
1.9536 | 2.48 | 4500 | 1.4194 | 0.0056 | [557, 260, 157, 87] | [849, 785, 721, 657] | [65.60659599528857, 33.12101910828026, 21.7753120665742, 13.242009132420092] | 0.0002 | 849 | 8089 | 0.0056 | 19.0
1.9436 | 2.76 | 5000 | 1.4030 | 0.0051 | [561, 262, 151, 85] | [841, 777, 713, 649] | [66.70630202140309, 33.71943371943372, 21.1781206171108, 13.097072419106317] | 0.0002 | 841 | 8089 | 0.0051 | 19.0
1.8939 | 3.04 | 5500 | 1.3826 | 0.0059 | [568, 277, 169, 99] | [848, 784, 720, 656] | [66.98113207547169, 35.33163265306123, 23.47222222222222, 15.091463414634147] | 0.0002 | 848 | 8089 | 0.0059 | 19.0
1.8497 | 3.31 | 6000 | 1.3649 | 0.0059 | [576, 288, 180, 107] | [843, 779, 715, 651] | [68.32740213523131, 36.97047496790757, 25.174825174825173, 16.43625192012289] | 0.0002 | 843 | 8089 | 0.0059 | 19.0
1.8177 | 3.59 | 6500 | 1.3575 | 0.0060 | [585, 285, 173, 98] | [847, 783, 719, 655] | [69.06729634002362, 36.39846743295019, 24.061196105702365, 14.961832061068701] | 0.0002 | 847 | 8089 | 0.0060 | 19.0
1.8368 | 3.86 | 7000 | 1.3428 | 0.0061 | [583, 285, 171, 95] | [851, 787, 723, 659] | [68.50763807285547, 36.213468869123254, 23.651452282157678, 14.41578148710167] | 0.0002 | 851 | 8089 | 0.0061 | 19.0
1.7906 | 4.14 | 7500 | 1.3295 | 0.0059 | [581, 284, 167, 88] | [850, 786, 722, 658] | [68.3529411764706, 36.1323155216285, 23.130193905817176, 13.373860182370821] | 0.0002 | 850 | 8089 | 0.0059 | 19.0
1.766 | 4.42 | 8000 | 1.3204 | 0.0057 | [575, 279, 161, 89] | [848, 784, 720, 656] | [67.80660377358491, 35.58673469387755, 22.36111111111111, 13.567073170731707] | 0.0002 | 848 | 8089 | 0.0057 | 19.0
1.7615 | 4.69 | 8500 | 1.3124 | 0.0061 | [590, 293, 176, 100] | [848, 784, 720, 656] | [69.5754716981132, 37.37244897959184, 24.444444444444443, 15.24390243902439] | 0.0002 | 848 | 8089 | 0.0061 | 19.0
1.7741 | 4.97 | 9000 | 1.3057 | 0.0062 | [590, 298, 180, 105] | [846, 782, 718, 654] | [69.73995271867612, 38.107416879795394, 25.069637883008358, 16.05504587155963] | 0.0002 | 846 | 8089 | 0.0062 | 19.0
1.7266 | 5.24 | 9500 | 1.2969 | 0.0062 | [592, 304, 182, 104] | [846, 782, 718, 654] | [69.97635933806147, 38.87468030690537, 25.348189415041784, 15.902140672782874] | 0.0002 | 846 | 8089 | 0.0062 | 19.0
1.7309 | 5.52 | 10000 | 1.2904 | 0.0054 | [580, 287, 166, 88] | [840, 776, 712, 648] | [69.04761904761905, 36.98453608247423, 23.314606741573034, 13.580246913580247] | 0.0002 | 840 | 8089 | 0.0054 | 19.0
1.6973 | 5.79 | 10500 | 1.2818 | 0.0059 | [591, 302, 179, 100] | [842, 778, 714, 650] | [70.19002375296913, 38.81748071979435, 25.07002801120448, 15.384615384615385] | 0.0002 | 842 | 8089 | 0.0059 | 19.0
1.6613 | 6.07 | 11000 | 1.2757 | 0.0058 | [596, 302, 185, 102] | [840, 776, 712, 648] | [70.95238095238095, 38.91752577319588, 25.98314606741573, 15.74074074074074] | 0.0002 | 840 | 8089 | 0.0058 | 19.0
1.6699 | 6.35 | 11500 | 1.2689 | 0.0063 | [600, 316, 197, 113] | [842, 778, 714, 650] | [71.25890736342043, 40.616966580976865, 27.591036414565828, 17.384615384615383] | 0.0002 | 842 | 8089 | 0.0063 | 19.0
1.6566 | 6.62 | 12000 | 1.2630 | 0.0064 | [610, 320, 194, 109] | [844, 780, 716, 652] | [72.27488151658768, 41.02564102564103, 27.094972067039105, 16.717791411042946] | 0.0002 | 844 | 8089 | 0.0064 | 19.0
1.6417 | 6.9 | 12500 | 1.2592 | 0.0065 | [606, 325, 201, 116] | [843, 779, 715, 651] | [71.88612099644128, 41.7201540436457, 28.111888111888113, 17.81874039938556] | 0.0002 | 843 | 8089 | 0.0065 | 19.0
1.6703 | 7.17 | 13000 | 1.2531 | 0.0072 | [616, 325, 198, 113] | [855, 791, 727, 663] | [72.046783625731, 41.08723135271808, 27.235213204951858, 17.043740573152338] | 0.0002 | 855 | 8089 | 0.0072 | 19.0
1.6283 | 7.45 | 13500 | 1.2508 | 0.0069 | [614, 334, 209, 122] | [846, 782, 718, 654] | [72.57683215130024, 42.710997442455245, 29.108635097493035, 18.654434250764528] | 0.0002 | 846 | 8089 | 0.0069 | 19.0
1.6139 | 7.73 | 14000 | 1.2485 | 0.0056 | [595, 315, 192, 111] | [833, 769, 705, 641] | [71.42857142857143, 40.96228868660598, 27.23404255319149, 17.316692667706707] | 0.0002 | 833 | 8089 | 0.0056 | 19.0
1.6203 | 8.0 | 14500 | 1.2425 | 0.0067 | [613, 329, 203, 119] | [845, 781, 717, 653] | [72.54437869822485, 42.12548015364917, 28.312412831241282, 18.223583460949463] | 0.0002 | 845 | 8089 | 0.0067 | 19.0
1.6289 | 8.28 | 15000 | 1.2414 | 0.0061 | [603, 322, 200, 119] | [837, 773, 709, 645] | [72.04301075268818, 41.65588615782665, 28.208744710860366, 18.449612403100776] | 0.0002 | 837 | 8089 | 0.0061 | 19.0
1.6301 | 8.55 | 15500 | 1.2386 | 0.0063 | [610, 328, 205, 123] | [838, 774, 710, 646] | [72.79236276849642, 42.377260981912144, 28.87323943661972, 19.040247678018577] | 0.0002 | 838 | 8089 | 0.0063 | 19.0
1.5992 | 8.83 | 16000 | 1.2379 | 0.0061 | [603, 323, 200, 119] | [837, 773, 709, 645] | [72.04301075268818, 41.785252263906855, 28.208744710860366, 18.449612403100776] | 0.0002 | 837 | 8089 | 0.0061 | 19.0
1.5984 | 9.11 | 16500 | 1.2367 | 0.0060 | [597, 317, 195, 116] | [837, 773, 709, 645] | [71.32616487455198, 41.00905562742562, 27.50352609308886, 17.984496124031008] | 0.0002 | 837 | 8089 | 0.0060 | 19.0
1.6026 | 9.38 | 17000 | 1.2336 | 0.0063 | [606, 326, 204, 124] | [838, 774, 710, 646] | [72.31503579952268, 42.11886304909561, 28.732394366197184, 19.195046439628484] | 0.0002 | 838 | 8089 | 0.0063 | 19.0
1.6059 | 9.66 | 17500 | 1.2319 | 0.0061 | [606, 330, 206, 123] | [835, 771, 707, 643] | [72.57485029940119, 42.80155642023346, 29.13719943422914, 19.12908242612753] | 0.0002 | 835 | 8089 | 0.0061 | 19.0
1.6227 | 9.93 | 18000 | 1.2294 | 0.0063 | [609, 334, 209, 122] | [837, 773, 709, 645] | [72.75985663082437, 43.20827943078913, 29.478138222849083, 18.914728682170544] | 0.0002 | 837 | 8089 | 0.0063 | 19.0
1.6031 | 10.21 | 18500 | 1.2300 | 0.0060 | [605, 328, 203, 120] | [835, 771, 707, 643] | [72.45508982035928, 42.54215304798962, 28.712871287128714, 18.662519440124417] | 0.0002 | 835 | 8089 | 0.0060 | 19.0
1.5746 | 10.49 | 19000 | 1.2301 | 0.0064 | [612, 335, 209, 123] | [838, 774, 710, 646] | [73.0310262529833, 43.281653746770026, 29.43661971830986, 19.040247678018577] | 0.0002 | 838 | 8089 | 0.0064 | 19.0
1.5689 | 10.76 | 19500 | 1.2288 | 0.0063 | [609, 331, 205, 120] | [838, 774, 710, 646] | [72.67303102625299, 42.76485788113695, 28.87323943661972, 18.575851393188856] | 0.0002 | 838 | 8089 | 0.0063 | 19.0
1.5928 | 11.04 | 20000 | 1.2288 | 0.0063 | [609, 331, 205, 120] | [838, 774, 710, 646] | [72.67303102625299, 42.76485788113695, 28.87323943661972, 18.575851393188856] | 0.0002 | 838 | 8089 | 0.0063 | 19.0
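
The card does not describe the training data, but the Epoch and Step columns allow a rough size estimate. The sketch below is a back-of-the-envelope calculation only: it assumes a single device and no gradient accumulation, so that each step consumes exactly train_batch_size examples. Neither assumption is confirmed by the card, and the function name `estimate_train_size` is hypothetical.

```python
def estimate_train_size(step, epoch, batch_size):
    """Estimate (steps per epoch, training examples) from a logged step/epoch pair.

    Assumes one device and no gradient accumulation (not stated in the card).
    """
    steps_per_epoch = step / epoch
    return steps_per_epoch, steps_per_epoch * batch_size


# Final logged row: step 20000 at epoch 11.04, with train_batch_size 4.
steps_per_epoch, approx_examples = estimate_train_size(20000, 11.04, 4)
print(f"about {steps_per_epoch:.0f} steps per epoch, "
      f"about {approx_examples:.0f} training examples")
```

Because the logged epoch is rounded to two decimals, the estimate is only approximate; under the stated assumptions it points to a training set of roughly seven thousand examples.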

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu118
  • Datasets 2.11.0
  • Tokenizers 0.13.3