quen_2.5_qlora

This model is a QLoRA fine-tuned adapter (trained with PEFT) for Qwen/Qwen2.5-0.5B-Instruct on an unspecified dataset. It achieves the following result on the evaluation set:

  • Loss: 0.7671
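For intuition, the evaluation loss can be converted to perplexity with a one-line calculation (a rough gauge only, since perplexity is tokenizer-dependent):

```python
import math

eval_loss = 0.7671  # final validation loss reported above
perplexity = math.exp(eval_loss)
print(f"Eval perplexity: {perplexity:.2f}")  # ≈ 2.15
```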

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 0
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 40
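The learning-rate behavior implied by these settings (linear warmup over the first 1% of steps, then cosine decay) can be sketched as a minimal reimplementation; the total step count of 1479 is taken from the results table below, and the exact Transformers scheduler may round warmup steps slightly differently:

```python
import math

def lr_at_step(step: int, total_steps: int = 1479,
               warmup_ratio: float = 0.01, peak_lr: float = 5e-6) -> float:
    """Linear warmup followed by cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)  # ~14 of 1479 steps
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at_step(7))     # mid-warmup
print(lr_at_step(14))    # warmup complete: peak learning rate 5e-06
print(lr_at_step(1479))  # end of training: decayed to ~0
```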

Training results

Training Loss Epoch Step Validation Loss
3.1118 0.0811 3 3.2024
3.0902 0.1622 6 3.2023
2.9193 0.2432 9 3.1908
3.0231 0.3243 12 3.1700
3.0429 0.4054 15 3.1323
2.9739 0.4865 18 3.0926
2.9118 0.5676 21 3.0477
2.8564 0.6486 24 2.9984
2.9689 0.7297 27 2.9489
2.8155 0.8108 30 2.8946
2.8495 0.8919 33 2.8411
2.7311 0.9730 36 2.7843
2.6327 1.0541 39 2.7375
2.6728 1.1351 42 2.6747
2.5236 1.2162 45 2.6151
2.5009 1.2973 48 2.5595
2.4129 1.3784 51 2.5030
2.3506 1.4595 54 2.4369
2.3542 1.5405 57 2.3803
2.237 1.6216 60 2.3189
2.3302 1.7027 63 2.2589
2.1679 1.7838 66 2.2005
2.1406 1.8649 69 2.1355
2.0373 1.9459 72 2.0689
1.9979 2.0270 75 2.0098
2.0711 2.1081 78 1.9351
1.8819 2.1892 81 1.8753
1.7832 2.2703 84 1.8243
1.7165 2.3514 87 1.7668
1.6765 2.4324 90 1.7111
1.6243 2.5135 93 1.6613
1.6819 2.5946 96 1.6154
1.6006 2.6757 99 1.5679
1.4845 2.7568 102 1.5229
1.4014 2.8378 105 1.4728
1.452 2.9189 108 1.4313
1.3642 3.0 111 1.3985
1.3571 3.0811 114 1.3647
1.4434 3.1622 117 1.3416
1.3829 3.2432 120 1.3206
1.3319 3.3243 123 1.3000
1.2262 3.4054 126 1.2851
1.2693 3.4865 129 1.2711
1.264 3.5676 132 1.2588
1.1851 3.6486 135 1.2470
1.2656 3.7297 138 1.2380
1.205 3.8108 141 1.2244
1.2115 3.8919 144 1.2109
1.1366 3.9730 147 1.1997
1.1794 4.0541 150 1.1929
1.171 4.1351 153 1.1805
1.0875 4.2162 156 1.1714
1.1192 4.2973 159 1.1619
1.1082 4.3784 162 1.1548
1.0923 4.4595 165 1.1465
1.1731 4.5405 168 1.1397
1.2138 4.6216 171 1.1315
1.2322 4.7027 174 1.1223
1.0893 4.7838 177 1.1179
1.1414 4.8649 180 1.1116
1.0242 4.9459 183 1.1058
1.0929 5.0270 186 1.0982
1.0967 5.1081 189 1.0916
1.0666 5.1892 192 1.0866
1.1382 5.2703 195 1.0851
1.0709 5.3514 198 1.0781
1.0904 5.4324 201 1.0742
1.0521 5.5135 204 1.0699
0.9745 5.5946 207 1.0641
1.1268 5.6757 210 1.0624
1.0437 5.7568 213 1.0588
1.0124 5.8378 216 1.0578
1.0217 5.9189 219 1.0511
1.0656 6.0 222 1.0475
1.0841 6.0811 225 1.0454
1.1352 6.1622 228 1.0440
1.0402 6.2432 231 1.0407
1.0042 6.3243 234 1.0326
0.9738 6.4054 237 1.0295
0.9671 6.4865 240 1.0292
1.0515 6.5676 243 1.0282
0.9551 6.6486 246 1.0246
1.0084 6.7297 249 1.0210
0.9964 6.8108 252 1.0188
0.966 6.8919 255 1.0146
1.0284 6.9730 258 1.0113
1.0367 7.0541 261 1.0103
0.9015 7.1351 264 1.0056
0.9097 7.2162 267 1.0043
0.9598 7.2973 270 0.9998
1.1724 7.3784 273 0.9980
0.9607 7.4595 276 0.9947
0.9749 7.5405 279 0.9960
0.9704 7.6216 282 0.9924
0.9855 7.7027 285 0.9899
0.9368 7.7838 288 0.9853
0.9812 7.8649 291 0.9856
0.9809 7.9459 294 0.9809
0.9337 8.0270 297 0.9824
0.8848 8.1081 300 0.9759
0.876 8.1892 303 0.9751
0.97 8.2703 306 0.9734
1.1309 8.3514 309 0.9726
0.9012 8.4324 312 0.9709
1.0355 8.5135 315 0.9681
0.9047 8.5946 318 0.9663
0.9417 8.6757 321 0.9622
0.9266 8.7568 324 0.9613
0.8873 8.8378 327 0.9625
0.9364 8.9189 330 0.9598
0.9885 9.0 333 0.9539
0.91 9.0811 336 0.9538
1.0834 9.1622 339 0.9522
0.8694 9.2432 342 0.9502
1.0554 9.3243 345 0.9498
0.9123 9.4054 348 0.9484
0.9542 9.4865 351 0.9449
0.9287 9.5676 354 0.9417
0.8435 9.6486 357 0.9406
0.8419 9.7297 360 0.9374
0.8265 9.8108 363 0.9351
0.8677 9.8919 366 0.9343
0.9989 9.9730 369 0.9341
0.8331 10.0541 372 0.9329
0.8823 10.1351 375 0.9312
0.9312 10.2162 378 0.9299
0.8736 10.2973 381 0.9279
1.0039 10.3784 384 0.9259
0.8728 10.4595 387 0.9249
0.9946 10.5405 390 0.9216
0.8747 10.6216 393 0.9196
0.9043 10.7027 396 0.9183
0.8777 10.7838 399 0.9163
0.8023 10.8649 402 0.9160
0.9002 10.9459 405 0.9128
0.8568 11.0270 408 0.9146
0.9145 11.1081 411 0.9124
0.896 11.1892 414 0.9101
0.8151 11.2703 417 0.9067
0.8352 11.3514 420 0.9060
1.0484 11.4324 423 0.9044
0.8623 11.5135 426 0.9041
0.8942 11.5946 429 0.9022
0.8789 11.6757 432 0.9008
0.9077 11.7568 435 0.9000
0.8 11.8378 438 0.8999
0.8116 11.9189 441 0.8985
0.8382 12.0 444 0.8970
0.8267 12.0811 447 0.8962
0.9168 12.1622 450 0.8947
0.7878 12.2432 453 0.8949
0.8829 12.3243 456 0.8926
0.8465 12.4054 459 0.8935
0.882 12.4865 462 0.8914
0.8265 12.5676 465 0.8896
0.8846 12.6486 468 0.8891
0.7943 12.7297 471 0.8899
0.8846 12.8108 474 0.8883
0.8177 12.8919 477 0.8852
0.8807 12.9730 480 0.8840
0.893 13.0541 483 0.8822
0.7497 13.1351 486 0.8799
0.8803 13.2162 489 0.8785
0.917 13.2973 492 0.8797
0.8162 13.3784 495 0.8783
0.9659 13.4595 498 0.8799
0.8163 13.5405 501 0.8790
0.8139 13.6216 504 0.8753
0.7451 13.7027 507 0.8737
0.7726 13.7838 510 0.8738
0.821 13.8649 513 0.8714
0.8782 13.9459 516 0.8689
0.8563 14.0270 519 0.8700
0.8645 14.1081 522 0.8671
0.9223 14.1892 525 0.8676
0.8396 14.2703 528 0.8682
0.8224 14.3514 531 0.8679
0.7793 14.4324 534 0.8690
0.7911 14.5135 537 0.8650
0.7706 14.5946 540 0.8634
0.7983 14.6757 543 0.8606
0.7409 14.7568 546 0.8611
0.8634 14.8378 549 0.8594
0.8631 14.9189 552 0.8594
0.7624 15.0 555 0.8582
0.8568 15.0811 558 0.8568
0.8833 15.1622 561 0.8542
0.7804 15.2432 564 0.8514
0.9113 15.3243 567 0.8526
0.72 15.4054 570 0.8511
0.8631 15.4865 573 0.8498
0.6486 15.5676 576 0.8507
0.7745 15.6486 579 0.8507
0.8591 15.7297 582 0.8519
0.8479 15.8108 585 0.8515
0.7879 15.8919 588 0.8486
0.7829 15.9730 591 0.8483
0.7271 16.0541 594 0.8451
0.7425 16.1351 597 0.8464
0.7692 16.2162 600 0.8436
0.7492 16.2973 603 0.8459
0.7085 16.3784 606 0.8440
0.7351 16.4595 609 0.8429
0.8593 16.5405 612 0.8412
0.8999 16.6216 615 0.8400
0.8136 16.7027 618 0.8417
0.8415 16.7838 621 0.8393
0.849 16.8649 624 0.8400
0.798 16.9459 627 0.8409
0.7581 17.0270 630 0.8393
0.7244 17.1081 633 0.8376
0.7221 17.1892 636 0.8379
0.8085 17.2703 639 0.8363
0.7364 17.3514 642 0.8357
0.7684 17.4324 645 0.8385
0.7323 17.5135 648 0.8330
0.8999 17.5946 651 0.8334
0.7243 17.6757 654 0.8323
0.741 17.7568 657 0.8300
0.7529 17.8378 660 0.8290
0.9129 17.9189 663 0.8313
0.8043 18.0 666 0.8282
0.838 18.0811 669 0.8294
0.8133 18.1622 672 0.8294
0.708 18.2432 675 0.8283
0.8017 18.3243 678 0.8289
0.9087 18.4054 681 0.8256
0.6983 18.4865 684 0.8260
0.7684 18.5676 687 0.8251
0.6372 18.6486 690 0.8241
0.7369 18.7297 693 0.8261
0.819 18.8108 696 0.8231
0.6806 18.8919 699 0.8244
0.8233 18.9730 702 0.8237
0.7614 19.0541 705 0.8234
0.7862 19.1351 708 0.8212
0.6605 19.2162 711 0.8195
0.8229 19.2973 714 0.8194
0.731 19.3784 717 0.8193
0.783 19.4595 720 0.8187
0.6495 19.5405 723 0.8185
0.8034 19.6216 726 0.8173
0.7064 19.7027 729 0.8177
0.7276 19.7838 732 0.8156
0.8778 19.8649 735 0.8143
0.7399 19.9459 738 0.8126
0.7399 20.0270 741 0.8139
0.7912 20.1081 744 0.8153
0.8063 20.1892 747 0.8120
0.671 20.2703 750 0.8105
0.6342 20.3514 753 0.8113
0.8271 20.4324 756 0.8097
0.6755 20.5135 759 0.8099
0.762 20.5946 762 0.8084
0.9197 20.6757 765 0.8079
0.7844 20.7568 768 0.8094
0.6502 20.8378 771 0.8096
0.6877 20.9189 774 0.8078
0.7705 21.0 777 0.8074
0.6842 21.0811 780 0.8067
0.6799 21.1622 783 0.8066
0.7344 21.2432 786 0.8039
0.6554 21.3243 789 0.8051
0.7007 21.4054 792 0.8051
0.6292 21.4865 795 0.8040
0.7404 21.5676 798 0.8031
0.7976 21.6486 801 0.8030
1.0023 21.7297 804 0.8045
0.6626 21.8108 807 0.8042
0.8212 21.8919 810 0.8019
0.6982 21.9730 813 0.8029
0.7221 22.0541 816 0.8007
0.8042 22.1351 819 0.8013
0.775 22.2162 822 0.7999
0.7614 22.2973 825 0.7971
0.7132 22.3784 828 0.7972
0.647 22.4595 831 0.8001
0.692 22.5405 834 0.7968
0.7775 22.6216 837 0.7976
0.7764 22.7027 840 0.7980
0.6962 22.7838 843 0.7990
0.649 22.8649 846 0.7941
0.7853 22.9459 849 0.7940
0.6808 23.0270 852 0.7959
0.7018 23.1081 855 0.7948
0.6743 23.1892 858 0.7949
0.7512 23.2703 861 0.7926
0.7136 23.3514 864 0.7922
0.6847 23.4324 867 0.7943
0.7309 23.5135 870 0.7910
0.9269 23.5946 873 0.7903
0.7629 23.6757 876 0.7923
0.6551 23.7568 879 0.7896
0.7346 23.8378 882 0.7920
0.7469 23.9189 885 0.7921
0.6343 24.0 888 0.7925
0.7232 24.0811 891 0.7905
0.714 24.1622 894 0.7885
0.6603 24.2432 897 0.7885
0.6798 24.3243 900 0.7892
0.7163 24.4054 903 0.7889
0.8428 24.4865 906 0.7871
0.8125 24.5676 909 0.7875
0.7566 24.6486 912 0.7895
0.7167 24.7297 915 0.7874
0.65 24.8108 918 0.7861
0.6817 24.8919 921 0.7860
0.6185 24.9730 924 0.7879
0.7745 25.0541 927 0.7900
0.8923 25.1351 930 0.7871
0.7395 25.2162 933 0.7834
0.7361 25.2973 936 0.7842
0.676 25.3784 939 0.7835
0.7359 25.4595 942 0.7846
0.6533 25.5405 945 0.7829
0.6989 25.6216 948 0.7848
0.8168 25.7027 951 0.7837
0.6519 25.7838 954 0.7826
0.6951 25.8649 957 0.7840
0.5833 25.9459 960 0.7823
0.617 26.0270 963 0.7812
0.6778 26.1081 966 0.7829
0.6571 26.1892 969 0.7818
0.7921 26.2703 972 0.7811
0.6962 26.3514 975 0.7807
0.8437 26.4324 978 0.7814
0.6147 26.5135 981 0.7813
0.7263 26.5946 984 0.7803
0.6618 26.6757 987 0.7809
0.7502 26.7568 990 0.7835
0.6094 26.8378 993 0.7813
0.6506 26.9189 996 0.7800
0.7927 27.0 999 0.7807
0.8217 27.0811 1002 0.7816
0.5989 27.1622 1005 0.7786
0.7204 27.2432 1008 0.7782
0.6906 27.3243 1011 0.7785
0.7592 27.4054 1014 0.7777
0.8692 27.4865 1017 0.7778
0.587 27.5676 1020 0.7781
0.7139 27.6486 1023 0.7781
0.6367 27.7297 1026 0.7774
0.6502 27.8108 1029 0.7779
0.6956 27.8919 1032 0.7770
0.7137 27.9730 1035 0.7784
0.7062 28.0541 1038 0.7775
0.6568 28.1351 1041 0.7782
0.7612 28.2162 1044 0.7766
0.7177 28.2973 1047 0.7765
0.6537 28.3784 1050 0.7776
0.6742 28.4595 1053 0.7773
0.724 28.5405 1056 0.7778
0.8277 28.6216 1059 0.7763
0.609 28.7027 1062 0.7767
0.5459 28.7838 1065 0.7781
0.8179 28.8649 1068 0.7769
0.6419 28.9459 1071 0.7754
0.6947 29.0270 1074 0.7745
0.71 29.1081 1077 0.7751
0.5664 29.1892 1080 0.7747
0.7678 29.2703 1083 0.7741
0.7829 29.3514 1086 0.7732
0.7294 29.4324 1089 0.7729
0.679 29.5135 1092 0.7752
0.6838 29.5946 1095 0.7739
0.6918 29.6757 1098 0.7731
0.6563 29.7568 1101 0.7727
0.6946 29.8378 1104 0.7742
0.6884 29.9189 1107 0.7729
0.6851 30.0 1110 0.7744
0.5865 30.0811 1113 0.7744
0.6906 30.1622 1116 0.7746
0.6716 30.2432 1119 0.7751
0.7492 30.3243 1122 0.7736
0.6608 30.4054 1125 0.7709
0.6577 30.4865 1128 0.7710
0.7475 30.5676 1131 0.7722
0.7643 30.6486 1134 0.7694
0.8186 30.7297 1137 0.7711
0.7465 30.8108 1140 0.7724
0.6621 30.8919 1143 0.7728
0.579 30.9730 1146 0.7713
0.6014 31.0541 1149 0.7707
0.7663 31.1351 1152 0.7697
0.6609 31.2162 1155 0.7719
0.7984 31.2973 1158 0.7715
0.7166 31.3784 1161 0.7715
0.7276 31.4595 1164 0.7712
0.6179 31.5405 1167 0.7681
0.621 31.6216 1170 0.7707
0.743 31.7027 1173 0.7711
0.6733 31.7838 1176 0.7706
0.6066 31.8649 1179 0.7716
0.6787 31.9459 1182 0.7713
0.812 32.0270 1185 0.7708
0.7022 32.1081 1188 0.7695
0.6709 32.1892 1191 0.7721
0.5536 32.2703 1194 0.7716
0.7462 32.3514 1197 0.7693
0.7817 32.4324 1200 0.7686
0.7624 32.5135 1203 0.7719
0.6346 32.5946 1206 0.7696
0.6284 32.6757 1209 0.7717
0.654 32.7568 1212 0.7708
0.6406 32.8378 1215 0.7707
0.752 32.9189 1218 0.7690
0.5956 33.0 1221 0.7713
0.6772 33.0811 1224 0.7701
0.5034 33.1622 1227 0.7693
0.6292 33.2432 1230 0.7687
0.7374 33.3243 1233 0.7693
0.5385 33.4054 1236 0.7678
0.7198 33.4865 1239 0.7686
0.7642 33.5676 1242 0.7695
0.6984 33.6486 1245 0.7682
0.776 33.7297 1248 0.7694
0.7073 33.8108 1251 0.7700
0.8016 33.8919 1254 0.7683
0.7812 33.9730 1257 0.7662
0.578 34.0541 1260 0.7681
0.7074 34.1351 1263 0.7679
0.6142 34.2162 1266 0.7669
0.6469 34.2973 1269 0.7675
0.7334 34.3784 1272 0.7669
0.7086 34.4595 1275 0.7675
0.6302 34.5405 1278 0.7684
0.6576 34.6216 1281 0.7687
0.7496 34.7027 1284 0.7667
0.7836 34.7838 1287 0.7671
0.6405 34.8649 1290 0.7679
0.6088 34.9459 1293 0.7681
0.7511 35.0270 1296 0.7697
0.7591 35.1081 1299 0.7680
0.6207 35.1892 1302 0.7689
0.651 35.2703 1305 0.7660
0.6758 35.3514 1308 0.7694
0.7277 35.4324 1311 0.7680
0.7118 35.5135 1314 0.7663
0.706 35.5946 1317 0.7669
0.6595 35.6757 1320 0.7678
0.7253 35.7568 1323 0.7688
0.582 35.8378 1326 0.7694
0.7094 35.9189 1329 0.7684
0.7042 36.0 1332 0.7686
0.6486 36.0811 1335 0.7664
0.6691 36.1622 1338 0.7678
0.7937 36.2432 1341 0.7672
0.6343 36.3243 1344 0.7670
0.76 36.4054 1347 0.7674
0.5929 36.4865 1350 0.7690
0.6485 36.5676 1353 0.7674
0.7451 36.6486 1356 0.7663
0.6367 36.7297 1359 0.7691
0.6192 36.8108 1362 0.7656
0.7042 36.8919 1365 0.7682
0.6886 36.9730 1368 0.7682
0.8111 37.0541 1371 0.7690
0.6443 37.1351 1374 0.7690
0.6387 37.2162 1377 0.7679
0.7861 37.2973 1380 0.7668
0.6397 37.3784 1383 0.7670
0.6467 37.4595 1386 0.7688
0.6977 37.5405 1389 0.7699
0.7833 37.6216 1392 0.7672
0.6164 37.7027 1395 0.7675
0.7457 37.7838 1398 0.7690
0.6477 37.8649 1401 0.7681
0.7251 37.9459 1404 0.7674
0.5642 38.0270 1407 0.7672
0.786 38.1081 1410 0.7660
0.7014 38.1892 1413 0.7683
0.7747 38.2703 1416 0.7659
0.631 38.3514 1419 0.7675
0.6787 38.4324 1422 0.7671
0.6388 38.5135 1425 0.7684
0.5745 38.5946 1428 0.7666
0.6253 38.6757 1431 0.7668
0.6611 38.7568 1434 0.7688
0.6129 38.8378 1437 0.7666
0.7588 38.9189 1440 0.7670
0.7615 39.0 1443 0.7664
0.7449 39.0811 1446 0.7674
0.7012 39.1622 1449 0.7666
0.788 39.2432 1452 0.7676
0.5958 39.3243 1455 0.7666
0.675 39.4054 1458 0.7670
0.6761 39.4865 1461 0.7676
0.7046 39.5676 1464 0.7669
0.6198 39.6486 1467 0.7673
0.6646 39.7297 1470 0.7670
0.6161 39.8108 1473 0.7677
0.7152 39.8919 1476 0.7676
0.7365 39.9730 1479 0.7671
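The log above also hints at the dataset size: epoch 3.0 falls at step 111, i.e. 37 optimizer steps per epoch, and with a train batch size of 8 and no gradient accumulation (an assumption — accumulation is not listed in the hyperparameters) that suggests roughly 296 training examples:

```python
steps_per_epoch = 111 // 3       # epoch 3.0 is logged at step 111
train_batch_size = 8
approx_examples = steps_per_epoch * train_batch_size  # assumes no gradient accumulation
print(steps_per_epoch, approx_examples)  # 37 296
```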

Framework versions

  • PEFT 0.13.2
  • Transformers 4.46.0
  • Pytorch 2.4.0
  • Datasets 3.0.1
  • Tokenizers 0.20.0