End of training

Files changed (8)

README.md +20 -20
final_checkpoint/model-00001-of-00003.safetensors +1 -1
final_checkpoint/model-00002-of-00003.safetensors +1 -1
final_checkpoint/model-00003-of-00003.safetensors +1 -1
model-00001-of-00003.safetensors +1 -1
model-00002-of-00003.safetensors +1 -1
model-00003-of-00003.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [mosaicml/mpt-7b-instruct](https://huggingface.co/mosaicml/mpt-7b-instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.5250
-- Rewards/chosen: -11.2880
-- Rewards/rejected: -11.0941
-- Rewards/accuracies: 0.4989
-- Rewards/margins: -0.1939
-- Logps/rejected: -58.5379
-- Logps/chosen: -58.4188
-- Logits/rejected: 8.0116
-- Logits/chosen: 8.0111
 ## Model description
@@ -44,7 +44,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 1e-05
 - train_batch_size: 2
 - eval_batch_size: 1
 - seed: 42
@@ -59,16 +59,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 2.9156        | 0.1   | 100  | 3.4899          | -15.5009       | -15.3977         | 0.4725             | -0.1031         | -72.8833       | -72.4618     | 42.9070         | 42.9069       |
-| 2.1578        | 0.2   | 200  | 2.2150          | -9.7540        | -9.4579          | 0.4549             | -0.2961         | -53.0838       | -53.3056     | 12.3838         | 12.3837       |
-| 2.8766        | 0.29  | 300  | 2.2464          | -12.1892       | -11.8037         | 0.4593             | -0.3856         | -60.9030       | -61.4230     | 14.3112         | 14.3107       |
-| 2.5297        | 0.39  | 400  | 1.8130          | -10.1066       | -9.8997          | 0.4879             | -0.2069         | -54.5563       | -54.4807     | 10.1726         | 10.1723       |
-| 2.2297        | 0.49  | 500  | 1.9150          | -10.8801       | -10.6958         | 0.4923             | -0.1843         | -57.2102       | -57.0592     | 9.5803          | 9.5795        |
-| 1.8734        | 0.59  | 600  | 1.6431          | -11.3789       | -11.2312         | 0.4835             | -0.1477         | -58.9947       | -58.7219     | 11.7059         | 11.7051       |
-| 1.7922        | 0.68  | 700  | 1.6689          | -12.0398       | -11.8183         | 0.4879             | -0.2215         | -60.9518       | -60.9247     | 8.8943          | 8.8938        |
-| 1.268         | 0.78  | 800  | 1.5363          | -11.2805       | -11.0909         | 0.5033             | -0.1895         | -58.5271       | -58.3937     | 8.4559          | 8.4554        |
-| 1.7688        | 0.88  | 900  | 1.5253          | -11.2773       | -11.0831         | 0.4945             | -0.1942         | -58.5010       | -58.3831     | 8.0320          | 8.0315        |
-| 1.0648        | 0.98  | 1000 | 1.5250          | -11.2880       | -11.0941         | 0.4989             | -0.1939         | -58.5379       | -58.4188     | 8.0116          | 8.0111        |
 ### Framework versions

 This model is a fine-tuned version of [mosaicml/mpt-7b-instruct](https://huggingface.co/mosaicml/mpt-7b-instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.6933
+- Rewards/chosen: -0.0008
+- Rewards/rejected: -0.0019
+- Rewards/accuracies: 0.5187
+- Rewards/margins: 0.0011
+- Logps/rejected: -21.5638
+- Logps/chosen: -20.7947
+- Logits/rejected: 14.2524
+- Logits/chosen: 14.2550
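
The metric names above (Rewards/chosen, Rewards/rejected, Rewards/margins) resemble those logged by a DPO-style preference trainer, although the card itself does not name the training method. Under that assumption, the margin is the chosen reward minus the rejected reward, and the per-example sigmoid loss is `-log(sigmoid(margin))`, which equals ln 2 when the margin is zero. The sketch below only checks the reported numbers against those two relations; `dpo_sigmoid_loss` is a hypothetical helper, not a function from any library.

```python
import math


def dpo_sigmoid_loss(margin: float) -> float:
    """Return -log(sigmoid(margin)); adequate for the small margins used here."""
    return math.log1p(math.exp(-margin))


# Values copied from the updated evaluation summary in this diff.
rewards_chosen = -0.0008
rewards_rejected = -0.0019

# Assumption: margin = chosen reward - rejected reward.
margin = rewards_chosen - rewards_rejected
print(f"margin = {margin:.4f}")

# Loss of a policy that does not separate chosen from rejected at all.
print(f"loss at zero margin = {dpo_sigmoid_loss(0.0):.4f}")
```

The computed margin rounds to the reported 0.0011, and the reported loss of 0.6933 sits within about 0.0002 of ln 2. Together with an accuracy of 0.5187, this is consistent with a policy that has barely moved from its starting point, which would be plausible at a learning rate of 1e-08. The reported loss is an average over evaluation examples, so it need not equal the loss evaluated at the average margin.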
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 1e-08
 - train_batch_size: 2
 - eval_batch_size: 1
 - seed: 42
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6965        | 0.1   | 100  | 0.6951          | -0.0017        | 0.0013           | 0.4681             | -0.0029         | -21.5532       | -20.7977     | 14.2557         | 14.2583       |
+| 0.6918        | 0.2   | 200  | 0.6942          | -0.0054        | -0.0044          | 0.5011             | -0.0010         | -21.5722       | -20.8104     | 14.2575         | 14.2601       |
+| 0.6965        | 0.29  | 300  | 0.6941          | -0.0016        | -0.0010          | 0.4945             | -0.0006         | -21.5608       | -20.7975     | 14.2549         | 14.2575       |
+| 0.6906        | 0.39  | 400  | 0.6946          | 0.0001         | 0.0020           | 0.4747             | -0.0019         | -21.5507       | -20.7919     | 14.2494         | 14.2520       |
+| 0.6883        | 0.49  | 500  | 0.6972          | -0.0019        | 0.0050           | 0.4484             | -0.0069         | -21.5408       | -20.7986     | 14.2521         | 14.2547       |
+| 0.6867        | 0.59  | 600  | 0.6969          | -0.0054        | 0.0010           | 0.4418             | -0.0064         | -21.5541       | -20.8103     | 14.2502         | 14.2528       |
+| 0.6937        | 0.68  | 700  | 0.6939          | 0.0015         | 0.0020           | 0.5275             | -0.0005         | -21.5508       | -20.7871     | 14.2547         | 14.2573       |
+| 0.6855        | 0.78  | 800  | 0.6933          | -0.0008        | -0.0017          | 0.5099             | 0.0009          | -21.5631       | -20.7947     | 14.2522         | 14.2548       |
+| 0.6918        | 0.88  | 900  | 0.6933          | -0.0008        | -0.0019          | 0.5187             | 0.0011          | -21.5638       | -20.7947     | 14.2524         | 14.2550       |
+| 0.6957        | 0.98  | 1000 | 0.6933          | -0.0008        | -0.0019          | 0.5187             | 0.0011          | -21.5638       | -20.7947     | 14.2524         | 14.2550       |
 ### Framework versions

final_checkpoint/model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f85dd251dd626fdb73ead11607afecd33b2eccbbd0188a354b7b72ebf539fd29
 size 4976746424

 version https://git-lfs.github.com/spec/v1
+oid sha256:18bd2f664b12d6db09cf833656ae9a726999112b6713494e5b0db5937bad20e3
 size 4976746424

final_checkpoint/model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:680fc811f2d4e36b7ed53f83f83770b202ef7146af87f569f7c4d75f6b229d12
 size 4966260992

 version https://git-lfs.github.com/spec/v1
+oid sha256:1d35ccdd8c1b28bfbf54f541d22882e3ea454d4fe2aa26cf7b115d7b0fe1db8a
 size 4966260992

final_checkpoint/model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:675d04898fb6d1bf9f999b7566532ab87f31bb2411a1a0be6679bcba94b252e9
 size 3355588232

 version https://git-lfs.github.com/spec/v1
+oid sha256:27c75b35f7faa707a36c75a3226633f4a41ea0d492ddb5d897c3bbf74506dc0f
 size 3355588232

model-00001-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f85dd251dd626fdb73ead11607afecd33b2eccbbd0188a354b7b72ebf539fd29
 size 4976746424

 version https://git-lfs.github.com/spec/v1
+oid sha256:18bd2f664b12d6db09cf833656ae9a726999112b6713494e5b0db5937bad20e3
 size 4976746424

model-00002-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:680fc811f2d4e36b7ed53f83f83770b202ef7146af87f569f7c4d75f6b229d12
 size 4966260992

 version https://git-lfs.github.com/spec/v1
+oid sha256:1d35ccdd8c1b28bfbf54f541d22882e3ea454d4fe2aa26cf7b115d7b0fe1db8a
 size 4966260992

model-00003-of-00003.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:675d04898fb6d1bf9f999b7566532ab87f31bb2411a1a0be6679bcba94b252e9
 size 3355588232

 version https://git-lfs.github.com/spec/v1
+oid sha256:27c75b35f7faa707a36c75a3226633f4a41ea0d492ddb5d897c3bbf74506dc0f
 size 3355588232

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2b6321d2a9e526777d6d183b79b35ef1b3d7a41179b3b7f9063c490d45fa8bae
 size 4475

 version https://git-lfs.github.com/spec/v1
+oid sha256:50b91ebb5dc97f9d92c9f28965a02db4b8cbe8d571b55177f03beb2199631458
 size 4475
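
The binary files in this commit are stored as Git LFS pointers, each recording a spec version, a SHA-256 `oid`, and a byte size. As a minimal sketch, a downloaded file could be checked against its pointer as shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and introduced only for illustration.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(
        line.strip().split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file has the size and SHA-256 digest in the pointer."""
    file_path = Path(path)
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so multi-gigabyte shards are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from this diff.
pointer = parse_lfs_pointer(
    """\
version https://git-lfs.github.com/spec/v1
oid sha256:50b91ebb5dc97f9d92c9f28965a02db4b8cbe8d571b55177f03beb2199631458
size 4475
"""
)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("training_args.bin", pointer)` on a local copy would then report whether the download corresponds to this commit. In this diff the three shards under `final_checkpoint/` carry the same `oid` values as the top-level shards, so the two sets of files are byte-identical.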