---
license: other
base_model: microsoft/phi-1_5
tags:
- generated_from_trainer
model-index:
- name: results
  results: []
---


# results

This model is a fine-tuned version of [microsoft/phi-1_5](https://huggingface.co./microsoft/phi-1_5) on an unspecified dataset.
It achieves the following results on the evaluation set (a consistency note follows the list):
- Loss: 0.0001
- Rewards/chosen: -7.5874
- Rewards/rejected: -24.0497
- Rewards/accuracies: 1.0
- Rewards/margins: 16.4623
- Logps/rejected: -274.3435
- Logps/chosen: -143.2090
- Logits/rejected: -1.8100
- Logits/chosen: -1.4786
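
For reference, the reported margin is simply the chosen reward minus the rejected reward, so the summary above is internally consistent. (The metric names follow TRL's DPO conventions; that this run used a DPO-style trainer is an assumption, since the training setup is not documented here.)

$$
\text{margins} = \text{rewards}_{\text{chosen}} - \text{rewards}_{\text{rejected}} = -7.5874 - (-24.0497) = 16.4623
$$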

## Model description

More information needed. Based on the logged Rewards/* and Logps/* metrics, which are characteristic of preference optimization (e.g., TRL's `DPOTrainer`), this appears to be a DPO-style fine-tune of microsoft/phi-1_5, though the training objective is not stated explicitly.

## Intended uses & limitations

More information needed
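
As a starting point, here is a minimal inference sketch. The hub repo id below is a placeholder (this card does not state where the checkpoint is published), and older `transformers` releases such as the 4.33.2 used for training may additionally require `trust_remote_code=True` for phi models:

```python
# Minimal inference sketch. "your-username/results" is a placeholder repo id;
# the actual hub path is not stated in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/results"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # phi-1_5 (1.3B params) fits on a single GPU
)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```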

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.0005
- train_batch_size: 4
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 1500
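
A sketch of how these values map onto Hugging Face `TrainingArguments`. This is a hypothetical reconstruction, not the author's actual script; that it drove TRL's `DPOTrainer` is an assumption based on the Rewards/* metrics above:

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed above; not the author's actual script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="results",
    learning_rate=5e-4,              # 0.0005
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=4,   # 4 * 4 = 16 total train batch size
    lr_scheduler_type="cosine",
    warmup_steps=100,
    max_steps=1500,                  # training_steps
)
```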

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.0417        | 0.07  | 100  | 0.0418          | -0.3892        | -8.0118          | 0.9792             | 7.6226          | -113.9640      | -71.2264     | 1.8258          | 1.7898        |
| 0.0221        | 0.15  | 200  | 0.0303          | -2.5657        | -10.9212         | 0.9896             | 8.3555          | -143.0585      | -92.9920     | 1.9704          | 2.1047        |
| 0.0107        | 0.22  | 300  | 0.0131          | -1.7388        | -11.6047         | 0.9965             | 9.8659          | -149.8935      | -84.7232     | 1.0731          | 0.9750        |
| 0.0204        | 0.29  | 400  | 0.0108          | -2.0131        | -11.9647         | 0.9965             | 9.9516          | -153.4932      | -87.4658     | 1.3610          | 1.6740        |
| 0.0067        | 0.36  | 500  | 0.0080          | -5.9488        | -19.6561         | 0.9974             | 13.7073         | -230.4076      | -126.8228    | -0.4464         | -0.2114       |
| 0.0           | 0.44  | 600  | 0.0047          | -5.6456        | -20.2381         | 0.9983             | 14.5924         | -236.2268      | -123.7909    | -0.4142         | -0.0244       |
| 0.0003        | 0.51  | 700  | 0.0018          | -7.2250        | -21.3351         | 0.9991             | 14.1101         | -247.1974      | -139.5853    | -0.3510         | -0.0203       |
| 0.0005        | 0.58  | 800  | 0.0008          | -7.2263        | -21.2475         | 0.9991             | 14.0211         | -246.3209      | -139.5981    | -0.8673         | -0.7010       |
| 0.0           | 0.66  | 900  | 0.0009          | -10.2371       | -26.0402         | 0.9991             | 15.8031         | -294.2486      | -169.7062    | -1.9784         | -1.7799       |
| 0.0           | 0.73  | 1000 | 0.0008          | -5.9544        | -22.0767         | 0.9991             | 16.1223         | -254.6137      | -126.8789    | -1.0623         | -0.6039       |
| 0.0           | 0.8   | 1100 | 0.0007          | -7.3374        | -23.8700         | 0.9991             | 16.5327         | -272.5467      | -140.7083    | -1.5517         | -1.1710       |
| 0.0           | 0.87  | 1200 | 0.0007          | -7.6398        | -24.1605         | 0.9991             | 16.5207         | -275.4509      | -143.7327    | -1.8124         | -1.4901       |
| 0.0           | 0.95  | 1300 | 0.0001          | -7.5920        | -24.0476         | 1.0                | 16.4556         | -274.3220      | -143.2550    | -1.8115         | -1.4816       |
| 0.0001        | 1.02  | 1400 | 0.0001          | -7.5872        | -24.0480         | 1.0                | 16.4608         | -274.3262      | -143.2065    | -1.8102         | -1.4791       |
| 0.0           | 1.09  | 1500 | 0.0001          | -7.5874        | -24.0497         | 1.0                | 16.4623         | -274.3435      | -143.2090    | -1.8100         | -1.4786       |


### Framework versions

- Transformers 4.33.2
- Pytorch 2.0.1+cu118
- Datasets 2.14.5
- Tokenizers 0.13.3