FactAlign Collection
Models and datasets from our EMNLP 2024 paper "FactAlign: Long-form Factuality Alignment of Large Language Models".
This model was aligned with our FactAlign framework to improve long-form factuality, starting from microsoft/Phi-3-mini-4k-instruct.
For more information, please refer to our paper, "FactAlign: Long-form Factuality Alignment of Large Language Models".
This model is a fine-tuned version of microsoft/Phi-3-mini-4k-instruct, trained on the trl-lib/kto-mix-14k and chaoweihuang/lf-response-phi3-f1_100_0.7-fg0.5 datasets. Training progress and evaluation-set metrics are shown below:
Training loss | Epoch | Step | Validation loss | Rewards/chosen | Logps/chosen | Rewards/rejected | Logps/rejected | Rewards/margins | KL | FG KL | FG rewards/chosen (sum) | FG logps/policy (chosen) | FG logps/reference (chosen) | FG count (chosen) | FG rewards/rejected (sum) | FG logps/policy (rejected) | FG logps/reference (rejected) | FG count (rejected) | FG logps/policy (KL) | FG logps/reference (KL) | FG loss |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0.4495 | 0.4103 | 400 | 0.4978 | -1.0397 | -303.5076 | -2.7182 | -365.1212 | 1.6785 | 0.0054 | nan | -1.3184 | -16.1070 | -14.9295 | 16.0137 | -0.5732 | -20.2671 | -18.7868 | 4.0824 | -21.1826 | -20.2070 | 0.7449 |
0.5189 | 0.8206 | 800 | 0.4815 | -0.6601 | -299.7121 | -2.6435 | -364.3744 | 1.9834 | 0.0081 | nan | 0.0694 | -15.2781 | -14.9295 | 16.0137 | -0.3623 | -19.6552 | -18.7868 | 4.0824 | -21.1260 | -20.2070 | 0.7365 |
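The reward columns above follow the implicit-reward convention used by TRL's KTO-style trainers: a reward is β times the gap between the policy and reference sequence log-probabilities, and the margin is the chosen reward minus the rejected reward. A minimal sketch of that arithmetic, assuming β = 0.1 (a common TRL default; the value actually used for this model is not stated in the card) and hypothetical reference log-probs picked to roughly reproduce the first table row:

```python
# KTO-style implicit reward: r(x, y) = beta * (log p_policy(y|x) - log p_ref(y|x)).
# beta = 0.1 is assumed here; the actual training value is not given in the card.
BETA = 0.1

def kto_reward(logp_policy: float, logp_ref: float, beta: float = BETA) -> float:
    """Implicit reward from summed sequence log-probabilities."""
    return beta * (logp_policy - logp_ref)

# Hypothetical per-example log-probs, chosen so the numbers land near the
# first table row (rewards/chosen ~ -1.04, rewards/rejected ~ -2.72).
chosen = kto_reward(-303.5, -293.1)     # ~ -1.04
rejected = kto_reward(-365.1, -337.9)   # ~ -2.72
margin = chosen - rejected              # rewards/margins column
print(round(margin, 2))                 # → 1.68
```

Note that the fine-grained (FG) columns apply the same reward definition at the claim level and report per-example sums, which is why their magnitudes differ from the sequence-level columns.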
Base model: microsoft/Phi-3-mini-4k-instruct