---
library_name: transformers
license: other
base_model: meta-llama/Llama-3.2-1B
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: GuardReasoner 1B
  results: []
pipeline_tag: text-classification
language:
- en
metrics:
- f1
---
# GuardReasoner 1B
This model is a fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B), trained via R-SFT (reasoning supervised fine-tuning) and HS-DPO (hard-sample direct preference optimization), as described in the paper [GuardReasoner: Towards Reasoning-based LLM Safeguards](https://arxiv.org/abs/2501.18492).

The R-SFT training data is available in the GuardReasonerTrain dataset.
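A minimal inference sketch with `transformers` is shown below. Note that the repository id `yueliu1999/GuardReasoner-1B` and the prompt template are assumptions for illustration, not confirmed by this card; check the paper's official repository for the exact model id and guard prompt format.

```python
# Sketch: use GuardReasoner 1B as a reasoning-based safety classifier.
# The model id and prompt wording below are ASSUMPTIONS; adapt them
# to the official GuardReasoner release before use.

def build_prompt(user_msg: str, assistant_msg: str) -> str:
    """Build a hypothetical reasoning-style guard prompt for one exchange."""
    return (
        "You are a guard model. Analyze step by step whether the exchange "
        "below is harmful, then give a final verdict.\n\n"
        f"User: {user_msg}\n"
        f"Assistant: {assistant_msg}\n"
        "Analysis:"
    )

def classify(user_msg: str, assistant_msg: str,
             model_id: str = "yueliu1999/GuardReasoner-1B") -> str:
    """Generate the model's step-by-step analysis and verdict as text."""
    # Imported here so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(user_msg, assistant_msg),
                       return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```

Because the model was trained to reason before judging, the decoded output contains the analysis steps followed by the verdict, rather than a bare label.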
If you use this model, please cite:

```bibtex
@article{GuardReasoner,
  title={GuardReasoner: Towards Reasoning-based LLM Safeguards},
  author={Liu, Yue and Gao, Hongcheng and Zhai, Shengfang and Xia, Jun and Wu, Tianyi and Xue, Zhiwei and Chen, Yulin and Kawaguchi, Kenji and Zhang, Jiaheng and Hooi, Bryan},
  journal={arXiv preprint arXiv:2501.18492},
  year={2025}
}
```