nghiemhnlp/HateCOT_LLAMA_7B

Update

The paper has been accepted to EMNLP 2024 Findings: https://aclanthology.org/2024.findings-emnlp.343/

Introduction

This is the LoRA adapter for Llama-7B introduced in the paper HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models. The base model is instruction-finetuned on 52,000 samples that include augmented human annotations, training it to produce legible explanations based on the predefined criteria in the provided definitions.

To use the model, load this adapter together with the original Llama model (the quantization configuration is listed under Training procedure). For instructions on loading PEFT models, see: https://huggingface.co./docs/transformers/main/en/peft
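As a rough illustration of the loading step, the sketch below attaches the adapter to its base model with the `peft` and `transformers` libraries. The helper name `load_hatecot` is hypothetical, and the sketch assumes that the adapter's configuration records the base checkpoint and that `bitsandbytes` and `accelerate` are installed for 8-bit loading; pass `base_id` explicitly if the base checkpoint should be chosen by hand.

```python
from typing import Optional

ADAPTER_ID = "nghiemhnlp/HateCOT_LLAMA_7B"  # repository name of this adapter


def load_hatecot(adapter_id: str = ADAPTER_ID, base_id: Optional[str] = None):
    """Return (tokenizer, model) with the LoRA adapter attached to its base model.

    Hypothetical helper for illustration; it is not part of the released code.
    """
    # Imported lazily so the heavy dependencies are only needed when loading.
    from peft import PeftConfig, PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    if base_id is None:
        # Assumption: the adapter config names the base checkpoint it was trained on.
        base_id = PeftConfig.from_pretrained(adapter_id).base_model_name_or_path

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        # 8-bit loading, in line with the training-time quantization settings.
        quantization_config=BitsAndBytesConfig(load_in_8bit=True),
        device_map="auto",  # requires the accelerate package
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout in the adapter layers
    return tokenizer, model
```

Downloading the base weights requires that access to the original Llama repository has already been granted to the account in use.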

These adapters can also be finetuned on a new set of data. See the article for more details.

Usage

Use the following template to prompt the model:

### Instruction
Perform this task by considering the following Definitions.
Based on the message, label the input as only one of the following categories:
[Class 1], [Class 2], ..., or [Class N].
Provide a brief paragraph to explain step-by-step why the post should be classsified
with the provided Label based on the given Definitions. If this post targets a group or
entity relevant to the definition of the specified Label, explain who this target is and how
that leads to that Label.
Append the string '<END>' to the end of your response. Provide your response in the following format:
EXPLANATION: [text]
LABEL:[text] <END>
### Definitions:
[Class 1]: [Definition 1]
[Class 2]: [Definition 2]
...
[Class N]: [Definition N]
### Input
{post}
### Response:
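
The template can be filled in programmatically. The following is a minimal sketch, not the authors' code: `build_prompt` and `parse_response` are hypothetical helper names, the bracketed items are treated as placeholders to be replaced by class names and definitions, and the line breaks and the trailing newline after `### Response:` are assumptions. The instruction text is copied verbatim from the template, including its spelling, in case the adapter is sensitive to the exact wording.

```python
import re
from typing import Mapping, Optional, Tuple

# Fixed instruction lines, copied verbatim from the template.
EXPLANATION_INSTRUCTIONS = [
    "Provide a brief paragraph to explain step-by-step why the post should be classsified",
    "with the provided Label based on the given Definitions. If this post targets a group or",
    "entity relevant to the definition of the specified Label, explain who this target is and how",
    "that leads to that Label.",
    "Append the string '<END>' to the end of your response. "
    "Provide your response in the following format:",
    "EXPLANATION: [text]",
    "LABEL:[text] <END>",
]


def build_prompt(definitions: Mapping[str, str], post: str) -> str:
    """Fill the template with class definitions and the post to classify."""
    labels = list(definitions)
    if len(labels) < 2:
        raise ValueError("at least two classes are required")
    # Renders "A, B, or C." following the "[Class 1], ..., or [Class N]." pattern.
    categories = ", ".join(labels[:-1]) + f", or {labels[-1]}."
    lines = [
        "### Instruction",
        "Perform this task by considering the following Definitions.",
        "Based on the message, label the input as only one of the following categories:",
        categories,
        *EXPLANATION_INSTRUCTIONS,
        "### Definitions:",
        *[f"{label}: {text}" for label, text in definitions.items()],
        "### Input",
        post,
        "### Response:",
    ]
    return "\n".join(lines) + "\n"  # trailing newline is an assumption


# Matches "EXPLANATION: ... LABEL: ... <END>"; the end marker is optional.
RESPONSE_RE = re.compile(
    r"EXPLANATION:\s*(?P<explanation>.*?)\s*LABEL:\s*(?P<label>.*?)\s*(?:<END>|$)",
    re.DOTALL,
)


def parse_response(text: str) -> Optional[Tuple[str, str]]:
    """Return (explanation, label), or None if the format is not found."""
    match = RESPONSE_RE.search(text)
    if match is None:
        return None
    return match.group("explanation"), match.group("label")
```

In practice, the string returned by `build_prompt` would be tokenized and passed to the model's `generate` method, and the decoded continuation handed to `parse_response`; cutting the output at the first `<END>` keeps any trailing text out of the label.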

Citation

@article{nghiem2024hatecot,
  title={HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models},
  author={Nghiem, Huy and Daum{\'e} III, Hal},
  journal={arXiv preprint arXiv:2403.11456},
  year={2024}
}

Original Model

Please visit the main repository to request access to the original model weights:

https://huggingface.co./meta-llama

Training procedure

The following bitsandbytes quantization config was used during training:

quant_method: bitsandbytes
load_in_8bit: True
load_in_4bit: False
llm_int8_threshold: 6.0
llm_int8_skip_modules: None
llm_int8_enable_fp32_cpu_offload: False
llm_int8_has_fp16_weight: False
bnb_4bit_quant_type: nf4
bnb_4bit_use_double_quant: False
bnb_4bit_compute_dtype: float16
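
For reference, the values above can be mapped onto a `BitsAndBytesConfig` object from `transformers`. This is a sketch under the assumption that the installed `transformers` version accepts all of these keyword arguments; `BNB_SETTINGS` and `make_bnb_config` are hypothetical names introduced here for illustration.

```python
# Settings transcribed from the list above (quant_method is implied by the class).
BNB_SETTINGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float16",  # resolved to a torch dtype by the config
}


def make_bnb_config(settings=BNB_SETTINGS):
    """Build a BitsAndBytesConfig from the settings dictionary."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**settings)
```

The resulting object can be passed as `quantization_config` when loading the base model with `AutoModelForCausalLM.from_pretrained`. Since 8-bit loading is active, the `bnb_4bit_*` entries are recorded but have no effect.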

Framework versions

PEFT 0.5.0