---
base_model: meta-llama/Llama-3.2-1B
library_name: peft
datasets:
- cjziems/Article-Bias-Prediction
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- Bias
- News
- Articles
- Political
---
# **Llama-3.2-1B (Political Bias Detection)**
## **Overview**
This model is designed to detect potential political bias in news articles. Given a text passage (e.g., a news article), the model returns probabilities indicating whether the text leans to the *Left*, *Center*, or *Right* of the political spectrum.
## **Model Description**
### **Model Architecture**
- **Base Model**: [meta-llama/Llama-3.2-1B](https://huggingface.co./meta-llama/Llama-3.2-1B)
- **Adapters**: LoRA (Low-Rank Adaptation)
- **Precision**: 4-bit quantization enabled for efficient inference and training (with nested/double quantization).
### **Intended Use**
- **Primary**: Given the text of a news article, the model outputs probabilities corresponding to three political bias labels:
- **LABEL_0**: Left
- **LABEL_1**: Center
- **LABEL_2**: Right
- **Usage Scenarios**:
- Media research and analytics
- Automated or semi-automated political bias detection in digital news
- Educational or journalistic explorations of bias
> **Note**: This model is *not* an authoritative arbiter of political bias. It can be used as a *supplementary* tool to help flag potential leanings.
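For downstream use, the numeric label IDs can be translated into readable leanings. A minimal helper (the mapping itself follows the label list above; the helper name is illustrative):

```python
# Map the pipeline's label IDs to human-readable political leanings.
# The mapping follows the label list above (LABEL_0 = Left, etc.).
ID2LEANING = {
    "LABEL_0": "Left",
    "LABEL_1": "Center",
    "LABEL_2": "Right",
}

def leaning_of(label: str) -> str:
    """Translate a pipeline label such as 'LABEL_2' into its leaning."""
    return ID2LEANING[label]

print(leaning_of("LABEL_2"))  # → Right
```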
---
## **How to Use**
Below is a sample code snippet demonstrating how to load the model and apply LoRA adapters for classification:
```python
import transformers
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

# 1. Load the *base* LLaMA model for sequence classification
base_model_name = "meta-llama/Llama-3.2-1B"
access_token = "YOUR_HF_ACCESS_TOKEN"  # If needed (the base model is gated)

model = AutoModelForSequenceClassification.from_pretrained(
    base_model_name,
    token=access_token,
    num_labels=3,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model_name, token=access_token)

# LLaMA has no padding token by default; reuse the EOS token
tokenizer.pad_token = tokenizer.eos_token
model.config.pad_token_id = tokenizer.pad_token_id

# 2. Load the LoRA adapter on top of the base model
adapter_path = "tzoulio/news-bias-finder-llama-3.2-1B"
model = PeftModel.from_pretrained(model, adapter_path)

# 3. Create the pipeline with the specified model and tokenizer
classifier = transformers.pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer
)

# Example usage
text = "Insert the news article text here..."
prediction = classifier(text)
print(prediction)
```
### **Input / Output Details**
**Input**: A single string containing the text of a news article.
**Output**: A list of dictionaries, where each dictionary contains:
- "label": The predicted label (e.g., "LABEL_2")
- "score": The probability for that label.

By default the pipeline returns only the highest-scoring label; pass `top_k=None` to receive scores for all three labels.
```text
# Default (top label only):
[{'label': 'LABEL_2', 'score': 0.47}]
# With top_k=None:
[{'label': 'LABEL_2', 'score': 0.47}, {'label': 'LABEL_1', 'score': 0.30}, {'label': 'LABEL_0', 'score': 0.23}]
# i.e., 47% chance of Right, 30% Center, 23% Left
```
## **Training & Fine-tuning**
### **Dataset Sizes**
- **Training Set**: 17,984 examples
- **Evaluation Set**: 4,496 examples
- **Test Set**: 5,620 examples
### **Hyperparameters and Important Settings**
```python
import torch  # needed for bnb_4bit_compute_dtype

# Precision & Quantization
load_in_4bit = True
bnb_4bit_use_double_quant = True
bnb_4bit_quant_type = "nf4"
bnb_4bit_compute_dtype = torch.bfloat16
# LoRA Configuration
lora_r = 16
lora_alpha = 64
lora_dropout = 0.1
bias = "none"
# Task Type
task_type = "SEQ_CLS"
# Training Setup
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
learning_rate = 2e-4
optim = "paged_adamw_32bit"
num_train_epochs = 3
warmup_steps = 2
fp16 = True
logging_steps = 1
```
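With these settings, the effective batch size and optimizer step counts work out as follows (a quick sanity check assuming a single GPU; the variable names mirror the settings above):

```python
import math

# Settings from the hyperparameter listing above
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
num_train_epochs = 3
train_examples = 17_984  # training set size from the Dataset Sizes section

effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps
steps_per_epoch = math.ceil(train_examples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(effective_batch_size, steps_per_epoch, total_steps)  # → 16 1124 3372
```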
## **Evaluation**
### **Metrics**
We report the F1-score on each dataset split.
## **Results**
- F1-Score (Training): 0.96658
- F1-Score (Eval): 0.96664
- F1-Score (Test): 0.96299