roberta-large-fallacy-classification
This model is a fine-tuned version of roberta-large trained for logical fallacy detection on the Logical Fallacy Dataset. It classifies text into 13 types of logical fallacies.
Model Details
- Base Model: roberta-large
- Dataset: Logical Fallacy Dataset
- Number of Classes: 13
- Training Parameters (reproduced in the training sketch below):
  - Learning Rate: 5e-6 with a cosine decay scheduler
  - Batch Size: 8 (with gradient accumulation for an effective batch size of 16)
  - Weight Decay: 0.3
  - Label Smoothing: 0.1
  - Mixed Precision (FP16): Enabled
  - Early Stopping: Patience of 2 epochs
  - Training Duration: Approximately 10 epochs
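The following is a minimal sketch of how these hyperparameters could be expressed with the Hugging Face `Trainer` API. It is not the original training script: the dataset variables and output path are placeholders.

```python
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments, EarlyStoppingCallback)

# Hypothetical reconstruction of the training setup from the hyperparameters listed above
model = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=13)
tokenizer = AutoTokenizer.from_pretrained("roberta-large")

training_args = TrainingArguments(
    output_dir="roberta-large-fallacy-classification",  # placeholder path
    learning_rate=5e-6,
    lr_scheduler_type="cosine",
    per_device_train_batch_size=8,
    gradient_accumulation_steps=2,        # effective batch size of 16
    weight_decay=0.3,
    label_smoothing_factor=0.1,
    fp16=True,
    num_train_epochs=10,
    eval_strategy="epoch",                # "evaluation_strategy" in older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,          # required for early stopping
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,          # placeholder: tokenized Logical Fallacy Dataset split
    eval_dataset=eval_dataset,            # placeholder: tokenized validation split
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```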
Example Pipeline
To use the model for quick classification with a text pipeline:
```python
from transformers import pipeline
import torch

# Initialize the text classification pipeline with the specified model and tokenizer
model_path = "MidhunKanadan/roberta-large-fallacy-classification"
pipe = pipeline("text-classification", model=model_path, tokenizer=model_path, use_fast=False, device=0 if torch.cuda.is_available() else -1)

# Sample text to analyze for logical fallacies
text = "The rooster crows always before the sun rises, therefore the crowing rooster causes the sun to rise."
result = pipe(text)[0]  # Retrieve the first result (main prediction)

# Output the predicted label and confidence score
print(f"Predicted Label: {result['label']}\nScore: {result['score']:.4f}")
```
Expected Output:
```
Predicted Label: false causality
Score: 0.8938
```
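The pipeline also accepts a list of texts and returns one prediction per input. A brief sketch reusing the `pipe` object created above; the example sentences are illustrative only:

```python
# Hypothetical batch of inputs (illustrative sentences, not from the dataset)
texts = [
    "Everyone believes it, so it must be true.",
    "You can't trust his argument because he never finished school.",
]

# The pipeline returns one {"label", "score"} dict per input text
for text, result in zip(texts, pipe(texts)):
    print(f"{text} -> {result['label']} ({result['score']:.4f})")
```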
Full Classification Example
For more control, load the model and tokenizer directly and perform classification:
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model and tokenizer, set device
model_path = "MidhunKanadan/roberta-large-fallacy-classification"
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = AutoModelForSequenceClassification.from_pretrained(model_path).to(device).eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)

# Tokenize input and get probabilities
text = "The rooster crows always before the sun rises, therefore the crowing rooster causes the sun to rise."
inputs = tokenizer(text, return_tensors="pt").to(device)
with torch.no_grad():
    probabilities = torch.nn.functional.softmax(model(**inputs).logits, dim=1)[0]

# Output sorted results
for label, score in sorted(zip(model.config.id2label.values(), probabilities), key=lambda x: x[1], reverse=True):
    print(f"{label}: {score.item():.4f}")
```
Expected Output:
```
false causality: 0.8938
fallacy of logic: 0.0366
circular reasoning: 0.0127
equivocation: 0.0121
faulty generalization: 0.0104
fallacy of relevance: 0.0059
false dilemma: 0.0053
ad populum: 0.0053
fallacy of extension: 0.0042
fallacy of credibility: 0.0040
appeal to emotion: 0.0037
intentional: 0.0036
ad hominem: 0.0025
```
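For several inputs at once, the tokenizer can pad and truncate a whole batch in a single call. A minimal sketch reusing the `model`, `tokenizer`, and `device` from the example above; the sentences are illustrative only:

```python
# Hypothetical batch of inputs (illustrative sentences, not from the dataset)
texts = [
    "Everyone believes it, so it must be true.",
    "You can't trust his argument because he never finished school.",
]

# Pad and truncate the batch, then run a single forward pass
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to(device)
with torch.no_grad():
    probs = torch.nn.functional.softmax(model(**batch).logits, dim=1)

# Report the top prediction for each input
for text, p in zip(texts, probs):
    pred = model.config.id2label[int(p.argmax())]
    print(f"{text} -> {pred} ({p.max().item():.4f})")
```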
Dataset
- Dataset Name: Logical Fallacy Dataset
- Source: Logical Fallacy Dataset
- Number of Classes: 13 fallacies (e.g., ad hominem, appeal to emotion, faulty generalization, etc.)