---
license: apache-2.0
language:
  - en
base_model:
  - google-t5/t5-base
pipeline_tag: summarization
---

# LoRA Fine-Tuned Model for Dialogue Summarization

- **Model type:** Seq2Seq with Low-Rank Adaptation (LoRA)
- **Base model:** google-t5/t5-base

## Model Details

- **Architecture:** T5-base
- **Fine-tuning technique:** LoRA (Low-Rank Adaptation)
- **PEFT method:** Parameter-Efficient Fine-Tuning
- **Training data:** SAMSum dataset (dialogue summarization)
- **Metrics:** ROUGE (ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum)

## Intended Use

This model is designed for summarizing dialogues, such as conversations between individuals in a chat or messaging context. It’s suitable for applications in:

- **Customer service:** Summarizing chat logs for quality monitoring or training.
- **Messaging apps:** Generating conversation summaries for user convenience.
- **Content creation:** Assisting writers by summarizing character dialogues.
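A minimal inference sketch is shown below. The adapter repository ID `dnzblgn/Chat-Summarization` is an assumption based on this repo's name; replace it with the actual adapter path. The heavy libraries are imported lazily so the prompt helper stays dependency-free:

```python
def build_input(dialogue: str) -> str:
    # T5 is a text-to-text model; prefix the dialogue with the task instruction.
    return "summarize: " + dialogue.strip()

def summarize(dialogue: str, adapter_id: str = "dnzblgn/Chat-Summarization") -> str:
    # adapter_id is an assumption -- point it at the actual adapter repository.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
    base = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")
    model = PeftModel.from_pretrained(base, adapter_id)  # attach LoRA weights

    inputs = tokenizer(build_input(dialogue), return_tensors="pt", truncation=True)
    ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

Because LoRA stores only the adapter deltas, the base T5 weights are downloaded separately and merged at load time by `PeftModel.from_pretrained`.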

## Training Process

- **Optimizer:** AdamW with a learning rate of 3e-5
- **Batch size:** 4 per device, with gradient accumulation over 2 steps (effective batch size 8)
- **Training epochs:** 2
- **Evaluation metrics:** ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum
- **Hardware:** a single GPU, using mixed-precision training to optimize performance
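For intuition, ROUGE-1 is the F1 score over unigram overlap between a reference summary and a generated one. Below is a toy, simplified version for illustration only; actual evaluation should use a library such as `rouge_score` or `evaluate`, which also handle stemming and the longest-common-subsequence logic behind ROUGE-L:

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1: F1 of unigram overlap (no stemming or tokenization rules)."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped count of shared unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For example, a candidate that reproduces half of the reference's unigrams with perfect precision scores an F1 of 2/3.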

The model was trained with the `Seq2SeqTrainer` class from 🤗 Transformers, with LoRA adapters applied to selected attention projections so that only a small fraction of the parameters is updated, reducing compute and memory cost with little loss in accuracy.
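The setup above could be reproduced with something like the following sketch. The hyperparameters come from this card; the LoRA rank, alpha, dropout, target modules, and sequence lengths are assumptions, since the card does not state them:

```python
from datasets import load_dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
base = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

# LoRA on the query/value attention projections (T5 names them q, k, v, o).
# Rank, alpha, and dropout are assumptions -- the card does not state them.
lora_cfg = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q", "v"],
)
model = get_peft_model(base, lora_cfg)

def preprocess(batch):
    # Sequence lengths are assumptions; the task prefix matches T5 conventions.
    inputs = tokenizer(["summarize: " + d for d in batch["dialogue"]],
                       max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

samsum = load_dataset("samsum").map(preprocess, batched=True)

# Hyperparameters below are the ones stated in this card.
args = Seq2SeqTrainingArguments(
    output_dir="t5-samsum-lora",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,  # effective batch size 8
    num_train_epochs=2,
    fp16=True,  # mixed precision on a single GPU
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=samsum["train"],
    eval_dataset=samsum["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

With `r=8` on only the q/v projections, the trainable parameter count is a small fraction of T5-base's ~220M weights, which is what makes fine-tuning feasible at batch size 4 on a single GPU.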