license: mit
language:
- zh
- en
datasets:
- dzunggg/legal-qa-v1
- ibunescu/qa_legal_dataset_train
metrics:
- accuracy
pipeline_tag: text-generation
library_name: adapter-transformers
tags:
- legal
LLaMA3-8B-Lawyer
This project involves fine-tuning the LLaMA3-8B model using the dzunggg/legal-qa-v1
dataset. The fine-tuning was conducted with the LLaMA-Factory toolkit on a single NVIDIA L20-48G GPU. The fine-tuned model has been uploaded to Hugging Face and is available at StevenChen16/llama3-8b-Lawyer.
Project Overview
The primary goal of this project was to create a high-performing legal question-answering model based on LLaMA3-8B. By leveraging the dzunggg/legal-qa-v1
dataset and the capabilities of LLaMA-Factory, we were able to fine-tune the model effectively. The AI model can function like a lawyer, asking detailed questions about the case background and making judgments based on the provided information.
Fine-Tuning Details
Model
- Base Model:
nvidia/Llama3-ChatQA-1.5-8B
- Fine-Tuned Model:
StevenChen16/llama3-8b-Lawyer
Dataset
- Dataset Used:
dzunggg/legal-qa-v1
Training Configuration
args = dict(
stage="sft", # do supervised fine-tuning
do_train=True,
model_name_or_path="nvidia/Llama3-ChatQA-1.5-8B", # use bnb-4bit-quantized Llama-3-8B-Instruct model
dataset="legal_qa_v1_train", # use legal_qa_v1_train dataset
template="llama3", # use llama3 prompt template
finetuning_type="lora", # use LoRA adapters to save memory
lora_target="all", # attach LoRA adapters to all linear layers
output_dir="llama3_lora", # the path to save LoRA adapters
per_device_train_batch_size=8, # the batch size
gradient_accumulation_steps=6, # the gradient accumulation steps
lr_scheduler_type="cosine", # use cosine learning rate scheduler
logging_steps=10, # log every 10 steps
warmup_ratio=0.1, # use warmup scheduler
save_steps=1000, # save checkpoint every 1000 steps
learning_rate=1e-4, # the learning rate
num_train_epochs=10.0, # the epochs of training
max_samples=500, # use 500 examples in each dataset
max_grad_norm=1.0, # clip gradient norm to 1.0
quantization_bit=8, # use 8-bit quantization
loraplus_lr_ratio=16.0, # use LoRA+ algorithm with lambda=16.0
use_unsloth=True, # use UnslothAI's LoRA optimization for 2x faster training
fp16=True, # use float16 mixed precision training
overwrite_output_dir=True,
)
Hardware
- GPU: NVIDIA L20-48G
Usage
You can load and use the fine-tuned model from Hugging Face as follows:
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "StevenChen16/llama3-8b-Lawyer"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
# Example usage
input_text = "Your legal question here."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Example Interaction
The model can engage in a detailed interaction, simulating the behavior of a lawyer. Provide the case background, and the model will ask for more details to make informed judgments.
Example
input_text = "I have a contract dispute where the other party did not deliver the promised goods."
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Output:
Can you provide more details about the contract terms and the goods that were supposed to be delivered? Were there any specific deadlines mentioned in the contract?
Training Notebook and Repository
- Training Notebook: Google Colab Notebook
- GitHub Repository: lawyer-llama3-8b
Results
The fine-tuned model has shown promising results in understanding and answering legal questions. By leveraging advanced techniques such as LoRA and UnslothAI optimizations, the training process was efficient and effective, ensuring a high-quality model output.
Acknowledgements
- LLaMA-Factory
- Dataset:
dzunggg/legal-qa-v1
- Base Model:
nvidia/Llama3-ChatQA-1.5-8B
- Hosted on Hugging Face
License
This project is licensed under the MIT License. See the LICENSE file for details.