# Llama-2-7B-Chat Fine-Tuned Model
This model is a fine-tuned version of the Llama-2-7B-Chat model, adapted for instruction-following tasks. It was trained on the mlabonne/guanaco-llama2-1k dataset and supports efficient text generation across a range of NLP tasks, including question answering, summarization, and text completion.
## Model Details
- Base Model: NousResearch/Llama-2-7b-chat-hf
- Fine-Tuning Task: Instruction-following
- Training Dataset: mlabonne/guanaco-llama2-1k
- Optimized For: Text generation, question answering, summarization, and more.
- Fine-Tuning Setup (see the training sketch after this list):
  - LoRA (Low-Rank Adaptation) applied for efficient training with smaller parameter updates.
  - Quantized to 4-bit for memory efficiency and better GPU utilization.
  - Training used gradient accumulation, gradient checkpointing, and weight decay to prevent overfitting and improve memory efficiency.
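The card does not publish the exact training script or hyperparameters, but the setup above matches the common QLoRA recipe for this dataset. Below is a minimal sketch of such a run using `transformers`, `peft`, and `trl`; all hyperparameter values (LoRA rank, batch size, learning rate, etc.) are illustrative assumptions rather than the values actually used, and the `SFTTrainer` keyword arguments assume an older `trl` release (newer versions move them into `SFTConfig`).

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

# 4-bit (NF4) quantization for memory-efficient loading of the base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NousResearch/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token

# LoRA: train small low-rank adapter matrices instead of the full weights
# (r, alpha, and dropout here are illustrative, not the card's values)
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

# Gradient accumulation/checkpointing and weight decay, as described above
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    weight_decay=0.001,
    learning_rate=2e-4,
    num_train_epochs=1,
)

dataset = load_dataset("mlabonne/guanaco-llama2-1k", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
```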
## Usage
You can use this fine-tuned model with the Hugging Face `transformers` library. Below is an example of loading the model and generating text.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the fine-tuned model and tokenizer by repo ID (not the full URL)
tokenizer = AutoTokenizer.from_pretrained("devshaheen/llama-2-7b-chat-finetune")
model = AutoModelForCausalLM.from_pretrained("devshaheen/llama-2-7b-chat-finetune")

# Example text generation
input_text = "What is the capital of France?"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)  # cap the response length
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```
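Since the mlabonne/guanaco-llama2-1k dataset formats its examples with Llama-2's `[INST] ... [/INST]` instruction tags, prompts wrapped the same way will likely yield better completions. Here is a short sketch using the `pipeline` API; the prompt format is an assumption based on the dataset, not something the card states explicitly.

```python
from transformers import pipeline

# guanaco-llama2-1k wraps prompts in Llama-2's [INST] tags (assumption),
# so matching that format at inference time usually helps
prompt = "[INST] What is the capital of France? [/INST]"
generator = pipeline("text-generation", model="devshaheen/llama-2-7b-chat-finetune")
result = generator(prompt, max_new_tokens=128)
print(result[0]["generated_text"])
```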