# Linear Scaled RoPE LLaMA LoRA 16k
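The base LLaMA context of 2048 tokens is extended to 16384 by linearly scaling (interpolating) the RoPE position indices, presumably by a factor of 16384 / 2048 = 8. Below is a minimal sketch of the idea with illustrative names; the model's actual implementation ships in its remote modeling code.

```python
import torch

def linear_scaled_rope_angles(
    positions: torch.Tensor,          # integer position indices, shape (seq_len,)
    dim: int,                         # per-head embedding dimension
    base: float = 10000.0,
    scale: float = 16384 / 2048,      # assumed scale factor: new / original context length
) -> torch.Tensor:
    """Illustrative RoPE angles with linearly interpolated positions."""
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    scaled_pos = positions.float() / scale        # the "linear scaling" step
    return torch.outer(scaled_pos, inv_freq)      # angles used to build cos/sin caches
```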
Usage:

```python
import torch
from transformers import AutoModelForCausalLM, LlamaTokenizerFast

model_name = "jordiclive/scaled-llama-7b-lora-16k-rp2"

# trust_remote_code is required so the repo's scaled-RoPE modeling code is used.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

tokenizer = LlamaTokenizerFast.from_pretrained(model_name)
tokenizer.model_max_length = 16384
tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default
model.max_sequence_length = tokenizer.model_max_length
```
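With the model and tokenizer loaded, generation works as usual. A minimal sketch, with an illustrative prompt and generation settings (not from the model card):

```python
# Illustrative usage: move to GPU if available and generate a continuation.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

prompt = "Long-context language models are useful because"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```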
Base model: huggyllama/llama-7b
Trained for 1 epoch on packed 16k-token sequences from the RedPajama dataset. This is a merged model; if you need the LoRA parameters/config, they are in the `adapter` folder of the repo.
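Alternatively, the adapter can be applied to the base model with PEFT. A sketch, assuming the `adapter` folder holds a standard PEFT adapter (the `subfolder` argument reflects that assumption). Note that the plain base model does not include the RoPE scaling patch, so loading the merged checkpoint with `trust_remote_code=True` as above is the simpler path:

```python
# Hypothetical sketch: attach the LoRA weights from the `adapter` folder to the base model.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(
    base,
    "jordiclive/scaled-llama-7b-lora-16k-rp2",
    subfolder="adapter",  # assumed repo layout, per the note above
)
model = model.merge_and_unload()  # optionally fold the LoRA deltas into the base weights
```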