Literary Classicist LLaMA 3 QLoRA

This repository contains a fine-tuned LLaMA 3 model, adapted with the QLoRA methodology to generate text in the style of literary classics from a given prompt.

Model Details

  • Base Model: Meta-LLaMA-3-8B
  • Fine-Tuned Using: QLoRA (Parameter-Efficient Fine-Tuning)
  • Task: Causal Language Modelling (Text Generation)
  • Model Size: 8.03B parameters (BF16, Safetensors)

Installation

To use this model, ensure you have the transformers library installed. You can install it via pip:

pip install transformers

For GPU inference, it is also recommended to install torch with CUDA support:

pip install torch
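
The loading example below uses device_map="auto", which relies on the accelerate library; if it is not already installed, add it as well:

pip install accelerate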

Loading the Model

To load the model and tokenizer for inference, use the following Python code:

from transformers import AutoTokenizer, AutoModelForCausalLM

# The model repository on the Hugging Face Hub
model_name = "XiWangEric/literary-classicist-llama3-qlora"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the model
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Set the model to evaluation mode
model.eval()
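
Since the adapter was trained with QLoRA, you may optionally load the model in 4-bit precision to reduce GPU memory usage. This is a minimal sketch assuming the bitsandbytes library is installed (pip install bitsandbytes); it is not required for standard inference:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantisation, mirroring a typical QLoRA setup
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)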

Running Inference

Here is an example of how to use the model for generating text based on a prompt:

# Define your input prompt
input_text = "Once upon a time in a faraway land,"

# Tokenize the input and prepare for inference
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)  # Move tensors to the model's device

# Generate text
outputs = model.generate(**inputs, max_length=50, do_sample=True, top_k=50, top_p=0.95)

# Decode the output
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Generated Text:", generated_text)

Parameters for Generation

You can customise the text generation using the following parameters:

  • max_length: The maximum total length of the sequence, including the prompt tokens.
  • do_sample: Whether to sample the next token rather than always picking the most probable one.
  • top_k: The number of highest-probability vocabulary tokens to keep for sampling.
  • top_p: The cumulative probability threshold for nucleus sampling.

For example:

outputs = model.generate(
    **inputs,
    max_length=100,
    do_sample=True,
    top_k=40,
    top_p=0.9
)
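
Beyond the parameters above, generate() accepts further standard options. For instance, max_new_tokens caps only the newly generated tokens (often more convenient than max_length, which also counts the prompt), and temperature scales the sampling distribution:

outputs = model.generate(
    **inputs,
    max_new_tokens=80,   # limit only the new tokens, excluding the prompt
    do_sample=True,
    temperature=0.8,     # values below 1.0 make sampling more conservative
    top_p=0.9,
)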

Example Output

For the input prompt:

Once upon a time in a faraway land,

The model might generate:

Once upon a time in a faraway land, there was a beautiful castle surrounded by an enchanted forest. The villagers spoke of a hidden treasure deep within the woods, guarded by a magical creature of legend.