Hugging Face Model - Bengali Finetuned

This repository contains a LoRA adapter for LLaMA-7B, fine-tuned on a Bengali instruction dataset. The adapter is loaded on top of the base model with the peft library, and responses are generated with transformers.
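
The walkthrough below assumes peft, transformers, and (for 8-bit loading) bitsandbytes are installed, along with accelerate and sentencepiece. A typical setup, without pinned versions since this repository does not specify any:

pip install peft transformers accelerate bitsandbytes sentencepiece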

Usage

To use the model, first import the necessary libraries:

from peft import PeftModel
from transformers import LlamaTokenizer, LlamaForCausalLM, GenerationConfig

Next, load the tokenizer and model:

tokenizer = LlamaTokenizer.from_pretrained("yahma/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained(
    "yahma/llama-7b-hf",
    load_in_8bit=True,   # quantized loading via bitsandbytes; requires a CUDA GPU
    device_map="auto",   # let accelerate place layers on the available devices
)
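
If bitsandbytes or a GPU with enough memory is unavailable, the base model can instead be loaded in half precision. This is a sketch of one alternative, not part of the original instructions:

import torch

model = LlamaForCausalLM.from_pretrained(
    "yahma/llama-7b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)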

Then wrap the base model with the fine-tuned adapter by loading a PeftModel from the adapter path:

model = PeftModel.from_pretrained(model, "./bengali-dolly-alpaca-lora-7b")
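
Before generating, it is worth switching the model to inference mode. If the base model was loaded in full or half precision rather than 8-bit, the LoRA weights can also be merged into the base weights for slightly faster inference; merge_and_unload is the peft call for this, though support depends on your peft version:

model.eval()
# Optional, and only for non-quantized base models:
# model = model.merge_and_unload()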

Next, define a function that builds an Alpaca-style prompt from an instruction and an optional input:

def generate_prompt(instruction, input=None):
    if input:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""

Finally, configure generation and define a function to query the model:

generation_config = GenerationConfig(
    temperature=0.1,  # low temperature for focused output
    top_p=0.75,       # nucleus-sampling threshold
    num_beams=4,      # beam-search width
)
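
Note that generate defaults to do_sample=False, and recent transformers releases ignore temperature and top_p when not sampling, so the configuration above effectively runs plain beam search. To actually sample, enable it explicitly; an illustrative alternative, not part of the original setup:

generation_config = GenerationConfig(
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)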

def evaluate(model, instruction, input=None):
    prompt = generate_prompt(instruction, input)
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].cuda()  # the 8-bit model lives on the GPU
    generation_output = model.generate(
        input_ids=input_ids,
        generation_config=generation_config,
        return_dict_in_generate=True,
        output_scores=True,
        max_new_tokens=256,
    )
    for s in generation_output.sequences:
        output = tokenizer.decode(s, skip_special_tokens=True)
        # the output echoes the prompt, so keep only the text after the marker
        print("Response:", output.split("### Response:")[1].strip())

For interactive use, read an instruction from standard input:

instruct = input("Instruction: ")
evaluate(model, instruct)

To generate a response programmatically, call evaluate with an instruction and an optional input that provides context:

instruct = "Summarize the following text in one sentence."
context = "This is a sample input."
evaluate(model, instruct, context)

This prints the model's completion of the request.
