---
base_model: google/gemma-2b
library_name: peft
license: apache-2.0
datasets:
- iamtarun/python_code_instructions_18k_alpaca
language:
- en
pipeline_tag: question-answering
tags:
- finance
---
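# Model Card for lora_coding_assistant
A LoRA adapter for google/gemma-2b, fine-tuned on the iamtarun/python_code_instructions_18k_alpaca dataset of Python coding instructions.
- Developed by: Venkat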


## How to Get Started with the Model
```python
import os
import time
from typing import Optional

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
from peft import PeftModel


def generate_prompt(input_text: str, instruction: Optional[str] = None) -> str:
    """Build the question/answer prompt, optionally prefixed with an instruction."""
    text = f"### Question: {input_text}\n\n### Answer: "
    if instruction:
        text = f"### Instruction: {instruction}\n\n{text}"
    return text


# google/gemma-2b is gated on the Hub, so an access token is read from the environment.
huggingface_token = os.environ.get('HUGGINGFACE_TOKEN')

base_model = AutoModelForCausalLM.from_pretrained("google/gemma-2b", token=huggingface_token)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b", token=huggingface_token)

# Load the LoRA adapter on top of the base model, then fold it into the base weights.
lora_model = PeftModel.from_pretrained(base_model, "vdpappu/lora_coding_assistant")
merged_model = lora_model.merge_and_unload()

generation_config = GenerationConfig(
    eos_token_id=tokenizer.eos_token_id,
    min_length=5,
    max_length=200,  # total length, prompt tokens included
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    repetition_penalty=1.5,
    no_repeat_ngram_size=3,
    early_stopping=True
)

question = "Develop a Python program to clearly understand the concept of recursion."
prompt = generate_prompt(input_text=question)

# Time tokenization, generation, and decoding together.
start = time.time()
with torch.no_grad():
    inputs = tokenizer(prompt, return_tensors="pt")
    output = merged_model.generate(**inputs, generation_config=generation_config)
    response = tokenizer.decode(output[0], skip_special_tokens=True)
end = time.time()

print(f"Inference time: {end-start:.2f} seconds")
print(response)
```
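
The decoded `response` above contains the prompt followed by the generated text, because `generate` returns the full sequence for a causal model. The sketch below shows one way to keep only the answer. `extract_answer` and `ANSWER_MARKER` are hypothetical names introduced here for illustration, and `fake_response` is a hand-written stand-in for real model output, so the block runs without downloading the model.

```python
from typing import Optional

# Marker used by the prompt template; the name ANSWER_MARKER is an assumption.
ANSWER_MARKER = "### Answer:"


def generate_prompt(input_text: str, instruction: Optional[str] = None) -> str:
    """Same prompt template as in the getting-started example."""
    text = f"### Question: {input_text}\n\n### Answer: "
    if instruction:
        text = f"### Instruction: {instruction}\n\n{text}"
    return text


def extract_answer(response: str) -> str:
    """Hypothetical helper: return the text after the first answer marker.

    Falls back to the whole (stripped) response if the marker is absent.
    """
    _, found, answer = response.partition(ANSWER_MARKER)
    return answer.strip() if found else response.strip()


prompt = generate_prompt("Reverse a string in Python.")
# Stand-in for tokenizer.decode(...): the prompt followed by a made-up completion.
fake_response = prompt + "def reverse(s):\n    return s[::-1]"
print(extract_answer(fake_response))
```

With the real model, the same helper would be applied as `extract_answer(response)`. Splitting on the first marker is a simple choice. It assumes the question itself does not contain `### Answer:`.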

- Framework version: PEFT 0.12.0
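
As background for the `merge_and_unload()` call in the getting-started code, the sketch below illustrates the standard LoRA merge rule, `W' = W + (alpha / r) * B @ A`, on a toy matrix in plain Python. It is a conceptual illustration, not PEFT's implementation. The function names, matrix values, `alpha`, and `rank` are all assumptions chosen for the example, and this card does not state the adapter's actual rank or alpha.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product of a (m x k) and b (k x n)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight: Matrix, lora_b: Matrix, lora_a: Matrix,
               alpha: float, rank: int) -> Matrix:
    """Fold a low-rank update into a weight matrix: W + (alpha / rank) * B @ A."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy values (assumptions): a 2x2 weight and a rank-1 adapter.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]    # shape 2 x 1
lora_a = [[0.5, -0.5]]     # shape 1 x 2
merged = merge_lora(weight, lora_b, lora_a, alpha=2.0, rank=1)
print(merged)  # [[2.0, -1.0], [2.0, -1.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why the example calls `generate` on `merged_model` directly.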