How to set `tokenizer.chat_template` to an appropriate template using Gemma-2b
Is anyone else having problems defining the chat template for the tokenizer?
I've been trying for 2 days and keep getting the following warning:
"No chat template is defined for this tokenizer - using a default chat template that implements the ChatML format (without BOS/EOS tokens!). If the default is not appropriate for your model, please set tokenizer.chat_template to an appropriate template. See https://huggingface.co/docs/transformers/main/chat_templating for more information."
Code used:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    token="",  # access token left blank here
)

chat = [{"role": "user", "content": "Write a hello world program"}]
# Build the prompt string from the chat messages
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# The template output is encoded as-is, so no extra special tokens are added
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150)
# Decode the generated token IDs back to text instead of printing the raw tensor
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
I believe the chat template only applies to the instruction-tuned versions, as those have been trained with a specific template. The base models, such as google/gemma-2b, are not trained with any template, so their tokenizer does not define one.
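
To make that concrete, here is a minimal sketch of what setting `tokenizer.chat_template` by hand could look like. The template string follows the turn format I understand the instruction-tuned Gemma checkpoints to use (`<start_of_turn>` / `<end_of_turn>`, with `<bos>` at the start); treat it as an assumption and compare it against the tokenizer config of the instruction-tuned model before relying on it. The names `GEMMA_STYLE_TEMPLATE`, `render_gemma_prompt` and `attach_template` are made up for this example. `render_gemma_prompt` is a plain-Python mirror of the template so you can see the expected prompt without downloading anything.

```python
# Jinja template in the Gemma-style turn format (assumed, not copied from the
# official tokenizer config). "assistant" is mapped to the "model" role.
GEMMA_STYLE_TEMPLATE = (
    "{{ bos_token }}"
    "{% for message in messages %}"
    "{% set role = 'model' if message['role'] == 'assistant' else message['role'] %}"
    "{{ '<start_of_turn>' + role + '\\n' + message['content'] | trim + '<end_of_turn>\\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<start_of_turn>model\\n' }}{% endif %}"
)


def render_gemma_prompt(messages, add_generation_prompt=True, bos_token="<bos>"):
    """Plain-Python mirror of the template above (hypothetical helper)."""
    parts = [bos_token]
    for message in messages:
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")
    if add_generation_prompt:
        # Open the model's turn so generation continues from here
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


def attach_template(tokenizer, template=GEMMA_STYLE_TEMPLATE):
    """Assign the template to any tokenizer-like object (hypothetical helper)."""
    tokenizer.chat_template = template
    return tokenizer


chat = [{"role": "user", "content": "Write a hello world program"}]
print(render_gemma_prompt(chat))
```

With a real tokenizer you would call `attach_template(tokenizer)` right after `AutoTokenizer.from_pretrained(model_id)` and then use `tokenizer.apply_chat_template(...)` as in the question. Keep in mind that this only silences the warning: the base model was not trained on this format, so it may not respond like a chat model. If you want chat behaviour, loading the instruction-tuned checkpoint (google/gemma-2b-it), whose tokenizer should already define a template, is probably the simpler route.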