ValueError: Tokenizer class Arcade100kTokenizer does not exist or is not currently imported.
#7 · by interstellarninja · opened
I have trained a QLoRA with stablelm-2-zephyr-1_6b and I'm trying to run inference on the merged model. I have also downloaded tokenization_arcade100k.py into the merged folder, but I still get the error with the code below:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

self.bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)
self.model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    return_dict=True,
    quantization_config=self.bnb_config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
self.tokenizer = AutoTokenizer.from_pretrained(model_path)
self.tokenizer.pad_token = self.tokenizer.eos_token
self.tokenizer.padding_side = "left"
Try this:
self.tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
Thanks @g-ronimo, but I'm using the local merged QLoRA model.
By the way, importing Arcade100kTokenizer directly into the inference code worked:
from tokenization_arcade100k import Arcade100kTokenizer
self.tokenizer = Arcade100kTokenizer.from_pretrained(model_path)
Hi @interstellarninja, you still need to pass trust_remote_code=True to the AutoTokenizer.from_pretrained method even if the files are local, because of the custom tokenizer implementation. See relevant code here.
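For context, a minimal sketch of why the flag matters: Hub models with custom tokenizers typically advertise the custom class through an auto_map entry in tokenizer_config.json, and AutoTokenizer will only import that repo-local module when trust_remote_code=True is set. The config excerpt below is an assumption modeled on how such repos are usually laid out, not the exact stablelm-2 file:

```python
import json

# Hypothetical tokenizer_config.json excerpt (field layout is an
# assumption based on typical custom-code model repos on the Hub):
config_text = """
{
  "tokenizer_class": "Arcade100kTokenizer",
  "auto_map": {
    "AutoTokenizer": ["tokenization_arcade100k.Arcade100kTokenizer", null]
  }
}
"""

config = json.loads(config_text)

# AutoTokenizer sees a class name that transformers does not ship with:
tokenizer_class = config["tokenizer_class"]
print(tokenizer_class)  # Arcade100kTokenizer

# The auto_map points at a module that lives inside the model repo itself:
module_path, class_name = config["auto_map"]["AutoTokenizer"][0].rsplit(".", 1)
print(module_path)  # tokenization_arcade100k

# Without trust_remote_code=True, transformers refuses to import that
# repo-local module, so tokenizer resolution fails with the
# "Tokenizer class ... does not exist or is not currently imported" error.
```

This is also why copying tokenization_arcade100k.py next to the merged weights alone does not help: the file is present, but transformers will not execute it without the flag.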
jon-tow changed discussion status to closed