Update post-processor to add bos
#41
by pcuenq (HF staff) · opened
No description provided.
@pcuenq It looks like this might be missing the end of turn token:
Edit: I'm dumb, this comment should be for the instruct model https://huggingface.co./meta-llama/Meta-Llama-3-8B-Instruct/discussions/42/files Sorry!
https://github.com/meta-llama/llama3/blob/main/llama/generation.py#L307 -> https://github.com/meta-llama/llama3/blob/main/llama/tokenizer.py#L222
def encode_message(self, message: Message) -> List[int]:
    tokens = self.encode_header(message)
    tokens.extend(
        self.tokenizer.encode(message["content"].strip(), bos=False, eos=False)
    )
    tokens.append(self.tokenizer.special_tokens["<|eot_id|>"])
    return tokens
If I'm reading this right, the `<|eot_id|>` token should be appended at the end of each message.
I tested this change, and it fixes fine-tuning of the base model. Without it, the grad norm is inf and the loss is high.
I also tried just setting `add_bos_token: true`, but that did not actually add the token, at least with Axolotl.
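For context, the effect of the post-processor change is that every encoded sequence for the base model starts with `<|begin_of_text|>` (id 128000 in the Llama 3 vocabulary). A minimal sketch of that layout, using a dummy stand-in for the real BPE encoder:

```python
# Real special-token ids from the Llama 3 vocabulary; fake_encode below is a
# stand-in for the actual BPE encode, for illustration only.
SPECIAL_TOKENS = {"<|begin_of_text|>": 128000, "<|eot_id|>": 128009}

def fake_encode(text):
    # Dummy encoder: maps each whitespace-separated word to a small id.
    return [hash(w) % 1000 for w in text.split()]

def encode_with_bos(text):
    # Mirrors the fixed post-processor: prepend BOS to every encoded sequence.
    return [SPECIAL_TOKENS["<|begin_of_text|>"]] + fake_encode(text)

ids = encode_with_bos("hello world")
print(ids[0])  # 128000 — the sequence now leads with BOS
```

With the merged post-processor, `tokenizer("hello world").input_ids` on the base model produces the same shape: BOS first, then the content tokens.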
Thanks for the confirmations, merging now!
pcuenq changed pull request status to merged