Tokenizer Details

by qingy2024 - opened 5 days ago


Great work on this!

I was wondering, how did you get the tokenizer to have the same vocab size as QwQ 32B Preview? I would like to do this for some other models too!

If you have a script or just a set of steps to do this, I'd appreciate it if you could share it :)

Owner 5 days ago

The tokenizer is actually the same; you only need to change the embedding layer size:

model.resize_token_embeddings(152064)
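
For anyone who wants to try this on another model, here is a minimal sketch of the same idea, not necessarily the exact script used for this repo. The helper name `resize_to_vocab_size`, its `tokenizer_len` argument, and the tiny randomly initialized GPT-2 in `demo()` are assumptions made for illustration; only `resize_token_embeddings` and the target size 152064 come from the reply above. The check refuses to shrink the embedding matrix below the tokenizer's length, since that would leave some token ids without a row.

```python
def resize_to_vocab_size(model, tokenizer_len, target_vocab_size):
    """Resize a model's token embeddings to `target_vocab_size` rows.

    `model` is expected to expose the Hugging Face PreTrainedModel methods
    resize_token_embeddings() and get_input_embeddings().
    `tokenizer_len` is the number of ids the tokenizer can emit, e.g. len(tokenizer).
    Returns the number of rows in the input embedding matrix after resizing.
    """
    if target_vocab_size < tokenizer_len:
        # Shrinking below the tokenizer size would drop rows for valid token ids.
        raise ValueError(
            f"target_vocab_size ({target_vocab_size}) is smaller than "
            f"the tokenizer length ({tokenizer_len})"
        )
    model.resize_token_embeddings(target_vocab_size)
    return model.get_input_embeddings().weight.shape[0]


def demo():
    # Local import so the helper above does not depend on transformers being installed.
    from transformers import GPT2Config, GPT2LMHeadModel

    # Stand-in model (assumption): a tiny, randomly initialized GPT-2, no download needed.
    config = GPT2Config(vocab_size=100, n_positions=16, n_embd=8, n_layer=1, n_head=2)
    model = GPT2LMHeadModel(config)

    rows = resize_to_vocab_size(model, tokenizer_len=100, target_vocab_size=128)
    # resize_token_embeddings also updates config.vocab_size on the model.
    print(rows, model.config.vocab_size)
```

With a real checkpoint you would load the model with `from_pretrained`, call the helper with `len(tokenizer)` and 152064, then `save_pretrained` the result. The extra rows are newly initialized and never produced by the tokenizer, so they only pad the matrix to the requested size.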

tugstugi changed discussion status to closed 5 days ago
