Vision tokens missing from chat template
Qwen 2 VL uses the messages to insert special vision tokens in the prompt, but this logic is missing from the Qwen 2.5 VL chat template. Should the current chat template be replaced with the Qwen 2 VL chat template?
Yes, that would work.
@bluelike , could you please fix the chat template in tokenizer_config.json for all the Qwen 2.5 VL models? Many apps rely on this template instead of supplying their own.
Is there a reason the vision logic was left out of the chat template for this model? This means apps that use the model have to supply their own chat template if they want to use the vision capabilities.
To clarify: the chat template in tokenizer_config.json does not include the vision logic, but the one in chat_template.json does. This caused a problem for Swift MLX users, because swift-transformers reads the template from tokenizer_config.json. I still think the template in tokenizer_config.json should be replaced with the one from chat_template.json, to avoid problems like the one we ran into.
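For anyone hitting this in the meantime, the missing piece is roughly the following: the template has to expand each image entry in the messages into the model's vision special tokens. This is a simplified Python sketch of that logic, not the actual Jinja template shipped in chat_template.json; the token names (`<|vision_start|>`, `<|image_pad|>`, `<|vision_end|>`, `<|im_start|>`, `<|im_end|>`) are the ones Qwen 2 VL uses, and the exact expansion in the real template may differ.

```python
def apply_chat_template_sketch(messages):
    """Simplified sketch of the vision-token insertion that the
    Qwen 2 VL chat template performs (and that is missing from the
    template in tokenizer_config.json)."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n")
        for item in message["content"]:
            if item["type"] == "image":
                # The vision logic: each image placeholder becomes
                # special vision tokens in the prompt.
                parts.append("<|vision_start|><|image_pad|><|vision_end|>")
            elif item["type"] == "text":
                parts.append(item["text"])
        parts.append("<|im_end|>\n")
    # Generation prompt for the assistant turn.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    },
]

print(apply_chat_template_sketch(messages))
```

With the tokenizer_config.json template as it stands, a prompt built from the same messages would contain no vision tokens at all, which is why downstream apps that rely on it get text-only behavior.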