Vision tokens missing from chat template
Qwen 2 VL uses the messages to insert special vision tokens in the prompt, but this logic is missing from the Qwen 2.5 VL chat template. Should the current chat template be replaced with the Qwen 2 VL chat template?
Yes, that would work.
@bluelike , could you please fix the chat template in tokenizer_config.json for all the Qwen 2.5 VL models? Many apps rely on this template instead of supplying their own.
Is there a reason the vision logic was left out of the chat template for this model? This means apps that use the model have to supply their own chat template if they want to use the vision capabilities.
To clarify: the chat template in tokenizer_config.json does not include the vision logic, but the one in chat_template.json does. This caused a problem for Swift MLX users, because swift-transformers reads the template from tokenizer_config.json. I still think the template in tokenizer_config.json should be replaced with the one from chat_template.json, to avoid problems like the one we ran into.
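For anyone hitting this in the meantime, the missing piece is roughly the following: the template has to expand each image entry in the messages into the model's vision special tokens. This is a simplified Python sketch of that logic, not the actual Jinja template shipped in chat_template.json; the token names (`<|vision_start|>`, `<|image_pad|>`, `<|vision_end|>`, `<|im_start|>`, `<|im_end|>`) are the ones Qwen 2 VL uses, and the exact expansion in the real template may differ.

```python
def apply_chat_template_sketch(messages):
    """Simplified sketch of the vision-token insertion that the
    Qwen 2 VL chat template performs (and that is missing from the
    template in tokenizer_config.json)."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n")
        for item in message["content"]:
            if item["type"] == "image":
                # The vision logic: each image placeholder becomes
                # special vision tokens in the prompt.
                parts.append("<|vision_start|><|image_pad|><|vision_end|>")
            elif item["type"] == "text":
                parts.append(item["text"])
        parts.append("<|im_end|>\n")
    # Generation prompt for the assistant turn.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    },
]

print(apply_chat_template_sketch(messages))
```

With the tokenizer_config.json template as it stands, a prompt built from the same messages would contain no vision tokens at all, which is why downstream apps that rely on it get text-only behavior.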