
HuggingFaceTB/SmolVLM-Instruct
Image-Text-to-Text • Updated • 77.9k • 406
State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct
Generate text responses using images and text prompts