prithivMLmods 
posted an update about 14 hours ago
Gemma-3-4B : Image and Video Inference 🖼️🎥

🧤Space: prithivMLmods/Imagineo-Chat

@gemma3-4b : {tag + space + 'prompt'} for image inference
@gemma3-4b-video : {tag + space + 'prompt'} for video inference
Without a tag, the Space runs the default model: prithivMLmods/Qwen2-VL-OCR-2B-Instruct
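
For anyone curious how this kind of tag routing can work, here is a minimal sketch in Python. It is not the Space's actual code: `route_prompt`, `TAG_ROUTES`, and the route labels are hypothetical names chosen for illustration, and the only identifiers taken from this post are the two tags and the default model ID.

```python
# Default model named in the post; used when no tag is present.
DEFAULT_MODEL = "prithivMLmods/Qwen2-VL-OCR-2B-Instruct"

# Hypothetical route labels. The real Space may map tags to checkpoints differently.
TAG_ROUTES = {
    "@gemma3-4b": "gemma3-4b",
    "@gemma3-4b-video": "gemma3-4b-video",
}


def route_prompt(message: str) -> tuple[str, str]:
    """Split a chat message into (route, prompt) using the leading tag, if any."""
    text = message.strip()
    # The tag is the first whitespace-free token, followed by one space.
    head, _, rest = text.partition(" ")
    # Exact match on the first token, so "@gemma3-4b-video" is never
    # mistaken for "@gemma3-4b" plus leftover text.
    if head in TAG_ROUTES:
        return TAG_ROUTES[head], rest.strip()
    # No recognized tag: fall back to the default model with the full text.
    return DEFAULT_MODEL, text
```

Matching the whole first token, rather than checking a prefix, avoids the overlap between the two tags.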

I also compared Aya-Vision 8B against the custom Qwen2-VL-OCR model on OCR, using test samples of messy handwriting. This is an experiment aimed at optimizing edge-device VLMs for Optical Character Recognition.

📜Read the blog here: https://huggingface.co/blog/prithivMLmods/aya-vision-vs-qwen2vl-ocr-2b