Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion Paper • 2412.04424 • Published 20 days ago • 55
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published 21 days ago • 118
PaliGemma 2 Release Collection Vision-Language Models available in 3B, 10B and 28B variants. • 23 items • Updated 12 days ago • 119
EchoPrime: A Multi-Video View-Informed Vision-Language Model for Comprehensive Echocardiography Interpretation Paper • 2410.09704 • Published Oct 13 • 11