- ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration
  Paper • 2411.16044
- OmChat: A Recipe to Train Multimodal Language Models with Strong Long Context and Video Understanding
  Paper • 2407.04923
- OmDet: Large-scale vision-language multi-dataset pre-training with multimodal detection network
  Paper • 2209.05946
- VL-CheckList: Evaluating Pre-trained Vision-Language Models with Objects, Attributes and Relations
  Paper • 2207.00221
Om AI Lab
company
AI & ML interests
Multimodal AI
Organization Card
Om AI Lab is a passionate group building multimodal agents that reshape how we work and live.