mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models Paper • 2408.04840 • Published Aug 9 • 32
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16 • 98