NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering Paper • 2502.10868 • Published 25 days ago • 2
ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents Paper • 2502.18017 • Published 16 days ago • 18
Scalable Vision Language Model Training via High Quality Data Curation Paper • 2501.05952 • Published Jan 10 • 1
ColQwen2 Models Collection Pre-trained checkpoints for the ColQwen2 model. • 4 items • Updated Jan 23 • 4
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 8 items • Updated 17 days ago • 394
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning Paper • 2502.01100 • Published Feb 3 • 17
Question Answering on Patient Medical Records with Private Fine-Tuned LLMs Paper • 2501.13687 • Published Jan 23 • 9
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 97
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model Paper • 2501.05122 • Published Jan 9 • 20
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception Paper • 2410.12628 • Published Oct 16, 2024 • 35
General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model Paper • 2409.01704 • Published Sep 3, 2024 • 83
multilingual vision models Collection Some papers I read for understanding vision models and also adding multilingual capabilities to them • 14 items • Updated Dec 11, 2024 • 2
Maya: An Instruction Finetuned Multilingual Multimodal Model Paper • 2412.07112 • Published Dec 10, 2024 • 27