Frank Sommers PRO

fsommers

Recent Activity

updated a collection 9 days ago

Misc papers

fsommers's activity

upvoted a paper 5 days ago

NitiBench: A Comprehensive Studies of LLM Frameworks Capabilities for Thai Legal Question Answering

Paper • 2502.10868 • Published 25 days ago • 2

upvoted a paper 9 days ago

ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents

Paper • 2502.18017 • Published 16 days ago • 18

upvoted 2 articles 12 days ago

Article

SmolVLM2: Bringing Video Understanding to Every Device

21 days ago

• 205

Article

SigLIP 2: A better multilingual vision language encoder

20 days ago

• 132

upvoted a paper 14 days ago

Executable Code Actions Elicit Better LLM Agents

Paper • 2402.01030 • Published Feb 1, 2024 • 97

upvoted 2 papers 20 days ago

Scalable Vision Language Model Training via High Quality Data Curation

Paper • 2501.05952 • Published Jan 10 • 1

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 21 days ago • 160

upvoted a collection 21 days ago

ColQwen2 Models

Collection

Pre-trained checkpoints for the ColQwen2 model. • 4 items • Updated Jan 23 • 4

upvoted a collection 25 days ago

Qwen2.5-VL

Collection

Vision-language model series based on Qwen2.5 • 8 items • Updated 17 days ago • 394

upvoted an article 26 days ago

Article

We now support VLMs in smolagents!

Jan 24

• 92

upvoted a paper about 1 month ago

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Paper • 2502.01100 • Published Feb 3 • 17

upvoted an article about 1 month ago

Article

Visualize and understand GPU memory in PyTorch

Dec 24, 2024

• 194

upvoted a paper about 1 month ago

Question Answering on Patient Medical Records with Private Fine-Tuned LLMs

Paper • 2501.13687 • Published Jan 23 • 9

upvoted a paper about 2 months ago

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published Dec 25, 2024 • 97

upvoted an article about 2 months ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Jan 23

• 151

upvoted a paper 2 months ago

Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model

Paper • 2501.05122 • Published Jan 9 • 20

upvoted 2 papers 3 months ago

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception

Paper • 2410.12628 • Published Oct 16, 2024 • 35

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Paper • 2409.01704 • Published Sep 3, 2024 • 83

upvoted a collection 3 months ago

multilingual vision models

Collection

Some papers I read for understanding vision models and also adding multilingual capabilities to them • 14 items • Updated Dec 11, 2024 • 2

upvoted a paper 3 months ago

Maya: An Instruction Finetuned Multilingual Multimodal Model

Paper • 2412.07112 • Published Dec 10, 2024 • 27