Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published 12 days ago • 131
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks Paper • 2208.10442 • Published Aug 22, 2022
RedStone: Curating General, Code, Math, and QA Data for Large Language Models Paper • 2412.03398 • Published 21 days ago • 1
Multimodal Latent Language Modeling with Next-Token Diffusion Paper • 2412.08635 • Published 14 days ago • 41
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7 • 111
You Only Cache Once: Decoder-Decoder Architectures for Language Models Paper • 2405.05254 • Published May 8 • 10