The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities Paper • 2411.04986 • Published 4 days ago • 3
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models Paper • 2411.04097 • Published 5 days ago • 3
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos Paper • 2411.04923 • Published 4 days ago • 20
Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models Paper • 2411.04996 • Published 4 days ago • 40
Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination Paper • 2411.03823 • Published 5 days ago • 41
Inference Optimal VLMs Need Only One Visual Token but Larger Models Paper • 2411.03312 • Published 6 days ago • 6
Zebra-Llama: A Context-Aware Large Language Model for Democratizing Rare Disease Knowledge Paper • 2411.02657 • Published 7 days ago • 5
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models Paper • 2411.00743 • Published 10 days ago • 6
Sparsing Law: Towards Large Language Models with Greater Activation Sparsity Paper • 2411.02335 • Published 7 days ago • 10
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published 11 days ago • 57
Unpacking SDXL Turbo: Interpreting Text-to-Image Models with Sparse Autoencoders Paper • 2410.22366 • Published 14 days ago • 72
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays Paper • 2410.21969 • Published 13 days ago • 8
On Memorization of Large Language Models in Logical Reasoning Paper • 2410.23123 • Published 12 days ago • 15
Measuring memorization through probabilistic discoverable extraction Paper • 2410.19482 • Published 17 days ago • 4