VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22, 2025 • 70
BIOMEDICA: An Open Biomedical Image-Caption Archive, Dataset, and Vision-Language Models Derived from Scientific Literature Paper • 2501.07171 • Published Jan 13, 2025 • 49
Trajectory Attention for Fine-grained Video Motion Control Paper • 2411.19324 • Published Nov 28, 2024 • 12
GREEN: Generative Radiology Report Evaluation and Error Notation Paper • 2405.03595 • Published May 6, 2024
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models Paper • 2411.04097 • Published Nov 6, 2024 • 5
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 124
MobileQuant: Mobile-friendly Quantization for On-device Language Models Paper • 2408.13933 • Published Aug 25, 2024 • 15
Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers Paper • 2103.15679 • Published Mar 29, 2021
Transformer Interpretability Beyond Attention Visualization Paper • 2012.09838 • Published Dec 17, 2020
RoentGen: Vision-Language Foundation Model for Chest X-ray Generation Paper • 2211.12737 • Published Nov 23, 2022 • 2
RadAdapt: Radiology Report Summarization via Lightweight Domain Adaptation of Large Language Models Paper • 2305.01146 • Published May 2, 2023 • 1
ViLLA: Fine-Grained Vision-Language Representation Learning from Real-World Data Paper • 2308.11194 • Published Aug 22, 2023
Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts Paper • 2309.07430 • Published Sep 14, 2023 • 27
CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation Paper • 2401.12208 • Published Jan 22, 2024 • 22
Adversarial Open Domain Adaptation for Sketch-to-Photo Synthesis Paper • 2104.05703 • Published Apr 12, 2021
MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer Paper • 2311.12052 • Published Nov 18, 2023 • 31