Star Attention: Efficient LLM Inference over Long Sequences Paper • 2411.17116 • Published 30 days ago • 47
Continuous Risk Factor Models: Analyzing Asset Correlations through Energy Distance Paper • 2410.23447 • Published Oct 30 • 1
γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models Paper • 2410.13859 • Published Oct 17 • 7
Model2Vec: Distill a Small Fast Model from any Sentence Transformer Article • By Pringled • Oct 14 • 61
LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks Paper • 2410.01744 • Published Oct 2 • 26
PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation Paper • 2410.01680 • Published Oct 2 • 32
CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling Paper • 2409.19291 • Published Sep 28 • 19
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications Paper • 2408.11878 • Published Aug 20 • 52
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time Paper • 2408.13233 • Published Aug 23 • 21