Collections including paper arxiv:2310.01334
- Turn Waste into Worth: Rectifying Top-k Router of MoE
  Paper • 2402.12399 • Published • 2
- CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition
  Paper • 2402.02526 • Published • 3
- Buffer Overflow in Mixture of Experts
  Paper • 2402.05526 • Published • 8
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 26

- Non-asymptotic oracle inequalities for the Lasso in high-dimensional mixture of experts
  Paper • 2009.10622 • Published • 1
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
  Paper • 2401.15947 • Published • 49
- MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
  Paper • 2401.04081 • Published • 71
- MoE-Infinity: Activation-Aware Expert Offloading for Efficient MoE Serving
  Paper • 2401.14361 • Published • 2

- Linear Self-Attention Approximation via Trainable Feedforward Kernel
  Paper • 2211.04076 • Published • 1
- Greenformer: Factorization Toolkit for Efficient Deep Neural Networks
  Paper • 2109.06762 • Published • 1
- COMCAT: Towards Efficient Compression and Customization of Attention-Based Vision Models
  Paper • 2305.17235 • Published • 2
- Exploring Low Rank Training of Deep Neural Networks
  Paper • 2209.13569 • Published • 1

- Experts Weights Averaging: A New General Training Scheme for Vision Transformers
  Paper • 2308.06093 • Published • 2
- Platypus: Quick, Cheap, and Powerful Refinement of LLMs
  Paper • 2308.07317 • Published • 23
- Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers
  Paper • 2211.11315 • Published • 1
- LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
  Paper • 2307.13269 • Published • 31

- QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
  Paper • 2310.16795 • Published • 26
- Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference
  Paper • 2308.12066 • Published • 4
- Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference
  Paper • 2303.06182 • Published • 1
- EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate
  Paper • 2112.14397 • Published • 1