- BlackMamba: Mixture of Experts for State-Space Models
  Paper • 2402.01771 • Published • 23
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 26
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 43
David Samuel
Davidsamuel101
AI & ML interests: NLP, Computer Vision