Train 400x faster Static Embedding Models with Sentence Transformers Article • Jan 15 • 158
Upcycling Large Language Models into Mixture of Experts Paper • 2410.07524 • Published Oct 10, 2024 • 4 • 3
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25, 2024 • 78 • 12