CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Paper • 2404.10513 • Published Apr 16 • 2
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation Paper • 2408.02545 • Published Aug 5 • 35
Distributed Speculative Inference of Large Language Models Paper • 2405.14105 • Published May 23 • 16
An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs Paper • 2306.16601 • Published Jun 28, 2023 • 4