Tri Dao

tridao

Articles

Bamba: Inference-Efficient Hybrid Mamba2 Model

tridao's activity

authored a paper about 1 month ago

RedPajama: an Open Dataset for Training Large Language Models

Paper • 2411.12372 • Published Nov 19 • 47

authored a paper 4 months ago

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27 • 37

updated a collection 7 months ago

Mamba-2

0 items • Updated Jun 3

authored 3 papers 10 months ago

Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling

Paper • 2403.03234 • Published Mar 5 • 11

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29 • 136

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Paper • 2402.10193 • Published Feb 15 • 19

authored a paper 11 months ago

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19 • 54

authored 2 papers about 1 year ago

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Paper • 2310.17157 • Published Oct 26, 2023 • 12

authored 6 papers over 1 year ago

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Paper • 2205.14135 • Published May 27, 2022 • 11

Hungry Hungry Hippos: Towards Language Modeling with State Space Models

Paper • 2212.14052 • Published Dec 28, 2022

Hyena Hierarchy: Towards Larger Convolutional Language Models

Paper • 2302.10866 • Published Feb 21, 2023 • 7

Simple Hardware-Efficient Long Convolutions for Sequence Modeling

Paper • 2302.06646 • Published Feb 13, 2023 • 2

StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 29

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Paper • 2307.08691 • Published Jul 17, 2023 • 8