WavTokenizer-Medium-Large Collection https://arxiv.org/abs/2408.16532 • 3 items • Updated 13 days ago • 2
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated 4 days ago • 162
ESPnet-EZ: Python-only ESPnet for Easy Fine-tuning and Integration Paper • 2409.09506 • Published 8 days ago • 2
Power-LM Collection Dense & MoE LLMs trained with the power learning rate scheduler. • 3 items • Updated 11 days ago • 13
OLMoE Collection Artifacts for open mixture-of-experts language models. • 13 items • Updated 8 days ago • 18
Article Unleash ML Power on iOS: Apple Silicon Optimization Secrets By fguzman82 • Jul 18 • 4
Article DEMO: French Spoken Language Understanding with the new speech resources from NAVER LABS Europe By mzboito • 25 days ago • 8
xLAM models Collection xLAM: A Family of Large Action Models to Empower AI Agent Systems • 9 items • Updated 14 days ago • 40
Phi-3 Collection Phi-3 family of small language and multimodal models. Language models are available in short- and long-context variants. • 27 items • Updated 4 days ago • 464
💻 Local SmolLMs Collection SmolLM models in MLC, ONNX, and GGUF formats for local applications, plus in-browser demos • 14 items • Updated Aug 20 • 40
Qwen2-Audio Collection Audio-language model series based on Qwen2 • 4 items • Updated 4 days ago • 39
Parler-TTS: fully open-source high-quality TTS Collection If you want to find out more about how these models were trained, or even fine-tune them yourself, check out the Parler-TTS repository on GitHub. • 7 items • Updated Aug 8 • 40
Article The case for specialized pre-training: ultra-fast foundation models for dedicated tasks By Pclanglais • Aug 4 • 24
ShieldGemma Release Collection A series of safety classifiers, trained on top of Gemma 2, for developers to filter inputs and outputs of their applications. • 3 items • Updated Jul 31 • 11
Gemma 2: Improving Open Language Models at a Practical Size Paper • 2408.00118 • Published Jul 31 • 73
Article 🔥 Argilla 2.0: the data-centric tool for AI makers 🤗 By dvilasuero • Jul 30 • 33
MambaVision Collection MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes tiny, tiny2, small, base, large, and large2 variants. • 8 items • Updated Jul 24 • 12
Llama 3.1 Collection This collection hosts the transformers-format and original repos of the Meta Llama 3.1, Llama Guard 3, and Prompt Guard models. • 11 items • Updated Aug 2 • 573
Llama 3.1 GPTQ, AWQ, and BNB Quants Collection Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗 • 9 items • Updated Jul 24 • 47
Compact Language Models via Pruning and Knowledge Distillation Paper • 2407.14679 • Published Jul 19 • 35
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 7 items • Updated 5 days ago • 54
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference Paper • 2407.14057 • Published Jul 19 • 41
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published Jul 12 • 122
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 107
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147
Adapting Large Language Models via Reading Comprehension Paper • 2309.09530 • Published Sep 18, 2023 • 75
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated Jul 17 • 156