mobiuslabsgmbh/DeepSeek-R1-ReDistill-Qwen-1.5B-v1.0 Text Generation • Updated about 11 hours ago • 193 • 38
Turbo Sparse: Achieving LLM SOTA Performance with Minimal Activated Parameters Paper • 2406.05955 • Published Jun 10, 2024 • 24
ModelCloud/QwQ-32B-Preview-gptqmodel-4bit-vortex-v2 Text Generation • Updated Dec 18, 2024 • 1.47k • 16