Hugging Face
36.4 TFLOPS
alkinun
AtAndDev
23 followers · 21 following
AI & ML interests
LLMs, Alignment, Merging, Unsloth, DPO, SFT, ORPO, SPIN...
Recent Activity
liked a model 2 days ago: Qwen/Qwen2-VL-72B-Instruct
reacted to onekq's post with 👍 2 days ago
So 🐋DeepSeek🐋 hits the mainstream media. But it has been a star in our little cult for at least 6 months. Its meteoric success is not overnight, but two years in the making.

To learn their history, just look at their 🤗 repo https://huggingface.co./deepseek-ai
* End of 2023, they launched the first model (pretrained by themselves) following Llama 2 architecture
* June 2024, v2 (MoE architecture) surpassed Gemini 1.5, but behind Mistral
* September, v2.5 surpassed GPT 4o mini
* December, v3 surpassed GPT 4o
* Now R1 surpassed o1

Most importantly, if you think DeepSeek success is singular and unrivaled, that's WRONG. The following models are also near or equal the o1 bar.
* Minimax-01
* Kimi k1.5
* Doubao 1.5 pro
replied to mitkox's post 2 days ago
llama.cpp is 26.8% faster than ollama. I have upgraded both, and using the same settings, I am running the same DeepSeek R1 Distill 1.5B on the same hardware. It's an Apples to Apples comparison.

Total duration:
llama.cpp 6.85 sec <- 26.8% faster
ollama 8.69 sec

Breakdown by phase:

Model loading
llama.cpp 241 ms <- 2x faster
ollama 553 ms

Prompt processing
llama.cpp 416.04 tokens/s with an eval time 45.67 ms <- 10x faster
ollama 42.17 tokens/s with an eval time of 498 ms

Token generation
llama.cpp 137.79 tokens/s with an eval time 6.62 sec <- 13% faster
ollama 122.07 tokens/s with an eval time 7.64 sec

llama.cpp is LLM inference in C/C++; ollama adds abstraction layers and marketing. Make sure you own your AI. AI in the cloud is not aligned with you; it's aligned with the company that owns it.
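The speedup figures quoted in the post above can be rechecked from the raw timings it reports. The sketch below is only an illustration: the helper names `time_speedup_percent` and `rate_ratio` are hypothetical, and it assumes "X% faster" means the ratio of the slower time to the faster time (or of the higher throughput to the lower one), minus one. The post does not state which definition it used.

```python
def time_speedup_percent(slow_seconds: float, fast_seconds: float) -> float:
    """Percent speedup for lower-is-better timings (assumed definition)."""
    return (slow_seconds / fast_seconds - 1.0) * 100.0


def rate_ratio(fast_rate: float, slow_rate: float) -> float:
    """How many times higher the faster throughput is (higher-is-better)."""
    return fast_rate / slow_rate


# Figures copied from the post: ollama is the slower run in every phase.
total_pct = time_speedup_percent(8.69, 6.85)           # total duration, seconds
load_ratio = 553 / 241                                  # model loading, ms
prompt_ratio = rate_ratio(416.04, 42.17)                # prompt processing, tokens/s
gen_pct = (rate_ratio(137.79, 122.07) - 1.0) * 100.0    # token generation, tokens/s

print(f"total duration: {total_pct:.1f}% faster")
print(f"model loading: {load_ratio:.1f}x faster")
print(f"prompt processing: {prompt_ratio:.1f}x faster")
print(f"token generation: {gen_pct:.1f}% faster")
```

Under that definition the results land close to the post's rounded claims: a little under 27% for total duration, about 2.3x for model loading, just under 10x for prompt processing, and roughly 13% for token generation.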
Organizations
Posts
3
Post
434
Deepseek gang on fire fr fr
Post
1551
R1 is out! And with a lot of other R1-related models...
spaces
3
🐢 DeepSense.ai • Bicycle and E-Bike Detection Model • Sleeping
💻 marco-qwq-7B • Sleeping
💻 AIDC AI Marco O1 • Sleeping
models
7
AtAndDev/marco-qwq-7B • Text Generation • Updated Dec 8, 2024 • 13
AtAndDev/Ogno-Monarch-Neurotic-9B-Passthrough • Text Generation • Updated Mar 1, 2024 • 64
AtAndDev/Ogno-Monarch-Neurotic-7B-Dare-Ties • Text Generation • Updated Mar 1, 2024 • 65
AtAndDev/Marcoro14-7B-Slerp • Text Generation • Updated Mar 1, 2024 • 6
AtAndDev/CapybaraMarcoroni-7B • Text Generation • Updated Jan 7, 2024 • 904
AtAndDev/ShortKing-3b-v0.2 • Text Generation • Updated Oct 2, 2023 • 77 • 2
AtAndDev/ShortKing-1.4b-v0.1 • Text Generation • Updated Sep 29, 2023 • 2.74k • 2
datasets
12
AtAndDev/symbolm • Viewer • Updated 4 days ago • 20k • 45
AtAndDev/symlm • Viewer • Updated 11 days ago • 10.1k • 33
AtAndDev/chain-of-diffusion • Viewer • Updated 21 days ago • 6.45k • 80
AtAndDev/clip-bicycle-e-bike • Viewer • Updated 26 days ago • 6k • 47
AtAndDev/QwQ-LongCoT-59k-cleaned • Viewer • Updated Dec 6, 2024 • 59.2k • 59
AtAndDev/sedir-clean • Viewer • Updated Dec 5, 2024 • 11.8k • 59
AtAndDev/sedir-unclean • Viewer • Updated Dec 5, 2024 • 19.9k • 50
AtAndDev/ultrachat_200k_formatted • Viewer • Updated Oct 10, 2024 • 208k • 54
AtAndDev/MedInstruct • Viewer • Updated Jul 20, 2024 • 216 • 45
AtAndDev/MedRag-textbooks-stella_en_400M_v5 • Viewer • Updated Jul 14, 2024 • 126k • 45