Are the Q4 and Q5 models R1 or R1-Zero
Can someone verify whether the Q4 and Q5 quants are R1-Zero or just R1? The other quants are labeled just R1, which is what I am looking for.
EDIT: FIXED
The uploaded versions are officially R1, which is correct, not R1-Zero.
Yeah, interesting 🤔 I would suspect that all of these are R1 and the names are just typos. Though I'd say let's wait for @Unsloth to confirm this so we know for sure.
Any clarity yet?
I guess in the meantime, one could try running those quants and see if the <think> tokens are generated.
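For anyone who wants to try that, here's a rough sketch assuming llama-cpp-python is installed; the model path is hypothetical, so point it at whichever quant you actually downloaded:

```python
# Rough sketch, not verified against these exact quants: load the GGUF with
# llama-cpp-python and check whether the model opens its reply with <think>.
# The model path below is hypothetical - use your actual (possibly split) file.
from llama_cpp import Llama

llm = Llama(model_path="DeepSeek-R1-Q5_K_M-00001-of-00011.gguf", n_ctx=2048)
out = llm("What is 17 * 23? Reason it out.", max_tokens=256)
text = out["choices"][0]["text"]

print(text[:500])
print("<think> tag emitted:", "<think>" in text)
```

One caveat: R1-Zero was also trained with <think> tags, so this alone may not settle it.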
We're uploading the new ones, will ping you guys once it's done.
Yes please let us know how it goes! :)
Q5 generates the <think> token and is very good at reasoning. But doesn't the Zero version also generate this token?
We will reupload them today and update you guys
Yeah, I had the same question. Will await confirmation before downloading half a TB of data! The Zero model is fascinating in how it was made, but I definitely want the normal one.
I am running the Q8_0 and it is R1, not Zero, for better or for worse.
Yeah, our internal tests and validation from around 10 other people confirm it's R1, but we're still going to reupload them just in case.
In any case, the first bytes of the file contain the model name: it says "DeepSeek R1", while in the real Zero version it says "DeepSeek R1 Zero".
So it looks like it's R1.
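If you want to check your own download the same way, here's a minimal sketch of that byte-level search (the filename is hypothetical):

```python
# Minimal sketch: GGUF stores its metadata (including general.name) near the
# start of the file, so a plain byte search over the first few MB is enough
# to distinguish "DeepSeek R1" from "DeepSeek R1 Zero". Filename is hypothetical.
with open("DeepSeek-R1-Q4_K_M-00001-of-00009.gguf", "rb") as f:
    head = f.read(4 * 1024 * 1024)

if b"R1 Zero" in head or b"R1-Zero" in head:
    print("Header says R1-Zero")
elif b"DeepSeek R1" in head or b"DeepSeek-R1" in head:
    print("Header says R1")
else:
    print("Model name not found in the scanned bytes")
```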
We're uploading it now - should be up in like 8 hrs, but will let y'all know.
So is the Q4 model right here R1 or R1-Zero? Asking because I have already downloaded it.
The Q4 files were deleted; maybe they'll reupload.
@gng2info @fsaudm @wuaoscotty123 @frz1 @vmajor @ozzeruk82 @ooj
Hey guys, apologies for the delay, but we've reuploaded them so they're correct now.
Also, we're going to release 1-bit dynamic quant versions very soon, meaning the accuracy will be very good for a 1.5-bit GGUF quant version of R1, and it will be great to use day-to-day. I'll update you guys once that's ready - we'll most likely have a blogpost for it too.
So sorry, I re-uploaded them all - on further inspection the old files were correct, I just screwed up the names - but it's best to download the new versions I uploaded.
I made 4 further uploads with dynamic quantization (better accuracy than normal quants). All dynamic quants leave all layers in a mixture of 4-bit and 6-bit (i.e. attention is fully left at 4/6 bits), except the MoE layers, which are quantized further down.
DeepSeek R1 has 3 non-MoE layers, and these are left fully at 4/6-bit as well.
| MoE Bits | Type | Disk Size | Accuracy | Link | Details |
| --- | --- | --- | --- | --- | --- |
| 1.58-bit | IQ1_S | 131GB | Fair | Link | MoE all 1.56-bit; down_proj in MoE a mixture of 2.06/1.56-bit |
| 1.73-bit | IQ1_M | 158GB | Good | Link | MoE all 1.56-bit; down_proj in MoE left at 2.06-bit |
| 2.22-bit | IQ2_XXS | 183GB | Better | Link | MoE all 2.06-bit; down_proj in MoE a mixture of 2.5/2.06-bit |
| 2.51-bit | Q2_K_XL | 212GB | Best | Link | MoE all 2.5-bit; down_proj in MoE a mixture of 3.5/2.5-bit |
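As a rough sanity check on those disk sizes (back-of-envelope only: it assumes ~671B total parameters, and the bit figures describe the MoE layers rather than the whole-file average, so the estimates land near, not exactly on, the listed numbers):

```python
# Back-of-envelope: GGUF size is roughly params * bits-per-weight / 8 bytes.
# 671e9 is DeepSeek R1's approximate total parameter count; bit widths are
# taken from the "MoE Bits" column above, so expect a few GB of slack.
PARAMS = 671e9

for name, bpw, listed_gb in [
    ("IQ1_S", 1.58, 131),
    ("IQ1_M", 1.73, 158),
    ("IQ2_XXS", 2.22, 183),
    ("Q2_K_XL", 2.51, 212),
]:
    est_gb = PARAMS * bpw / 8 / 1e9
    print(f"{name}: ~{est_gb:.0f} GB estimated vs {listed_gb} GB listed")
```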
@gng2info @fsaudm @wuaoscotty123 @frz1 @vmajor @ozzeruk82 @ooj Apologies for the ping again, but the blogpost for the dynamic 1.58-bit quants is out. Would be incredible if you guys could test it and share any results. 🤗
Blog: https://unsloth.ai/blog/deepseekr1-dynamic
Tweet: https://x.com/UnslothAI/status/1883899061893546254