DeepSeek R1 32B model reasons less and often answers inaccurately
So I'm using both DeepSeek Chat and HuggingChat. Thanks to Hugging Face for serving the model, but honestly its answers are way behind DeepSeek Chat, which presumably runs the full 671B R1. As far as I can tell, there's an issue with the model's configuration, or else why would it answer like a 1.5B model? If they're restricting token generation, please don't, because it ruins the model's ability to reason, and that reasoning is why I use this model over ChatGPT. Requesting Hugging Face to take action ASAP.
This model isn't the original DeepSeek R1... it was never intelligent, anyway.
Well, it's for sure weaker because it has fewer parameters, but inference configuration matters too: temperature, nucleus sampling (top-p), and similar settings all affect a model's output quality (see the sketch below). BTW, why isn't HuggingChat optimized? Even DeepSeek's website is much better. HuggingChat isn't optimized for long responses; I'm not referring to the model, the problem is the frontend, and they should fix it. I can't even read code responses on Hugging Face. Who uses Arial as a coding font? Gemini also uses a font that makes code hard to read. Why can't they use a standard monospace font for AI responses?
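For context on what I mean by inference config: these are just request parameters. A minimal sketch, assuming the classic Hugging Face Inference API request shape; the model id and values here are illustrative, not what HuggingChat actually uses:

```ts
// Illustrative only: one of the R1 distills hosted on the Hub.
const MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B";

async function generate(prompt: string, token: string): Promise<string> {
  const res = await fetch(`https://api-inference.huggingface.co/models/${MODEL}`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: prompt,
      parameters: {
        temperature: 0.6,     // lower = more focused, deterministic output
        top_p: 0.95,          // nucleus sampling: sample from the smallest token set covering 95% probability mass
        max_new_tokens: 2048, // too small a cap cuts the chain of thought short
      },
    }),
  });
  const data = await res.json();
  return data[0]?.generated_text ?? "";
}
```

If I remember the model card right, DeepSeek themselves recommend a temperature around 0.6 for R1, and an aggressive `max_new_tokens` cap is exactly the kind of thing that would truncate reasoning.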
hey @rishadsojon, the mono font should now be used properly for code blocks. We're aware there are some issues with rendering long answers on lower-end hardware, taking a look at that too!
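For anyone curious, a fix like this is essentially just forcing a monospace stack on code blocks. A rough sketch of the idea, not the exact HuggingChat change:

```ts
// Rough sketch: apply a monospace font stack to rendered code blocks.
const monoStack =
  'ui-monospace, "JetBrains Mono", "Fira Code", Menlo, Consolas, monospace';

document.querySelectorAll<HTMLElement>("pre code").forEach((el) => {
  el.style.fontFamily = monoStack;
});
```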
I don't have the Arial font issue, but the rendering time for big code blocks is INSANELY slow. Even on my GTX 1060, the page stops loading the rest of the content when I scroll past a certain point, and Firefox then warns that the page is slowing down the browser.
Could you share a conv that triggers this btw @Smorty100, just to make sure I can reproduce the same issue?
@nsarrazin
The problem only arises when the model writes more than 200 lines of code, but once it gets there, I can't scroll anymore and the page completely locks up; I have to close the entire browser to get it working again.
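For what it's worth, a common fix for this kind of lockup is rendering long blocks in batches instead of one giant DOM update, so the main thread can keep breathing. A minimal sketch of the idea, not HuggingChat's actual code:

```ts
// Minimal sketch: append a long code block in line batches so one huge
// highlight + layout pass doesn't block the main thread.
function renderCodeInChunks(container: HTMLElement, lines: string[], chunkSize = 50): void {
  let i = 0;
  const renderChunk = () => {
    const frag = document.createDocumentFragment();
    for (const line of lines.slice(i, i + chunkSize)) {
      const row = document.createElement("div");
      row.textContent = line; // real syntax highlighting would happen here
      frag.appendChild(row);
    }
    container.appendChild(frag);
    i += chunkSize;
    // Yield back to the browser between batches so scrolling stays responsive.
    if (i < lines.length) requestAnimationFrame(renderChunk);
  };
  renderChunk();
}
```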
Do you mean my system config, as in, the device itself?
If so, here you go:
Desktop
- OS: Fedora; DE: GNOME; Browser: Firefox (latest daily release); GPU: GTX1060; CPU: Intel i5-6600 (according to neofetch)
Mobile
- Device: Pixel 6; OS: GrapheneOS (Android 15); Browser: Fennec (Firefox fork)
I updated the rendering code yesterday @Smorty100, do you still notice the issue today? Thanks for the details!
@nsarrazin
I haven't checked yet. Honestly, I don't use Hugging Face that much, because the models feel pretty weak compared to ChatGPT and the chat UI has issues too. I'm not complaining for the sake of it, but if I can't get useful answers, why would I, or anyone, use it? Qwen Coder, for example, can't solve a single problem for me, and on top of that, the code block rendering makes it even more irritating.
There's one more issue: the settings modal lags. I wouldn't blame my low-end hardware; it feels more like unoptimized code. I see the same thing on my own website, which uses Zustand, and it lags the same way as the settings modal, probably from unnecessary re-renders or callbacks being recreated on every render. I haven't had time to fix mine yet, but I hope this narrows down your pain a bit.
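If it's the same problem I'm seeing, the usual culprit is a component subscribing to the whole store instead of a narrow slice. A minimal sketch, assuming a Zustand store; the store shape is made up for illustration, not HuggingChat's code:

```ts
import { create } from "zustand";
import { useShallow } from "zustand/react/shallow";

// Hypothetical settings store; the fields are illustrative.
interface SettingsState {
  theme: string;
  fontSize: number;
  setTheme: (theme: string) => void;
}

const useSettings = create<SettingsState>((set) => ({
  theme: "dark",
  fontSize: 14,
  setTheme: (theme) => set({ theme }),
}));

// Bad: `const settings = useSettings();` subscribes the modal to the whole
// store, so ANY state change re-renders it.

// Better: subscribe only to the slice the component actually reads, with
// shallow comparison so a fresh object with equal fields doesn't re-render.
function useThemeSlice() {
  return useSettings(
    useShallow((s) => ({ theme: s.theme, setTheme: s.setTheme }))
  );
}
```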
One more thing: if there are 500B or even 1T-parameter models out there for free, why would people use a 3B or 72B model? I think Hugging Face should reconsider serving so many low-parameter models, because no one wants their assistant to be dumb.
Don't get me wrong about that inference advice; the small models simply can't do the job properly. Happy inferencing!