DeepSeek R1 32B model reasons less and often answers inaccurately
So I'm using both DeepSeek Chat and HuggingChat. Thanks to Hugging Face for serving the model, but honestly its answers are way behind DeepSeek Chat, which presumably runs the full 671B R1. As far as I can tell, there's an issue with the model's configuration, or else why would it answer like a 1.5B model? If they're restricting token generation, please don't, because it ruins the model's ability to reason, and that reasoning is why I use this model over ChatGPT. Requesting Hugging Face to take action ASAP.
This model isn't the original DeepSeek R1... it was never intelligent, anyway.
Well, it's for sure weaker because it has fewer parameters, but inference configuration matters too: temperature, nucleus sampling (top-p), and similar settings all affect a model's output quality (see the sketch below). BTW, why isn't HuggingChat optimized? Even DeepSeek's website is much better. HuggingChat isn't optimized for long responses; I'm not referring to the model, the problem is the frontend, and they should fix it. I can't even read code responses on Hugging Face. Who uses Arial as a coding font? Gemini also uses a font that makes code hard to read. Why can't they use a standard monospace font for AI responses?
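For context on what I mean by inference config: these are just request parameters. A minimal sketch, assuming the classic Hugging Face Inference API request shape; the model id and values here are illustrative, not what HuggingChat actually uses:

```ts
// Illustrative only: one of the R1 distills hosted on the Hub.
const MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B";

async function generate(prompt: string, token: string): Promise<string> {
  const res = await fetch(`https://api-inference.huggingface.co/models/${MODEL}`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      inputs: prompt,
      parameters: {
        temperature: 0.6,     // lower = more focused, deterministic output
        top_p: 0.95,          // nucleus sampling: sample from the smallest token set covering 95% probability mass
        max_new_tokens: 2048, // too small a cap cuts the chain of thought short
      },
    }),
  });
  const data = await res.json();
  return data[0]?.generated_text ?? "";
}
```

If I remember the model card right, DeepSeek themselves recommend a temperature around 0.6 for R1, and an aggressive `max_new_tokens` cap is exactly the kind of thing that would truncate reasoning.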
hey @rishadsojon, the mono font should now be used properly for code blocks. We're aware there are some issues with rendering long answers on lower-end hardware, taking a look at that too!
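For anyone curious, a fix like this is essentially just forcing a monospace stack on code blocks. A rough sketch of the idea, not the exact HuggingChat change:

```ts
// Rough sketch: apply a monospace font stack to rendered code blocks.
const monoStack =
  'ui-monospace, "JetBrains Mono", "Fira Code", Menlo, Consolas, monospace';

document.querySelectorAll<HTMLElement>("pre code").forEach((el) => {
  el.style.fontFamily = monoStack;
});
```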
I don't have the Arial font issue, but the rendering time for big code blocks is INSANELY slow. Even on my GTX 1060, the page stops loading the rest of the content when I scroll past a certain point, and Firefox then warns that the page is slowing down the browser.
Could you share a conv that triggers this btw @Smorty100, just to make sure I can reproduce the same issue?
@nsarrazin
The problem only arises when the model writes more than 200 lines of code, but once it gets there, I can't scroll anymore and the page completely locks up; I have to close the entire browser to get it working again.
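For what it's worth, a common fix for this kind of lockup is rendering long blocks in batches instead of one giant DOM update, so the main thread can keep breathing. A minimal sketch of the idea, not HuggingChat's actual code:

```ts
// Minimal sketch: append a long code block in line batches so one huge
// highlight + layout pass doesn't block the main thread.
function renderCodeInChunks(container: HTMLElement, lines: string[], chunkSize = 50): void {
  let i = 0;
  const renderChunk = () => {
    const frag = document.createDocumentFragment();
    for (const line of lines.slice(i, i + chunkSize)) {
      const row = document.createElement("div");
      row.textContent = line; // real syntax highlighting would happen here
      frag.appendChild(row);
    }
    container.appendChild(frag);
    i += chunkSize;
    // Yield back to the browser between batches so scrolling stays responsive.
    if (i < lines.length) requestAnimationFrame(renderChunk);
  };
  renderChunk();
}
```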
Do you mean my system config, as in, the device itself?
If so, here you go:
Desktop
- OS: Fedora; DE: GNOME; Browser: Firefox (latest daily release); GPU: GTX1060; CPU: Intel i5-6600 (according to neofetch)
Mobile
- Device: Pixel 6; OS: GrapheneOS (Android 15); Browser: Fennec (Firefox fork)
I updated the rendering code yesterday @Smorty100, do you still notice the issue today? Thanks for the details!
@nsarrazin
I haven't checked yet. Honestly, I don't use Hugging Face that much, because the models feel pretty weak compared to ChatGPT and the chat UI has issues too. I'm not complaining for the sake of it, but if I can't get useful answers, why would I, or anyone, use it? Qwen Coder, for example, can't solve a single problem for me, and on top of that, the code block rendering makes it even more irritating.
There's one more issue: the settings modal lags. I wouldn't blame my low-end hardware; it feels more like unoptimized code. I see the same thing on my own website, which uses Zustand, and it lags the same way as the settings modal, probably from unnecessary re-renders or callbacks being recreated on every render. I haven't had time to fix mine yet, but I hope this narrows down your pain a bit.
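If it's the same problem I'm seeing, the usual culprit is a component subscribing to the whole store instead of a narrow slice. A minimal sketch, assuming a Zustand store; the store shape is made up for illustration, not HuggingChat's code:

```ts
import { create } from "zustand";
import { useShallow } from "zustand/react/shallow";

// Hypothetical settings store; the fields are illustrative.
interface SettingsState {
  theme: string;
  fontSize: number;
  setTheme: (theme: string) => void;
}

const useSettings = create<SettingsState>((set) => ({
  theme: "dark",
  fontSize: 14,
  setTheme: (theme) => set({ theme }),
}));

// Bad: `const settings = useSettings();` subscribes the modal to the whole
// store, so ANY state change re-renders it.

// Better: subscribe only to the slice the component actually reads, with
// shallow comparison so a fresh object with equal fields doesn't re-render.
function useThemeSlice() {
  return useSettings(
    useShallow((s) => ({ theme: s.theme, setTheme: s.setTheme }))
  );
}
```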
One more thing: if there are 500B or even 1T-parameter models out there for free, why would people use a 3B or 72B model? I think Hugging Face should reconsider serving so many low-parameter models, because no one wants their assistant to be dumb.
Don't get me wrong about that inference advice; the small models simply can't do the job properly. Happy inferencing!