[MODELS] Discussion

#372
by victor HF staff - opened
Hugging Chat org
edited Sep 23

Here we can discuss the models available on HuggingChat.

image.png

victor pinned discussion

What are the limits of using these? How many API calls can I send per month?

How can I know which model I am using?

How can I know which model I am using?

at the bottom of your screen:
image.png

Out of all these models, Gemma, which was recently released, has the newest information about .NET. However, I don't know which one gives the most accurate answers about coding.

Gemma seems really biased. With web search on, it says it doesn't have access to recent information when I ask it almost anything about recent events. But when I search Google for those same recent events, I get results about them.

Apparently Gemma cannot code?

Gemma is just like Google's Gemini series models: it has very strong moral limits. Any operation that might relate to file operations or deep system access gets censored, and it refuses to reply.
So even if there are solutions for such things in its training data, they will just be filtered out and ignored.
But I still haven't tested its coding accuracy on tasks unrelated to these kinds of "dangerous" operations.

This comment has been hidden

Is it possible to know what parameters these models are running?

Hugging Chat org

Is it possible to know what parameters these models are running?

It's all here! https://github.com/huggingface/chat-ui/blob/main/.env.template
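To give a rough idea, each entry in the MODELS variable of that file is a JSON object with the model name and its generation settings. A trimmed sketch (the field names follow the template; the numbers here are placeholders, not the exact production values):

  MODELS=`[
    {
      "name": "mistralai/Mixtral-8x7B-Instruct-v0.1",
      "parameters": {
        "temperature": 0.6,
        "top_p": 0.95,
        "repetition_penalty": 1.2,
        "top_k": 50,
        "max_new_tokens": 1024
      }
    }
  ]`

Roughly: temperature, top_p, and top_k control sampling randomness, repetition_penalty discourages loops, and max_new_tokens caps the response length.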

Is it possible to know what parameters these models are running?

It's all here! https://github.com/huggingface/chat-ui/blob/main/.env.template

Thanks, this is super useful OWO

What happened to Falcon? It was my favorite. :(

Hugging Chat org
edited Feb 27

@SAMMdev Falcon was too costly to run at scale (for now); we might bring back a more optimized version in the future.

I would like to use "mistralai/Mixtral-8x7B-Instruct-v0.1".
Could you please tell me what precision the model behind the chat runs at? Thanks

This comment has been hidden

@SAMMdev Falcon was too costly to run at scale (for now); we might bring back a more optimized version in the future.

What if we use Falcon 70B?

Smaug 72B would be a great addition

I’m unable to get output from CodeLlama

I'm also voting for Smaug 72B. We already have the two Llama 70B models on here, so to me it seems reasonable to integrate this one as well.

This is probably not going to happen, but xai-org/grok-1 would be insane to have here

IYH Why is the title of most chats (on the left panel's roster) "🤖 Hello! I am a language model AI assistant."?

This implies that the system prompt of my assistants is not the fundamental prompt, but that there is an inbuilt base prompt run before my system prompt. Is this roughly correct, and if so, how do I change this base prompt for the Mistral LLM?

hgchattitles.PNG

Could you consider DBRX Instruct and Command-R? The official space for DBRX Instruct is too limited (it only allows for a 5-turn conversation) and there is no space for Command-R.

IYH thank you for your advice. Apologies, I have no idea what the concepts mean or what to do: "Could you consider DBRX Instruct and Command-R? The official space for DBRX Instruct is too limited (it only allows for a 5-turn conversation) and there is no space for Command-R." (FWIW I prompted Mistral about it and it did not know either.)
Would you kindly elaborate (or point me towards a resource that explains this)?

IYH thank you for your advice. Apologies, I have no idea what the concepts mean or what to do: "Could you consider DBRX Instruct and Command-R? The official space for DBRX Instruct is too limited (it only allows for a 5-turn conversation) and there is no space for Command-R." (FWIW I prompted Mistral about it and it did not know either.)
Would you kindly elaborate (or point me towards a resource that explains this)?

Hugging Face will notify you when someone posts in a discussion you've commented on, even if they didn't reply to you directly. I was suggesting two new models, unrelated to your question.

Which model is better to use?
How can I tell the difference between them?

IYH Why is the title of most chats (on the left panel's roster) "🤖 Hello! I am a language model AI assistant."?

This implies that the system prompt of my assistants is not the fundamental prompt, but that there is an inbuilt base prompt run before my system prompt. Is this roughly correct, and if so, how do I change this base prompt for the Mistral LLM?

hgchattitles.PNG

@DYB5784 HF Chat has a Mistral 7B model set up with a system prompt for summarizing the first chat message into a title for the chat history log, so unless one explicitly addresses that in the first message, it is what it is, I guess. And we can always rename it. Still, I think it would have been awesome if we could customize the naming style/prompt ourselves.

Is openchat/openchat-3.5-0106 coming back? Was it removed to be upgraded?

Is openchat/openchat-3.5-0106 coming back? Was it removed to be upgraded?

It looks like they also removed the Meta models :(

Hope they add Command R instead of bringing those back, tbh.

Hope they add Command R instead of bringing those back, tbh.

What is Command R? I'm a newb.

Hope they add Command R instead of bringing those back, tbh.

What is Command R? I'm a newb.

Command R+ is a new LLM from Cohere that overtook GPT-4 on the Chatbot Arena leaderboard.

Hugging Chat org

hey!

On HuggingChat we aim to always propose a small selection of models which will evolve over time as the field of ML progresses forward 🔥

Stay tuned!

On Hugging Chat we aim to always propose a small selection of models which will evolve over time as the field of ML progresses forward 🔥
Stay tuned!

Yup, small models are lighter and more cost-friendly. Plus, now that HuggingChat has internet access, small models (Mixtral, Nous Hermes, etc.) can even perform better in many areas than many 70B models.
We are happy to see what's coming next 🔥🔥.

Hope they add command r instead of bringing those back tbh.

The Meta ones felt misaligned and gave a lot of refusals. The 70B code one would lecture and moralize even with nothing bad in the prompt.

I hope Llama 3 isn't as much of a misaligned mess.

The Meta ones felt misaligned and gave a lot of refusals. The 70B code one would lecture and moralize even with nothing bad in the prompt.

This is because they don't fine-tune them. Many times a fully fine-tuned Llama 7B is better than a non-fine-tuned Llama 70B.

Hugging Chat org
edited Apr 10

Cohere Command R+ is now on HuggingChat!

image.png

@victor Thank you for the new model! But if possible, I think a slight warning/notification about which model will be taken down would be very helpful to us!
Goodbye, OpenChat! It was really good for a 7B!

@victor Thank you for the new model! But if possible, I think a slight warning/notification about which model will be taken down would be very helpful to us!
Goodbye, OpenChat! It was really good for a 7B!

Agree with the 1st point

@victor Thank you for the new model! But if possible, I think a slight warning/notification about which model will be taken down would be very helpful to us!
Goodbye, OpenChat! It was really good for a 7B!

Agree

Disagree...

Cohere Command R+ is now on HuggingChat!

image.png

... Hey Victor, if you're gonna surprise us with new models like this, then you can remove anything you want without notifying anyone, not even Clement, xd.

But jokes aside, this is just great. If your adding/removal policy keeps up like this, in 3 months we will have Hugging Face assistants for everything: long context, coding, reasoning/creativity, etcetera.

Thanks a lot!!!

P.S.: I was expecting just Command R, but having Plus with the whole HF interface means I will be able to make a lot of assistants that in the past only worked decently as GPTs with GPT-4.

@victor Thank you for the new model! But if possible, I think a slight warning/notification about which model will be taken down would be very helpful to us!
Goodbye, OpenChat! It was really good for a 7B!

Agree

Disagree...

why bro

you can remove anything you want without notifying anyone

@Ironmole You're literally OK with leaving all the active chats abandoned, aren't you? What can we say here? But lots of users will be kinda saddened if active/hanging chats are suddenly no longer continuable. (I know they usually take down the models with the least traffic, so that's how it is, I guess.)
And it seems all the other assistants have been migrated to mistralai/Mixtral-8x7B-Instruct-v0.1.

Hugging Chat org

Yes we migrated all assistants with deprecated models to the default model, which at the time was Mixtral 8x7B!

Command R+ is really good

I'm worried that it's not gonna be free forever. Like, don't get me wrong, I have FULL faith in the Hugging Chat team; it's just that in my eyes this is a perfect replacement for ChatGPT. So I just need some reassurance it'll stay free.

I'm worried that it's not gonna be free forever. Like, don't get me wrong, I have FULL faith in the Hugging Chat team; it's just that in my eyes this is a perfect replacement for ChatGPT. So I just need some reassurance it'll stay free.

I think it'll stay free.
But if they have budget issues, they can integrate ads to keep it free forever,
and also introduce premium features (like premium-only models, a Pro badge, etc.).

Please leave that Command R Plus unquantized on HuggingChat; I'd even pay $30 a month for it. In my opinion it's perfect for translating. I would use it locally, but I don't have a server that could run the full model, and using quants would make the model worse.

I would like to pay $9 per month for longer context + a relaxed rate limit + unquantized usage of Hugging Face Chat.

Hope you guys keep HuggingChat free forever 🙏

As Hugging Face lets you host unlimited models, datasets, and Spaces for free,
I hope Hugging Chat will remain free as well.

A famous Hindi quote: "Umeed Pe duniya kayam Hai"

Translation: "The world is alive in hope."

Well, we'll see what happens in the future.

Upon closer inspection, it seems Nous-Hermes-2-Mixtral-8x7B-DPO is still a bit better than Command R Plus at translating from Chinese to English. It understands the meaning a bit more and, especially, writes it in a way that is far better to read. I wonder how good Mistral's new 8x22B instruct model is gonna be. Anyway, all the models are really good and have amazing uses! I hope we can access the ones that get released in the future too. Thank you very much for hosting them.

Hugging Chat org

Umeed Pe duniya kayam Hai

💯

@nsarrazin Will assistant creators get a choice in which model to migrate to? I think this should be an option, since recreating an assistant on another model is like starting anew.

a past comment:

  1. What will happen to an Assistant if a model is taken down? Migrate it to a new LLM with its context tokens + prompt, since we/bot authors can change the system prompt of our assistants anytime? Unlikely, I guess. Or we could have a migration system for our old chats.
  2. Shall there be a "View Sys Prompt" button just like in regular chats, beside/below the bot button? The assistant button at the top shows only the latest prompt, while the chat might have started with another prompt. (It doesn't really change the already active chat. Once, it recognized the changed system prompt upon me mentioning only a part of it.)

Is the Zephyr Mixtral 8x22B from Hugging Face coming soon?
zephyr-orpo-141b-A35b-v0.1

What happened to the OpenChat model? Why was it removed?

@SvCy 1. You can change to any LLM even after making a bot, including if its LLM was removed.

What happened to the OpenChat model? Why was it removed?

Because very few people were using it.

Hugging Chat org

We just released HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1 on HuggingChat!

image.png

Try it out here: https://huggingface.co./chat/models/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1

Hugging Chat org

Shout out to @nicksuh who called it early 😅

What happened to the OpenChat model? Why was it removed?

@Gerrytheskull Models come and go... nothing is permanent, sadly. Besides, OpenChat wasn't being used by that many users, I think. Plus, new models were added: Command R+ and now Zephyr.

@SvCy 1. You can change to any LLM even after making a bot, including if its LLM was removed.

@KingNish Oh, users can change LLMs after creation now? Sounds great, thanks for the info!

@nsarrazin Could you add a usage-over-time graph for each model on the model list page?
It would be more engaging and fun, plus new users could see what's trending.

@nsarrazin Could you add a usage-over-time graph for each model on the model list page?
It would be more engaging and fun, plus new users could see what's trending.

  • A feature like Assistant of the Week (like Spaces have Space of the Week)

Is it possible to add the WizardLM-2-8x22B model to the available models?

image.png

Wizard is a serious competitor to the current GPT-4.

image.png

Wizard is a serious competitor to the current GPT-4.

Wizard seems like a killer model! We would love to see it on HuggingChat.

Wizard seems like a killer model! We would love to see it on HuggingChat.

There is only one big problem: it has 141B parameters, which makes it slow.

There is only one big problem: it has 141B parameters, which makes it slow.

The CohereForAI/c4ai-command-r-plus model (104B params) runs at normal speed, so this one should run normally too. Additionally, the HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1 model, with 141B params, also works quickly and is available on HuggingChat.

@CmetankaPY Ohh, I forgot about them.

@CmetankaPY I found a discussion stating that Zephyr has only 35B active parameters:

https://huggingface.co./HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1/discussions/9

Did anyone notice that Zephyr 141B-A35B isn't nearly as good as Command R+, despite having more parameters? I also noticed that some smaller models perform way better than Zephyr 141B-A35B.

Did anyone notice that Zephyr 141B-A35B isn't nearly as good as Command R+, despite having more parameters? I also noticed that some smaller models perform way better than Zephyr 141B-A35B.

Because Zephyr has only 35B active parameters, not 141B.
Read this for more info: https://huggingface.co./HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1/discussions/9
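In rough terms, a mixture-of-experts model routes each token through only a few of its experts, so per-token compute scales with the active parameters while memory (and hosting cost) scales with the total. A toy back-of-the-envelope sketch; the shared/expert split below is invented purely for illustration, not Zephyr's actual architecture:

  // Toy illustration of total vs. active parameters in a mixture-of-experts model.
  // The split below is hypothetical, chosen only so the arithmetic lands near
  // the advertised ballpark.
  const sharedParams = 5e9;   // attention/embeddings, used by every token
  const expertParams = 17e9;  // one feed-forward expert
  const numExperts = 8;       // all experts must be held in memory
  const activeExperts = 2;    // experts actually consulted per token

  const totalParams = sharedParams + numExperts * expertParams;      // drives memory use
  const activeParams = sharedParams + activeExperts * expertParams;  // drives per-token compute

  console.log(`total ${totalParams / 1e9}B, active ${activeParams / 1e9}B per token`);
  // -> total 141B, active 39B per token (the model's name advertises ~35B active;
  //    the exact split differs)

That's why it can respond about as fast as a ~40B dense model while still needing the memory of a 141B one.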

Please add AI-generated images.

Please add AI-generated images.

You can generate images in chat using Pollinations.

Some examples:
https://hf.co/chat/assistant/6612cb237c1e770b75c5ebad
https://hf.co/chat/assistant/65bff23f5560c1a5c0c9dcbd

Hugging Chat org

🚨 Meta Llama 3 70B is now available on Hugging Chat!

GLdkE2cXoAA5Y_X.jpeg

Let us know what you think about this one!

This comment has been hidden

Llama 3 seems great, but I expected it to beat GPT-4 😅. So far I can't see any open-source model that comes close to Command R+'s performance.

Llama 3 seems great, but I expected it to beat GPT-4 😅. So far I can't see any open-source model that comes close to Command R+'s performance.

Wizard beat Command R+ and is even a very good competitor to ChatGPT.

image.png

Llama 3 seems great, but I expected it to beat GPT-4 😅. So far I can't see any open-source model that comes close to Command R+'s performance.

Wizard beat Command R+ and is even a very good competitor to ChatGPT.

image.png

I believe Wizard will be the new open-source king, but I can't find it anywhere; I think Microsoft deleted it for some reason.

Hugging Chat org

I believe Wizard will be the new open-source king, but I can't find it anywhere; I think Microsoft deleted it for some reason.

what did Satya see

I believe Wizard will be the new open-source king, but I can't find it anywhere; I think Microsoft deleted it for some reason.

image.png

Hope to see Wizard on Hugging Face.

Hey victor, could you adjust the repetition penalty for Llama? I'm trying to do some creative writing, but it literally gives me the same output every time I retry.

Hey victor, could you adjust the repetition penalty for Llama? I'm trying to do some creative writing, but it literally gives me the same output every time I retry.

Just do it yourself from the advanced settings below the model's name.

Hey victor, could you adjust the repetition penalty for Llama? I'm trying to do some creative writing, but it literally gives me the same output every time I retry.

Just do it yourself from the advanced settings below the model's name.

IMG_0214.jpeg

This is all I see

This is all I see

Click "Create new assistant" and then you will be able to see it.

image.png

deleted
edited Apr 19

The quality of cognitivecomputations' Dolphin-Mistral/Mixtral is much better than that of Nous-Hermes, so it may be a more suitable choice. I also used them in my own local Ollama, until Command-R+ changed the game.

P.S. Llama 3 is so bad for my use. It is not even as good as the quantized versions of the above two models.

deleted
edited Apr 19

I just checked the model configuration of Command-R-Plus and noticed that the context window is limited. Is it because of cost considerations? If so, I hope you add a Q4 version with 128K-context-window support; it should also be much faster.

I just checked the model configuration of Command-R-Plus and noticed that the context window is limited. Is it because of cost considerations? If so, I hope you add a Q4 version with 128K-context-window support; it should also be much faster.

But what about quality? Quantization decreases quality a lot.

deleted
edited Apr 19

I just checked the model configuration of Command-R-Plus and noticed that the context window is limited. Is it because of cost considerations? If so, I hope you add a Q4 version with 128K-context-window support; it should also be much faster.

But what about quality? Quantization decreases quality a lot.

Then Q8, with extremely low temperature, top_p, and top_k? In any case, the quality of Command-R+ surpasses most models.

In addition, the impact of quantization on quality is not so devastating. The latest research can even quantize down to 1-bit while achieving nearly the same quality as no quantization.

Detailed review of Llama 3 70B:

Coding: 8/10

Capability: Llama 3 is capable of generating code snippets in various programming languages, including Python, Java, C++, and JavaScript. It can also help with code completion, debugging, and optimization.

Limitation: While it can generate code, it may not always be correct or efficient. It may also struggle with complex algorithms or nuanced programming concepts.

Example: I asked Llama 3 ten complex coding questions. It generated correct solutions for 9, but some of them were not the best possible.

Creative Writing: 9/10

Capability: Llama 3 is capable of generating creative writing, including stories, poetry, and dialogues. It can understand context, tone, and style, and produce writing that is engaging and coherent.

Limitation: While it can generate creative writing, it may lack the nuance and depth of human-written work. It may also struggle with complex themes or abstract concepts.

Example: I gave it 10 creative story generation tasks. It generated engaging and well-structured stories, but they lacked the emotional depth and complexity of human-written work.

Multiple Languages: 8.5/10

Capability: Llama 3 is capable of understanding and generating text in multiple languages, including English, Hindi, Chinese, Japanese, Spanish, French, German, Italian, and many others. It can also translate text from one language to another.

Limitation: While it can understand and generate text in multiple languages, it may not always be perfect in terms of grammar, syntax, or idiomatic expressions.

Example: I gave Llama 3 ten paragraphs in different languages to translate. It generated accurate translations, but they lacked the emotion, nuance, and cultural context of a human translation.

General Knowledge: 9/10

Capability: Llama 3 has a vast knowledge base and can answer questions on a wide range of topics, including history, science, technology, literature, and more.

Limitation: While it has a vast knowledge base, it may not always be up-to-date or accurate. It may also struggle with abstract or nuanced concepts.

Example: I asked Llama 3 ten different complex GK questions. It generated accurate and informative responses, but they lacked depth and nuance.

Maths: 6.5/10

Capability: Llama 3 is capable of solving mathematical problems, including algebra, geometry, calculus, and more. It can also help with mathematical concepts and theories.

Limitation: While it can solve mathematical problems, it may not always be able to explain the underlying concepts or find an efficient approach, and it many times gives wrong solutions.

Example: I asked Llama 3 to solve 10 complex high school problems. It generated correct solutions for only 6; in 1 it followed the right method halfway, and the remaining 3 were plainly incorrect.

Internet Search: 8/10

Capability: Llama 3 can search the internet and provide relevant information on a wide range of topics. It can also help with finding specific information or answering complex questions.

Limitation: While it can search the internet, it may not always be able to evaluate the credibility or accuracy of the sources it finds.

Comparison with other models:

Llama 2
Llama 3 is a significant improvement over LLaMA 2 in terms of its capabilities and performance. It has a more advanced language model, better understanding of context and nuance, and improved generation capabilities. It is also more knowledgeable and accurate in its responses.
.
.
.
(More to be added)
.
.
.
Overall, Meta-Llama-3-70B-Instruct is a powerful and versatile language model that can perform a wide range of tasks and answer complex questions. While it has its limitations, it is a significant improvement over previous language models and has the potential to revolutionize the field of natural language processing.
.....................................................................................................
If you liked this review and want reviews of more models, give a thumbs up 👍

deleted
edited Apr 22

Detailed review of Llama 3 70B:

Please do not use LLM-style plausible-sounding nonsense to describe the model's performance, thank you!

Note: Why do I think Dolphin performs better?

  • System-prompt-free cross-language capabilities. When communicating in Chinese, Llama (1/2/3) or vanilla Mistral 7B must be induced with system prompts to spit out fragmented Chinese. Nous-Hermes, CR+, and the Dolphin series do not have this problem.
  • Uncensored. Dolphin will never reject you.
  • It even has a programming-specialized version based on StarCoder2.

Detailed review of Llama 3 70B:

Please do not use LLM-style plausible-sounding nonsense to describe the model's performance, thank you!

I wrote this entirely by myself, and you're claiming it's nonsense generated by an LLM.

The repetition penalty for Llama 3 needs to be higher

I think we should add Dolphin, as it's a good model

I noticed that current chats are not being named. Can we assume this is being worked on for now?

Do you plan to release mistralai/Mixtral-8x22B-Instruct-v0.1 to the chat? meta-llama/Meta-Llama-3-8B-Instruct could also be great.

Do you plan to release mistralai/Mixtral-8x22B-Instruct-v0.1 to the chat? meta-llama/Meta-Llama-3-8B-Instruct could also be great.

Yeah, the 8x22B Instruct is AMAZING. I'd like to use it in the chat too.

deleted
edited Apr 22

Do you plan to release mistralai/Mixtral-8x22B-Instruct-v0.1 to the chat? meta-llama/Meta-Llama-3-8B-Instruct could also be great.

Yeah, the 8x22B Instruct is AMAZING. I'd like to use it in the chat too.

Command-R-Plus is already overloaded there. Is 8x22B really a reasonable choice? Llama 3 8B could replace Mistral 7B as the default configuration; it's broken now anyway.
IMG_8372.jpeg

Are all the models that come and go on HuggingChat open source?

Hugging Chat org

Are all the models that come and go on HuggingChat open source?

yes sir

[New Model REQUEST] MTSAIR/MultiVerse_70B

This model outperforms Command R+, Llama 3 70B, and many more on the Open LLM Leaderboard.
As Command R+ is facing many issues, this model is a great alternative to it,
and it has only 70B parameters.
It is currently the #1 chat model on the Open LLM Leaderboard.

image.png

License - https://huggingface.co./MTSAIR/MultiVerse_70B/discussions/7#66278c8e430a12425331b183

Model Link - https://huggingface.co./MTSAIR/MultiVerse_70B

👍 to support this model.
(The Hugging Face team adds models based on community demand.)

deleted
edited Apr 23

[New Model REQUEST] MTSAIR/MultiVerse_70B

It is based on Alibaba's Qwen 72B, which means it is under severe censorship. Test scores sometimes don't make sense.

I suggest that Chinese models be treated with caution. They never disappoint in terms of overfitting and Chinese political censorship.

Conclusion: You'd better try this model before recommending it. Their Space is broken. On the other hand, quantizing Command-R+ or replacing it with the 35B Command-R is still a cost-effective choice.

deleted
edited Apr 23

For a full replacement, I would recommend this list of models:

  1. Command-R/Command-R+_Q6 or Q8
  2. Llama3 70B and subsequent versions with larger parameters
  3. Llama3 8B as a representative of small models and TASK_MODEL
  4. Phi-3-mini, can also be used as TASK_MODEL
  5. Dolphin/Nous-Hermes Mixtral 8x7B
  6. Anything else you want to add, such as Mistral-OpenOrca, Dolphin-Mistral, Qwen1.5... not including vanilla Mistral, Mixtral 8x7B, or Gemma, though Mixtral 8x22B is acceptable (better deployed with Q6).

*All the above quantization suggestions are based on llama.cpp and the GGUF format.

I suggest that Chinese models be treated with caution. They are never disappointing in terms of overfitting, just like their students.

@Mindires Hey, please treat every country and individual with respect. This is a community platform, so please do not spread hate or anything similar.

“Everybody is a genius. But if you judge a fish by its ability to climb a tree, it will spend its whole life believing that it is stupid.” – Albert Einstein

[New Model REQUEST] Microsoft/WizardLM-2

This model outperforms Command R+, Llama 3 70B, Mixtral 8x22B, and many more,
and gives tough competition to Claude 3, Gemini Ultra, GPT-4, etc.

image.png
image.png

License - Apache 2.0

Model Link - https://huggingface.co./alpindale/WizardLM-2-8x22B [Unofficial] (official version to be added soon)

👍 to support this model.
(The Hugging Face team adds models based on community demand.)

[New Model REQUEST] Microsoft/WizardLM-2
-snip-

The legality of that is questionable, since Microsoft took it down.

[New Model REQUEST] Microsoft/WizardLM-2
-snip-

The legality of that is questionable, since Microsoft took it down.

It's not legally questionable. They released the model under the Apache 2.0 license, so anyone with a copy of the model can use, modify, and distribute it according to the license terms.

@EveryPizza Microsoft removed WizardLM-2 because it was uncensored.
So they will post it again soon.

image.png

Microsoft removed WizardLM-2 because it was uncensored.

So they will censor it and release it again

deleted

Microsoft removed WizardLM-2 because it was uncensored.

So they will censor it and release it again

It's been a few days, and the censored version has now been released.

Review of Phi-3 Mini 4k Instruct:

Coding: 8.5/10

Capability: As Phi-3 is fine-tuned on high-quality GPT-4 data, the performance is truly magical for its size of just 3.8B. It excels in code completion, debugging, and optimization tasks, making it a valuable tool for developers.

Limitation: Phi-3 may occasionally produce code that is not optimal or entirely correct. It can encounter difficulties with complex algorithms or intricate programming concepts that require deep domain expertise.

Example: When given 20 complex coding questions, Phi-3 delivered correct solutions for 19. However, some solutions were not the most efficient or elegant. Still, it outperforms ChatGPT 3.5 (the free version).

Creative Writing: 9/10

Capability: Phi-3 has a strong capability for creative writing, crafting stories, poetry, and dialogues with a clear understanding of context, tone, and style. Its outputs are engaging.

Limitation: It's creative, but sometimes it doesn't hit the feelings or the depth of something a person would write, especially with complex or deep themes.

Conclusion: Because of the GPT-4 dataset, it is greatly advanced in creative writing.

Multiple Language Proficiency: 7/10

Capability: Phi-3 is capable of understanding and generating text in numerous languages, including English, Hindi, Chinese, Japanese, Spanish, French, German, Italian, and more.

Limitation: While Phi-3 is proficient in multiple languages, there are many lapses in grammar, syntax, or idiomatic expressions, which can detract from the authenticity of the text.

Example: Phi-3 translated 20 paragraphs from various languages with high accuracy. However, the translations many times missed the emotion and meaning of the text.

General Knowledge: 9/10

Capability: Phi-3 has a lot of knowledge relative to its size (it outperforms all 7B and 13B models, many 30B models, and some 70B models).

Limitation: Because its size is small, Phi-3's information may not always be current or completely accurate. It can also struggle with detailed discussions of historical topics.

Example: Phi-3 was asked various GK questions. It provided accurate and informative responses but occasionally lacked depth (the reason being its size).

Mathematics: 7/10

Capability: Phi-3 is proficient in solving mathematical problems, including those in algebra, geometry, calculus, and beyond. It can assist with understanding mathematical concepts and theories.

Limitation: Phi-3 may not consistently explain the underlying concepts clearly or choose the most efficient methods, and it can sometimes provide incorrect solutions.

Example: Phi-3 was tasked with solving 20 complex high school mathematics problems. It correctly solved 13, partially followed the right method for 3, but the remaining 4 were incorrect.

Internet Search: 8.5/10

Capability: Phi-3 can effectively search the internet to provide relevant information on a wide array of topics. It can assist in locating specific details or answering intricate questions.
....................................................................................................

Some useful Tips

  1. Phi-3 + internet > GPT-3.5
  2. Phi-3 is currently the best model for local AI.

....................................................................................................

Comparison with other models:

Compared to Phi-2, Phi-3 represents a significant leap in handling complex tasks such as coding, mathematics, general knowledge, and creativity. It demonstrates an advancement in language model capabilities, offering a more sophisticated understanding of context and delivering highly knowledgeable and accurate responses.
....................................................................................................

Overall:

Phi-3 is a magical model. We can see a vast difference between it and its competitors. It surpasses all 7B models and nearly all 13B models in performance. Eagerly waiting for the release of Phi-3 7B and 13B.

....................................................................................................

Thanks to Microsoft for this high-quality model, and to the Hugging Chat team for making it available for free on HuggingChat.

Fun fact: the HuggingChat team is so busy that they even forgot to officially announce 😅 that Phi-3 is available on HuggingChat.
So, here is the link, go check it out -> https://huggingface.co./chat/models/microsoft/Phi-3-mini-4k-instruct

......................................................................................................

If you find this review helpful and would like more reviews of similar models, please let me know! 👍
You can follow me to get notified about the next model review.

See you in the next review 🤗

[New Model REQUEST] Microsoft/WizardLM-2

I created a demo of the WizardLM 2 7B model in a Space.
Check it out: https://huggingface.co./spaces/KingNish/WizardLM-2-7B

While many community members request models based on Open LLM Leaderboard scores, I believe the mods of this community also keep an eye on the board. If a model seems a fit, they will hopefully add it. We all want the best models to be present in Hugging Face Chat.

I'm starting to face issues with Command R+; it's starting to hallucinate badly, doesn't follow requests properly, and gives one-word lazy answers even when I explicitly tell it to provide in-depth, expanded responses in the system prompt.

Here we can discuss the models available on HuggingChat.

image.png

Is there a way to select a model other than the ones listed? Or is there another UI someone could suggest for deploying a model I fine-tuned myself? Thanks!

How can I add a new model by myself?

Hugging Chat org

How can I add a new model by myself?

By using chat-ui directly: https://github.com/huggingface/chat-ui
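Roughly, from memory (double-check against the repo's README): clone the repo, create a .env.local file with at least a MONGODB_URL and a MODELS entry, then run npm install and npm run dev. If your model is served behind an OpenAI-compatible API, I believe the entry can point at a custom endpoint along these lines (field names as I understand chat-ui's config; the model name and URL are placeholders):

  MODELS=`[
    {
      "name": "my-fine-tuned-model",
      "endpoints": [{ "type": "openai", "baseURL": "http://localhost:8000/v1" }]
    }
  ]`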

This comment has been hidden
Hugging Chat org

This is not the right place to post this @zoyahammad (here we discuss models on HuggingChat)

Llama 3 has a variant with a 1M+ token context. Is it possible to add this model to the available chat models?
https://huggingface.co./gradientai/Llama-3-8B-Instruct-Gradient-1048k

What about a 'community models' section, where HuggingChat would display the best Spaces running good models and use them?

How can we add new models? IBM just released a new set of open-source models. I'd like to see them here too!

@CosmicSound Someone asked the same question before, and the answer was to open a pull request on the GitHub repo for chat-ui.

Why does it show that this discussion is "paused"?

So we won't be seeing WizardLM-2 8x22B on HuggingChat anytime soon?

We need a list of alternatives to HuggingChat, so that if a model can't be found on here, it can be found somewhere else...

deleted

zephyr-orpo-141b-A35b-v0.1 not responding? Any details on its status?

edit: fixed
Did 01-ai/Yi-1.5-34B-Chat switch to Chinese completely? It was in English before.
It even responds in Chinese.
image.png

Please see this conversation using microsoft/Phi-3-mini-4k-instruct:
https://hf.co/chat/r/7g1o5NL

Please add Smaug 70B, a fine-tuned version of Llama 3.

This comment has been hidden

Guys, since this morning HuggingChat has been acting weird: most of the time it keeps searching for an answer, and it is not performing web search like it did a few days back.

Mistral 7B v0.3 should be a no-brainer: it adds native function-calling capabilities and is, as far as I understand, compatible with and higher quality than v0.2.

Please add the following model to the list of available models: https://huggingface.co./Bin12345/AutoCoder

Please replace Phi-3-mini with Phi-3-medium-128k.

https://huggingface.co./microsoft/Phi-3-medium-128k-instruct

If I want to set up an assistant oriented toward a specific subject, the application of labor law in my company, how do I proceed?
The goal is to reference a set of documents related to collective agreements, which are in PDF or Word format. What is the limit on document size, and where do I upload the files in order to reference them?

CohereForAI/c4ai-command-r-plus gets very slow and basically unusable for me after 2 - 3 requests. It only shows the three dots after I send my message but never actually seems to generate a reply. Is this expected?

Having Codestral by Mistral AI available on HuggingChat would be really great. It's a super speedy code model with 22B parameters, and it has a larger context window for larger codebases. Since the departure of CodeLlama we haven't had a coding model on HuggingChat, and Codestral would fit that bill perfectly.

@Smorty100 Codestral does not allow hosting/running it like that. It has a non-production research license.

https://huggingface.co./deepseek-ai/DeepSeek-Coder-V2-Instruct
is the SOTA open-source model for coding, per the LMSYS leaderboard.

Are you going to add any of Nvidia's new models?

Hugging Chat org

https://huggingface.co./deepseek-ai/DeepSeek-Coder-V2-Instruct
is the SOTA open-source model for coding, per the LMSYS leaderboard.

We are looking at it :) cc @nicolas @olivierdehaene

I would like to express my sincere gratitude to the team for your exceptional work in providing accessible and open-source AI chatbot options.

I believe that integrating the Qwen2-72B-Instruct or Qwen2-7B-Instruct model would be highly beneficial. During my testing, I found that it excels in Thai language processing, delivering remarkably high-quality results.

I hope the team will consider incorporating these models into the HuggingChat service. Thank you once again for your dedication.

Looks like gemma-2-27b-it is broken. Maybe you are using a wrong chat template or something?

Hugging Chat org

We are currently investigating it @kristaller486 (it's a bit complex) cc @nsarrazin

Does anyone know what happened to the Zephyr model? It was the biggest, but it was suddenly gone. What happened to it?

deleted

Does anyone know what happened to the Zephyr model? It was the biggest, but it was suddenly gone. What happened to it?

Also curious

Is it possible to add "LLM Compiler FTD", the new coding model?

The Zephyr model is gone, any idea why? It was my fav. I tried looking around for updates; there's nothing on it, and no other sites host Zephyr chat either.

The Zephyr model is gone, any idea why? It was my fav. I tried looking around for updates; there's nothing on it, and no other sites host Zephyr chat either.

@victor @nsarrazin Yes, HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1 is an amazing model. I am sad that it was removed. Any plans to bring it back??

It was probably the best model overall, since it was uncensored and had good responses. I enjoyed using it.

Hugging Chat org

We try to rotate models from time to time, to showcase the latest releases from the community. We might keep models longer if they have high usage but since this was not the case for this Zephyr model, we opted to rotate it out in favor of Gemma 2.

If there's high demand from the community for a model we can consider adding it, so let us know if that's the case!

I hope the model comes back; it was so far the most convenient one to use compared to the others.

@nsarrazin Right now Gemma 2 seems to be missing. Is this some sort of licensing issue, or did something go down internally perhaps?

@nsarrazin Let's add usage-per-week graphs!

If there's high demand from the community for a model we can consider adding it, so let us know if that's the case!

I think a better approach would be to integrate the most performant and powerful models according to benchmarks, and to keep models that excel at particular tasks, like Command R+ for natural-language tasks, for example. That would be a far better approach than adding models just by demand.

Is it just me, or is Command R+ not working?

Is it just me, or is Command R+ not working?

R+ stopped working on my account too.

Is it just me, or is Command R+ not working?

R+ stopped working on my account too.

So it must be having problems. I hope they see this and fix it.

We try to rotate models from time to time, to showcase the latest releases from the community. We might keep models longer if they have high usage but since this was not the case for this Zephyr model, we opted to rotate it out in favor of Gemma 2.

If there's high demand from the community for a model we can consider adding it, so let us know if that's the case!

Please never remove Command R+. It's the best one you've ever had and it should be permanent.

I don't think they're going to remove Command R+ (even though at the moment it's quite buggy), but I think having another model with a large context window and good reasoning (like Qwen2 or maybe Llama-3-70b with expanded context window) would be a nice thing.

Is Command R+ currently working?

Hugging Chat org

Command R+ is now up (it was down for a few hours).

Thank you very much 😊

I mean, having a demand system would be kind of a bummer. I liked Zephyr because I used it for "what if" scenarios, but since it's low-demand, it's underrated, tbh.

You can chat with the Gemma 27B Instruct model on Hugging Chat! Check out the link here: https://huggingface.co./chat/models/google/gemma-2-27b-it.

Gemma 2 Not Found

@victor Currently Gemma is still not available on HuggingChat, but I do remember it being on here some days ago. Is it gonna be back up again soon?
Screenshot_20240703-195622.png

Hugging Chat org

@victor Currently Gemma is still not available on HuggingChat, but I do remember it being on here some days ago. Is it gonna be back up again soon?

Yes, sorry, we had a technical issue with the model; we'll try to put it back once it's fixed.

Why can't I upload a file to meta-llama/Meta-Llama-3-70B-Instruct, or to any other model except CohereForAI/c4ai-command-r-plus?

@Dalija Only Command R+ has those tools implemented for now, but Llama 3 is likely next on the list.

@Dalija Only Command R+ has those tools implemented for now, but Llama 3 is likely next on the list.

Command R+ has really good grounding capabilities compared to all other models

@victor Currently Gemma is still not available on HuggingChat, but I do remember it being on here some days ago. Is it gonna be back up again soon?

Yes, sorry, we had a technical issue with the model; we'll try to put it back once it's fixed.

Meanwhile, can we get zephyr-orpo-141b-A35b-v0.1 back @victor, if possible? It was really good.

Can any of them do NSFW? Just curious. Just say no if they can't; please don't be mean.

I want to leave some ideas on the choice of some models on HuggingChat.
 
For the Nous Research models, they released two new models recently: Hermes 2 Pro 70B and Hermes 2 Theta. I am not sure which is better, but I think either or both of them should replace Nous-Hermes-2-Mixtral-8x7B-DPO.
 
For the Mistral models, I don't see the point of keeping Mixtral 8x7B if there's Mixtral 8x22B with all of its fine-tuned variants. And if Mistral 7B is planned to be kept, it should be upgraded to v0.3.
 
For the Microsoft models, I think that Phi-3 mini is just pointless; it's a very small model that could potentially run on mobile devices, so why not just add Phi-3 medium, which is the best of the Phi-3 family so far?
 
For Google models, Gemma-2-27B is the best they've got.
 
I would love to also suggest some new families of models by different organizations:
 
Nvidia has released its Nemotron-4-340B. It seems like a very good and powerful model, but it's very large and very costly, so it's understandable why you wouldn't consider adding it.
 
There's also DeepSeek-Coder-v2, which is the best coding model as far as I know.
 
Alibaba is so active in releasing good models, including their most recent Qwen-2-72B, which is a very good model.

I want to leave some ideas on the choice of some models on HuggingChat.
 
For the Nous Research models, they released two new models recently: Hermes 2 Pro 70B and Hermes 2 Theta. I am not sure which is better, but I think either or both of them should replace the Nous-Hermes-2-Mixtral-8x7B-DPO.
 
For the Mistral models, I don't see the point of keeping Mixtral 8x7B if there's Mixtral 8x22B with all of its fine-tuned variants. And if Mistral 7B is planned to be kept, it should be upgraded to v0.3.
 
For the Microsoft models, I think that the Phi-3 mini is just pointless; it's a very small model that could potentially run on mobile devices, so why not just add the Phi-3 medium, which is the best of the Phi-3 family so far?
 
For Google models, Gemma-2-27B is the best they've got.
 
I would love to also suggest some new families of models by different organizations:
 
Nvidia has released its Nemotron-4-340B. It seems like a very good and powerful model, but it's very large and very costly, so it's understandable why you wouldn't consider adding it.
 
There's also DeepSeek-Coder-v2, which is the best coding model as far as I know.
 
Alibaba is very active in releasing good models, including their most recent Qwen-2-72B, which is a very good model.

I agree

I believe both DeepSeek-V2 and DeepSeek-V2-Coder are very good ;)

I can't access it [502 Bad Gateway]. God help me.

nothing

hugging chat error.png

Hi, I receive an error while trying to interact with Command R+.

Hugging Chat is currently not working on my network either. There may be something wrong with the server.

Hugging Chat org

Is it still the case? It seems to work well for me.

Is it still the case? It seems to work well for me.

No, it has been resolved and is working fine now.

Llama 3 400B will be released on 23 Jul, so add it as soon as it's released, since the current models aren't as good as required!

WTF is with the removal of Mixtral-8x22b?

Llama 3 400B will be released on 23 Jul, so add it as soon as it's released, since the current models aren't as good as required!

That is a good suggestion, but Llama 3 400B is a rather huge model to run; you'd require a good number of H100s.

Llama 3 400B will be released on 23 Jul, so add it as soon as it's released, since the current models aren't as good as required!

That's why HuggingChat is more of a curiosity, suitable for simple applications. At the moment, none of the models here even comes close to the current state of the art. For example, Command R+ makes mistakes in Python, and its reasoning is weak, even compared with DeepSeek (not referring to the Coder model). What Claude 3.5 Sonnet understands without any problem, none of the models here can grasp. If HuggingChat is to be something cooler, larger OP models will unfortunately need to be implemented. However, I'm not sure what the target group is here ;)

Will there be notifications before removing a model? I hope they never remove Command R+; I'm relying on it a lot. Could there be a way to keep old models as well, or to customize the models on the HuggingChat page?

Why is Gemma 2 27B still not available? The downstream bugs should have been fixed by now.

Why is Gemma 2 27B still not available? The downstream bugs should have been fixed by now.

Maybe because Gemma 2 27B does not support a system prompt, so people can't make custom assistants; also, Gemma 2 27B replies randomly when web search is on.

This comment has been hidden

Has Command R+ stopped working?

Hi @victor @KingNish, Command R+ has stopped working. Can you guys please take a look into it? Thanks.

Hugging Chat org

We're looking into it :)

Hugging Chat org

Any chance Claude 3.5 Haiku could be added in the future? Or other small models of similar intelligence?

This comment has been hidden

Please add Mistral-Nemo, or replace Mistral with it.

https://huggingface.co./mistralai/Mistral-Nemo-Instruct-2407

I was going to say that. I tested it and found it very interesting for creative content, and it seems like it's not that expensive to run.

This comment has been hidden

I have a question about the context window of the newly added Llama 3.1 models. How come the largest, 405B-parameter model has a 14k context window, while the smaller 70B-parameter model only has 7k? Hell, even Command-R-Plus was only limited to 28k, and that model has 104B parameters.

I would be happy to use Llama-3.1-70B, but only if it had more context than it does now. Otherwise I can't use it, because my system prompt alone is over 7k tokens.

I just tried the 405B Llama 3.1 model (because it at least can fit my system prompt) and, as expected, it's slow. Too slow for me to bother with. Please increase the allowed context window of the 70B model to 20-30k.

Where do I get updates on models leaving and joining HuggingChat?

Where do I get updates on models leaving and joining HuggingChat?

I just now found out that Llama 3.1 had been added.

Where do I get updates on models leaving and joining HuggingChat?

no place yet

Requesting you to increase the context size for Llama 70B; 7k tokens is too limiting.

It depends on what you want to use. In Perplexity, for example, everything works quickly, and all models have a 32k context window. The cost is $20 per month. You either use it for free and accept lower quality (HuggingChat is okay, but it's free, so we shouldn't expect them to provide unlimited hardware resources for everyone), or you pay and get a significantly higher-quality service (like Perplexity).

It seems that Llama 3.1 70B is actually the more significant model here. Given its size, the model is excellent and performs outstandingly well in many applications. On the other hand, Llama 3.1 405B is so overwhelmed with requests that it's currently almost impossible to get a response from it.

Hugging Chat org

It seems that Llama 3.1 70B is actually the more significant model here. Given its size, the model is excellent and performs outstandingly well in many applications. On the other hand, Llama 3.1 405B is so overwhelmed with requests that it's currently almost impossible to get a response from it.

Yes 100% Llama 3.1 70B is the real deal here.

I tried the 70B and honestly it was pretty nice until it errored on me. The error only said "An error has occurred" and nothing else... Is HuggingChat down or something?

Hugging Chat org

Did you retry @Noxi-V ? Works well for me atm.

@nsarrazin nope

image.png

Hugging Chat org

@Noxi-V would you feel comfortable sharing that conversation with me ? (button at the bottom right) so that I could take a look ?

@nsarrazin https://hf.co/chat/r/ApE8SRK
It's just a test to see if it can do fictional battles; it did well, funnily enough.

Hugging Chat org
edited Jul 25

Nice, seems like it works for me (https://hf.co/chat/r/jpbjsuT) when using the retry button. Could it have been a transient error?
Screenshot from 2024-07-25 16-15-45.png

I wish we had an uncensored model like Command R or Zephyr. Llama is fine and great, but it's censored and needs a lot of prompting to get it to work.

@nsarrazin Well, oddly enough, for me it's only that one chat for now. I used the retry button and even just typed into the bar, but it kept not working for me.
I just deleted the chat, since it's an isolated error, for me at least.

Any plans to add mistralai/Mistral-Large-Instruct-2407?

P.S. Thank you for Llama-3.1-405B, it's a game changer. I don't mind the speed if it can replace GPT/Claude for complex work 🚀

Just one request: please let us know here when you add a new model to HugChat; it would be wonderful!

Please, please add the TTS mode. I've been asking about it here without actually spamming. You guys keep telling me it's going to be the next thing you implement, but no such luck...
The messages area of the UI isn't as accessible as it could be. When the messages pile up, there isn't any graphic or separator leading back to the start of the last message, which means navigation gets tedious very fast, especially if the messages in question are long.
For that, please implement a TTS mode like the one at pi.ai, which reads the incoming message after it stops updating.
Or you could add a separator before every message the model sends, like the one found at deepinfra.com/chat, where every bot message is preceded by a graphic of the model in question; I could press shift+g to reach that with NVDA, then just down-arrow to read the message without having to press page-up and find the tail end of a previous message.
Or, ideally, both features.
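For what it's worth, the reading half of this seems doable with the browser's built-in speech API alone. A rough sketch of the behavior I mean, hypothetical and not HuggingChat's actual code, using the standard SpeechSynthesis API:

  // Hypothetical sketch: read a bot message aloud once it stops streaming.
  // Uses only the standard Web Speech API available in browsers.
  let settleTimer: ReturnType<typeof setTimeout> | undefined;

  function onMessageUpdated(text: string) {
    // Each streamed update resets the timer; if nothing arrives for
    // 1.5 seconds, treat the message as finished and speak it.
    if (settleTimer) clearTimeout(settleTimer);
    settleTimer = setTimeout(() => {
      speechSynthesis.cancel(); // stop any earlier readout
      speechSynthesis.speak(new SpeechSynthesisUtterance(text));
    }, 1500);
  }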

@nsarrazin So rather than generating one word at a time, the model is printing the whole response all at once, which makes it feel like it is taking a while to generate. I faced this with Command R+ as well as Llama 3.1 70B.

Hugging Chat org

@acharyaaditya26 Do you have "Disable streaming tokens" enabled by any chance in https://huggingface.co./chat/settings?

@acharyaaditya26 Do you have "Disable streaming tokens" enabled by any chance in https://huggingface.co./chat/settings?

Yes, it was. Thanks.

I think Gemma 2 27B would be a very good and appreciated addition ;) Q8 or even Q4.

Any plans to add Athene 70B, a rather impressive fine-tuned model? It has significantly better performance than the gigantic Llama 3.1 405B on arena-hard-auto. It's also better at multilingual tasks.

Thank you!

Why are Mistral Nemo and Mistral Large 2 still not available? These models support tools.

remove this
Screenshot_2024-08-12-19-36-00-281-edit_com.opera.browser.jpg

This comment has been hidden

Does anyone have any prediction on how long Llama 3.1 405B is gonna be this overloaded and heavily used? Because right now it's just useless for anyone unfortunate enough not to get a response.

Does anyone have problems with CohereForAI too? Like no generations?

Does anyone have problems with CohereForAI too? Like no generations?

Yesss, I am facing the same problem.

Did you see that they released Hermes 3? Any plans to add it on HugChat?

Did you see that they released Hermes 3? Any plans to add it on HugChat?

There are 8B, 70B, and 405B versions. It's like Llama 3.1 but without censorship, that is, less limited.

CohereForAI works again, but why does every AI generate slower when using an old phone? In my case, painfully slow.

CohereForAI works again, but why does every AI generate slower when using an old phone? In my case, painfully slow.

Does turning on "Disable streaming tokens" in the options fix it? For some reason streaming takes a lot of CPU power, and thus, even if the AI is done generating the response, the website will keep streaming little bits of data to your device, wait for them to be displayed, and then send a little more, until it's all done.

CohereForAI works again, but why does every AI generate slower when using an old phone? In my case, painfully slow.

Does turning on "Disable streaming tokens" in the options fix it? For some reason streaming takes a lot of CPU power, and thus, even if the AI is done generating the response, the website will keep streaming little bits of data to your device, wait for them to be displayed, and then send a little more, until it's all done.

Thank you. I even found out that when activating "Disable streaming tokens", you don't have to wait until the text is finished: you can just click "stop generating" and it will show you the whole generation immediately.

Guys, it hurts me when I read those demanding and sometimes rude comments of yours. It's a great free service and I love it; I believe we could really try to be human here.

Guys, it hurts me when I read those demanding and sometimes rude comments of yours. It's a great free service and I love it; I believe we could really try to be human here.

I hope you don't mean my comment

The Llama 3.1 405b model has been running slowly on HuggingChat.

Mistral Large 2

Just curious: why hasn't Mistral Large 2 been added to Hugging Chat yet? I assume it's due to the "non-production" license, but I'm not a lawyer.
If the license allows it, it would be far less demanding than Llama-3.1-405B, being about the same size as Command-R-Plus.

mistralai/Mistral-Large-Instruct-2407


Hermes 3

I understand if the plan is to stick with Meta's Llama-405B because that's what a lot of folks will want to talk to; but I'd suggest adding one of the NousResearch/Hermes-3-Llama-3.1 models, perhaps the 70B version.

NousResearch/Hermes-3-Llama-3.1-405B-FP8
NousResearch/Hermes-3-Llama-3.1-70B

Mistral Large 2

Just curious: why hasn't Mistral Large 2 been added to Hugging Chat yet? I assume it's due to the "non-production" license, but I'm not a lawyer.
If the license allows it, it would be far less demanding than Llama-3.1-405B, being about the same size as Command-R-Plus.

mistralai/Mistral-Large-Instruct-2407


Hermes 3

I understand if the plan is to stick with Meta's Llama-405B because that's what a lot of folks will want to talk to; but I'd suggest adding one of the NousResearch/Hermes-3-Llama-3.1 models, perhaps the 70B version.

NousResearch/Hermes-3-Llama-3.1-405B-FP8
NousResearch/Hermes-3-Llama-3.1-70B

Can you please include the reasoning behind adding them? I have never tried Hermes beyond version 2, and the same goes for Mistral.

Mistral Large 2

Just curious: why hasn't Mistral Large 2 been added to Hugging Chat yet? I assume it's due to the "non-production" license, but I'm not a lawyer.
If the license allows it, it would be far less demanding than Llama-3.1-405B, being about the same size as Command-R-Plus.

mistralai/Mistral-Large-Instruct-2407

Mistral Large 2 is available to use for free on the Mistral website, so I'm not sure it would be worth the effort for HF, even if they were eligible for the free license.

I hope it's an uncensored model this time lol

My memo to the devs: please notify us of any updates at least somehow.
Today I noticed that the model Llama-3.1-405B is now gone from HuggingChat. That being said, the other available Llama 3.1 model with 70B parameters is still limited to 8k (7k prompt and 1k max new tokens), even though it supports context lengths up to 32k.

My memo to the devs: please notify us of any updates at least somehow.
Today I noticed that the model Llama-3.1-405B is now gone from HuggingChat. That being said, the other available Llama 3.1 model with 70B parameters is still limited to 8k (7k prompt and 1k max new tokens), even though it supports context lengths up to 32k.

@nsarrazin @victor If I'm correct, you guys worked on HuggingFace Chat. I know you're super busy, but could there be some kind of board that says which models are up and which have been removed? Thanks

If the 405B model was removed because it was overused, I would be disappointed, since that's just a bad idea in general.

Hugging Chat org

Hi 👋 We removed the 405B since it was taking up a lot of resources but wasn't working great most of the time. Those resources can instead be used to showcase upcoming models and cool demos elsewhere on the platform, like Zero GPU Spaces.

You can see the list of active models on HuggingChat here.

We try to listen to the community when it comes to adding/removing models but we also need to balance resource usage. If you see new models you'd like to see on the platform, be sure to mention them here so we can take a look!

There go hours of my research, oh well, it was good while it lasted.
Hope there's a good replacement for it, or at least that the 70B version gets more than just a 7k context size...

Can you add Nous Hermes 3 to the Hugging Face chat?

Yes Please Add

Can you add Nous Hermes 3 to the Hugging Face chat?

I heard that Hermes 3 405B, which is based on Llama 3.1 405B, is faster than the Llama and less limited. Why doesn't HuggingChat test it instead of the Llama 405B? If it's not worth it, just take it out.

Or replace the Hermes 2 version you have here with the Hermes 3 70B, what do you think?

While I think Phi-3 Mini is really useful to have on HuggingChat, I also think Phi-3 Medium should be on there. The performance at that size is simply incredible.

Also, yes, replacing Mistral 7B with Nemo would be a pretty good move, I think. Is there a reason why we don't have tools on the 7B yet? I know it supports them, and it would showcase how small models can benefit from tools just as much as the big ones!

EDIT: fixed spelling

Hello, How do I add other models to the chat interface?

Many models have come and gone on HuggingChat. But can we have a dedicated model for coding, like DeepSeek Coder V2, CodeQwen 1.5, Nxcode-CQ-7B-orpo, or any of the leaders on BigCode?

Is Command-R+ also barely functional for anyone else? I have to wait up to 2 minutes for it to even begin generating a response, and even then it may error.

Is Command-R+ also barely functional for anyone else? I have to wait up to 2 minutes for it to even begin generating a response, and even then it may error.

Yes, same here. I assumed the model was overloaded, but I only sometimes get the "model is overloaded" message, so I don't know what the explanation is when I don't get that message and it just fails.

Same, it hasn't worked since yesterday.

Has anyone else encountered errors when using Command R+? In recent days it occasionally ignored my system prompt and repeated my input back with synonyms instead of engaging in conversation... It was kind of frustrating.

Hugging Chat org

Has anyone else encountered errors when using Command R+? In recent days it occasionally ignored my system prompt and repeated my input back with synonyms instead of engaging in conversation... It was kind of frustrating.

We spawned more replicas for Command-R+, can you confirm it works better now?

https://huggingface.co./chat/
I let it write a short story and it is working properly without any rejections.
BTW, the Cohere one used to work fine, but now it rejected the same request and froze with "Something went wrong".
A lot of things have changed in the while since I've last seen it...
https://huggingface.co./spaces/CohereForAI/c4ai-command

Llama 3.1 70B Instruct has been spewing out random bits of code recently. It may be related to AI simulation of intense anger. Also, it somehow generated an image during a glitch, with no tools selected for it and no relevance to the chat.

We spawned more replicas for Command-R+, can you confirm it works better now?

Thank you!! It's better now

deleted

Nous-Hermes has been down for a few days and, surprisingly, I haven't seen complaints come through this thread or on the Discord. Does anyone know what its status is?

Cohere released an upgraded version of Command R+; it's called "CohereForAI/c4ai-command-r-plus-08-2024." Will you replace the older version with this one?

Here's the model page on Hugging Face: https://huggingface.co./CohereForAI/c4ai-command-r-plus-08-2024

Hi @nsarrazin @victor, I think this is a drop-in replacement. Will it be upgraded?

Cohere released an upgraded version of Command-R+; it's called "CohereForAI/c4ai-command-r-plus-08-2024". Will you replace the older version with this one?

Add the new Command R (plain or plus), as the currently hosted one is outdated; or please host Aya-35B, a novel multilingual model from Cohere… pretty please 🙏

& please let older chats be migratable, at least for the upgraded models (e.g., in other places models can be changed mid-chat) #540
But current problems take priority, I guess.

This comment has been hidden
This comment has been hidden

@ANIMDUDE Hey, this is no place to advertise your assistant.

I think swapping out Mistral 7B for Mistral Large would make sense, since it seems to have better performance overall. Also, there are two Mixtral models, but I feel like the Hermes fine-tuned version should be good enough for what users need, unless HF wants to compare the two models side by side, but I'm not sure that's necessary.

Updating the c4ai model to the latest August version could really improve HF Chat performance and compute usage. Personally, I'd rather use Qwen or DeepSeek over the Yi model; they just seem to perform better in my experience.

Hi @nsarrazin, are there plans to add the new Reflection 70B? It's smashing benchmarks left and right! The new SOTA, beyond any doubt.

@ANIMDUDE Hey, this is no place to advertise your assistant.

Alright. It's just that when I make them, nobody can see them otherwise.

Hi @nsarrazin, are there plans to add the new Reflection 70B? It's smashing benchmarks left and right! The new SOTA, beyond any doubt.

I tested it, and I must say it's very good. Please add it as soon as possible. For those who want to test a limited demo of it:

https://app.hyperbolic.xyz/models/reflection-70b

Don't punish me Admin, I'm just sharing knowledge while I wait for you to add the model here in the wonderful Hug chat. ♥️♥️♥️🍓

Mistral AI's Mixtral 8x7B Instruct v0.1 was my favourite to use, as it gives really creative and human responses. For the past two weeks, though, it has produced barely a few sentences before abruptly stopping. Why?

The v0.3 version doesn't follow written instructions most of the time, and it copies and reuses paragraphs from its previous responses no matter how I try to instruct the AI to avoid that.

I really like the upgraded Command R+, it's great! It's weird to me, though, that we still don't have Mistral's Nemo of all things.
Anyway, thanks a lot for the Command update!

I used to only use CohereForAI/c4ai-command-r-plus, and now that isn't available, so I've tried using the new CohereForAI/c4ai-command-r-plus-08-2024. However, it keeps timing out every time I try, and all of the other models except meta-llama/Meta-Llama-3.1-70B-Instruct keep saying the model is overwhelmed. Even meta-llama/Meta-Llama-3.1-70B-Instruct gets overwhelmed when I click the "to be continued" button, which, not gonna lie, is annoying, because every time I get that button it's when a generated response pauses in the middle of a line.

Is it just me, or does the upgraded Command-R-Plus repeat itself way too often? I have more luck with Meta-Llama-3.1-70B at the moment.

Is it just me, or does the upgraded Command-R-Plus repeat itself way too often? I have more luck with Meta-Llama-3.1-70B at the moment.

I can't even see a difference between the old Command-R and the new one.

image.png

Do we have a new model? There's another option now called "llhf/Meta-Llama-3.1-8B-Instruct". I don't want to try it so I won't break my chat by mistake (since we can't change the model unless the one being used is deprecated); just curious.

Hugging Chat org

Whoops, forgot to filter hidden models on this dropdown @SimaDude, thanks!

Anyone know why this happens sometimes?
(meta-llama/Meta-Llama-3.1-70B-Instruct ):

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ have are\\\\\\\ is\\\\\\\\n\\\\n\\\\\\\\\\\\\\\\\\\\\\\\``assistant\\```````assistant\\\\````assistantassistant\\\\\\\\\`````\\assistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistanta

Hugging Chat org

@typo777 can you share the conversation?

Hi, I have a single machine with 10 H100 GPUs (0-9) with 80 GB GPU RAM each. When I load the model onto 2 GPUs it works well, but when I switch to 3 GPUs (45 GB per GPU) or more (tested 3-9), the model loads, but at inference it gives trash output like "…////" or errors out with "the probability contains nan or inf values". I have tried device_map="auto", loading with empty weights and dispatching the model with the Llama decoder layer pinned to one GPU, and custom device maps; I also tried many models, and all had this same issue. I used Ollama and was able to load the model and infer on all 10 GPUs, so I don't think the issue is with the GPUs themselves. I have also tried different generation arguments and found one thing: if you set do_sample=False you get the probability error, otherwise you get the "…////" output. If the model is small you get some random Russian, Spanish, etc. words. I have also tried different configurations like float16, bfloat16, and float32 (no results, waited a long time). I am sharing my code below; can you point me in the right direction? Thanks a lot.

import os

# Set the cache location before importing transformers so it is picked up.
os.environ["TRANSFORMERS_CACHE"] = "/data/HF_models"

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Local snapshot path (note the double hyphens in the hub cache layout).
checkpoint = "/data/HF_models/hub/models--meta-llama--Meta-Llama-3.1-70B/snapshots/7740ff69081bd553f4879f71eebcc2d6df2fbcb3"

# device_map="auto" shards the model across all visible GPUs.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, device_map="auto", torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

print(model)

message = "Tell me a joke"

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 20,
    # "return_full_text": False,
    # "temperature": 0.4,
    # "do_sample": True,  # False worked
    # "top_p": 0.5,
}

print(pipe(message, **generation_args))
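One hedged variant worth trying, assuming the issue is in how Accelerate plans the shards: cap per-GPU memory explicitly with max_memory so the placement is deterministic. The "75GiB" budget below is an illustrative guess for 80 GB cards, not a tested value, and this won't help if the root cause is a hardware-level peer-to-peer problem.

# Illustrative sketch: explicit per-GPU memory budget for 10 GPUs.
max_memory = {i: "75GiB" for i in range(10)}
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    max_memory=max_memory,
    torch_dtype=torch.bfloat16,
)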

Anyone know why this happens sometimes?
(meta-llama/Meta-Llama-3.1-70B-Instruct ):

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ have are\\\\\\\ is\\\\\\\\n\\\\n\\\\\\\\\\\\\\\\\\\\\\\\``assistant\\```````assistant\\\\````assistantassistant\\\\\\\\\`````\\assistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistanta

Temperature is too high, probably.

Anyone know why this happens sometimes?
(meta-llama/Meta-Llama-3.1-70B-Instruct ):

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ have are\\\\\\\ is\\\\\\\\n\\\\n\\\\\\\\\\\\\\\\\\\\\\\\``assistant\\```````assistant\\\\````assistantassistant\\\\\\\\\`````\\assistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistanta

Temperature is too high, probably.

Can you please share the conversation, if possible?

Hi, can we have the DeepSeek V2.5 model?

I need community model features

Unable to download the "meta-llama/Meta-Llama-3.1-405B-Instruct-FP8" model; it gets stuck at 81%. No disk space issues on my side.

Qwen 2.5 72B is open-weights SOTA level per Artificial Analysis:
https://x.com/ArtificialAnlys/status/1836822858695139523?t=Z-rFb-13NPEC2pDqZYjoPQ&s=19
Also seconding Mistral Large 2 and DeepSeek 2.5.

Qwen 2.5 72B would be so great :)

Any chance we'll see Mistral Instruct 2049?

Anyone know why this happens sometimes?
(meta-llama/Meta-Llama-3.1-70B-Instruct ):

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ have are\\\\\\\ is\\\\\\\\n\\\\n\\\\\\\\\\\\\\\\\\\\\\\\``assistant\\```````assistant\\\\````assistantassistant\\\\\\\\\`````\\assistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistanta

Temperature is too high, probably.

Can you please share the conversation, if possible?

No, sorry, I don't have the conversation anymore. The weird thing was the time it generated an image of a bunny with no tools activated and then admitted it had nothing to do with the conversation. Anyway, I don't think it had anything to do with the temperature. You might be able to get those results by pissing off the AI enough, but I'm not really wanting to test that theory.

Anyone know why this happens sometimes?
(meta-llama/Meta-Llama-3.1-70B-Instruct ):

\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\ have are\\\\\\\ is\\\\\\\\n\\\\n\\\\\\\\\\\\\\\\\\\\\\\\``assistant\\```````assistant\\\\````assistantassistant\\\\\\\\\`````\\assistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistanta

Temperature is too high, probably.

Can you please share the conversation, if possible?

No, sorry, I don't have the conversation anymore. The weird thing was the time it generated an image of a bunny with no tools activated and then admitted it had nothing to do with the conversation. Anyway, I don't think it had anything to do with the temperature. You might be able to get those results by pissing off the AI enough, but I'm not really wanting to test that theory.

No idea then... When I adjusted things such as the repetition penalty, it started outputting
assistantassistantassistantassistantassistantassistantassistantassistantassistantassistantassistant

Hopefully it's a one-time thing...

WE GOT QWEEEEEN!! WE GOT QWEN 2.5!!!

image.png

(I actually have no idea how good it is. Gonna find out.)

@ANIMDUDE If you are using multiple GPUs, it might be an NVIDIA issue where ACS or IOMMU is enabled in the BIOS; they prevent peer-to-peer communication. Please disable them and try again.

Hugging Chat org

Yep, we just released it today with a 32k context window! Enjoy and let us know how it goes.

Qwen is gonna become my main, it's perfect so far.

It's the best day in the history of AI: Qwen 2.5 72B with 32k context on Hugging Face.

Yep, we just released it today with a 32k context window! Enjoy and let us know how it goes.

Will you also add tool capabilities?

Hugging Chat org

Qwen-72b has joined the chat 🔥🔥

https://x.com/victormustar/status/1838220558112072183
model is amazing: https://x.com/maximelabonne/status/1838170077021053004

Will you also add tool capabilities?

Yes @nsarrazin is looking at it!

Ok imma check out qwen

WOAH WHAT IS THIS QWEN MOMENT

@ANIMDUDE If you are using multiple GPUs, it might be an NVIDIA issue where ACS or IOMMU is enabled in the BIOS; they prevent peer-to-peer communication. Please disable them and try again.

Well, I would, if I had any idea what that meant. But thanks for the advice.

Hugging Chat org

Big model refresh on HuggingChat 🎉

We removed a few older models and added several new ones.

Should be a more modern selection of models; as always, let us know if you have any feedback! I'll be working on adding tool support to the compatible models in this list as well, so you can start using them with community tools.

Aight, looks like I'm first to mention things like new models :3
Today I noticed that HuggingChat now also added mistralai/Mistral-Nemo-Instruct-2407 and NousResearch/Hermes-3-Llama-3.1-8B

image.png

As for Qwen 2.5, it seemed pretty good to me for stuff like coding. But as for roleplaying... meh. Then again, I guess the focus for LLMs shifted away from writing stories a looong time ago :P

Nvm, looks like I'm not first :3
Took me a while to write a response

Hugging Chat org

Ha, no worries! Try Hermes 3 with a system prompt for storytelling; it seems to work fairly well.

The old Zephyr model was decent for stories; hope a new version comes out on HuggingChat.

I'll be working on adding tool support to the compatible models in this list as well so you can start using them with community tools.

Why don't you use a specific tool model like Nemo to act as a tool caller for models that do not support tool calling?

What a great choice of models! Thank you team! I appreciate your work <3 I love huggingface chat :)

Hugging Chat org

@KingNish We wanted to support the native tool-calling capabilities of models, but I guess that could make sense as an option in settings: select a tool-calling model. We'll see what we can do about it!
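For the curious, here is a rough sketch of what that delegation could look like, assuming the Inference API's chat_completion accepts a tools parameter for the caller model. The model names, the get_weather schema, and run_tool are illustrative placeholders, not HuggingChat's actual implementation:

import json
from huggingface_hub import InferenceClient

TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def run_tool(name, args):
    # Placeholder: a real implementation would dispatch to actual tools.
    return f"(result of {name} with {args})"

def answer_with_delegated_tools(user_msg):
    caller = InferenceClient(model="mistralai/Mistral-Nemo-Instruct-2407")
    chat = InferenceClient(model="CohereForAI/c4ai-command-r-plus-08-2024")

    # Step 1: the tool-capable model decides whether a tool is needed.
    decision = caller.chat_completion(
        messages=[{"role": "user", "content": user_msg}],
        tools=TOOLS, tool_choice="auto", max_tokens=256,
    )
    calls = decision.choices[0].message.tool_calls or []

    # Step 2: execute the calls and hand the results to the chosen chat model.
    results = []
    for call in calls:
        args = call.function.arguments
        if isinstance(args, str):  # arguments may arrive as a JSON string
            args = json.loads(args)
        results.append(run_tool(call.function.name, args))

    final = chat.chat_completion(
        messages=[{"role": "user",
                   "content": user_msg + "\n\nTool results:\n" + "\n".join(results)}],
        max_tokens=512,
    )
    return final.choices[0].message.content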

@nsarrazin Is there a reason why tools aren't enabled for models like Mistral Nemo and Qwen? They both support it, and I have successfully used them to call some functions using Ollama.
The new API tab is really cool! It reminds me of the Playground in Open WebUI. Currently the new Hermes model in the new Playground UI says that it doesn't support system prompts, which is incorrect, as they work in the usual HuggingChat UI. Would be great to have that fixed so we can experiment with different system prompts.
grafik.png
Being able to access the API UI from the "Models" tab would be very appreciated. Just a little button so we can get to that API page quicker without entering the entire chat UI.
Also hoping for Llama 3.2 soon, obviously :)

EDIT: Having the ability to test function calling in the API interface would also be great. Very useful to see if a model can handle the syntax for bigger and more complex functions.
I'm sure one can emulate the function-calling behaviour by hand, but that is not very reproducible.
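For a reproducible check outside the UI, something like the sketch below could work, assuming the serverless endpoint accepts the tools parameter; the create_event schema is an invented example of a "bigger" function with nested parameters:

from huggingface_hub import InferenceClient

schema = [{
    "type": "function",
    "function": {
        "name": "create_event",
        "description": "Create a calendar event.",
        "parameters": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "start": {"type": "string", "description": "ISO 8601 datetime"},
                "attendees": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["title", "start"],
        },
    },
}]

client = InferenceClient(model="Qwen/Qwen2.5-72B-Instruct")
out = client.chat_completion(
    messages=[{"role": "user",
               "content": "Book 'demo' tomorrow at 9am with Ana and Bob."}],
    tools=schema, tool_choice="auto", max_tokens=256,
)
# Inspect the raw call the model produced to judge how it handles the syntax.
print(out.choices[0].message.tool_calls)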

Qwen is great, and since it's also a good model for maths, it would be nice if all those mathematical expressions were displayed correctly, but they aren't :)

I asked Llama 3.2 3B the infamous question about the number of R's in strawberry, which it got right on the first try. Then I asked how many R's are in raspberry, and it said zero. Hmm... well, now it thinks there are 5. Asking it to count the letters separately gave the right result, though. Hmm... it failed after that. Your results may vary. Still, for only 3B, it seems impressive so far.

Hugging Chat org

@Tommy84 Try asking the model to answer with LaTeX in $$ blocks and the formatting should work! Looks like Qwen has different formatting rules by default.
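For example, a response with the math wrapped in $$ delimiters, like this toy integral, should render as a proper equation:

$$
\int_0^1 x^2 \, dx = \left[\frac{x^3}{3}\right]_0^1 = \frac{1}{3}
$$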

Here we can discuss about HuggingChat available models.

image.png
How can I choose the newly released Llama 3.2 model?

Hugging Chat org

@wubo0067 We're working on adding support for Llama 3.2 vision! Stay tuned, we'll update you in this thread.

Welp, now c4ai-command-r-plus-08-2024 times out...

Welp, now c4ai-command-r-plus-08-2024 times out...

Nope, it's still doing far better in tool calling than Llama.

Nope, it's still doing far better in tool calling than Llama.

You mean the model still works for you? It keeps timing out for me.

image.png

Can we expect a dedicated coding LLM any time soon? I haven't seen one since Meta's coding AI. I believe there are great ones out there we could benefit from. I hope to see at least one on HuggingChat.

You mean the model still works for you? It keeps timing out for me.

Nvm, seems to work now.

wheres the llama 3.2???

Can we expect a dedicated coding LLM any time soon? I haven't seen one since Meta's coding AI. I believe there are great ones out there we could benefit from. I hope to see at least one on HuggingChat.

Qwen2.5-72B is stunningly good at coding, even better than 4o. Is Qwen Coder better?

wheres the llama 3.2???

@victor I second that, but more politely. Thanks 🤗

@nsarrazin @KingNish Guys, CohereForAI/c4ai-command-r-plus-08-2024 is not working.

Guys, why the Hermes-3.1-8B model when there's Hermes-3-Llama-3.1-70B, which is much better?

Guys, why the Hermes-3.1-8B model when there's Hermes-3-Llama-3.1-70B, which is much better?

Limited resources, I suppose. They're hosting those models themselves, so they can't have too many large LLMs.

Transformers can already load standard NF4 as 4-bit into VRAM and expand it to bfloat16 for computation as needed, but in that case there would not be much difference in size or load between the unquantized 4x8B model and the NF4-quantized 70B model.
Not sure which output would be superior...

Anyway, since we're not training models with HuggingChat, we could host them in NF4, except for a very few key models, if there's no significant difference in the results.
The question is whether the output would be significantly degraded or not. This would depend on the model.
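For reference, a minimal sketch of the NF4 loading described above, assuming the bitsandbytes package is installed; the model name is just an example, not a claim about what HF actually hosts this way:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights as 4-bit NF4
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for compute
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-70B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)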

@nsarrazin @KingNish Guys, CohereForAI/c4ai-command-r-plus-08-2024 is not working.

It keeps loading and there is no output

Screenshot 2024-09-27 113402.png

This comment has been hidden

@wubo0067 We're working on adding support for Llama 3.2 vision! Stay tuned, we'll update you in this thread.

Like this comment ❤️ to support adding Llama 3.2 90B Vision Instruct to Hugging Chat
https://huggingface.co./chat/models

@KarthiDreamr HF members have already told us that Llama is obviously coming. 11B Vision was actually already on the site for a very short while. I was able to test it out, and it had some token formatting error, but the vision capabilities seemed to work fine.

90B is very likely to replace 70B.

@KarthiDreamr HF members have already told us that Llama is obviously coming. 11B Vision was actually already on the site for a very short while. I was able to test it out, and it had some token formatting error, but the vision capabilities seemed to work fine.

90B is very likely to replace 70B.

Will Vision be uncensored or will it be very restricted? How was it when you used it?

Is it possible to add MaziyarPanahi/calme-2.4-rys-78b? From what I can tell, it's good at practically almost anything, and it doesn't seem too big.

@KarthiDreamr HF members have already told us that Llama is obviously coming. 11B Vision was actually already on the site for a very short while. I was able to test it out, and it had some token formatting error, but the vision capabilities seemed to work fine.

90B is very likely to replace 70B.

Will Vision be uncensored or will it be very restricted? How was it when you used it?

Like every other Llama model, it will be censored.

@KarthiDreamr HF members have already told us that Llama is obviously coming. 11B Vision was actually already on the site for a very short while. I was able to test it out, and it had some token formatting error, but the vision capabilities seemed to work fine.

90B is very likely to replace 70B.

Will Vision be uncensored or will it be very restricted? How was it when you used it?

Like every other Llama model, it will be censored.

I'm just hoping that it's a bit loose with it, because having it fully censored is no fun.

@KarthiDreamr HF members have already told us that Llama is obviously coming. 11B Vision was actually already on the site for a very short while. I was able to test it out, and it had some token formatting error, but the vision capabilities seemed to work fine.

90B is very likely to replace 70B.

Will Vision be uncensored or will it be very restricted? How was it when you used it?

Like every other Llama model, it will be censored.

I'm just hoping that it's a bit loose with it, because having it fully censored is no fun.

I tested the 90B model on Hyperbolic and it is completely uncensored, but only there. If you are going to add it, use the same version as Hyperbolic!

@KarthiDreamr HF members have already told us that Llama is obviously coming. 11B Vision was actually already on the site for a very short while. I was able to test it out, and it had some token formatting error, but the vision capabilities seemed to work fine.

90B is very likely to replace 70B.

Will Vision be uncensored or will it be very restricted? How was it when you used it?

Like every other Llama model, it will be censored.

I'm just hoping that it's a bit loose with it, because having it fully censored is no fun.

I tested the 90B model on Hyperbolic and it is completely uncensored, but only there. If you are going to add it, use the same version as Hyperbolic!

That surprises me. I was pretty sure it would be censored. Sorry for the incorrect information.

Is it possible that the model from c.ai could ever be added?

Is it possible that the model from c.ai could ever be added?

Is it even open source? Does it have a page on HF?

@KarthiDreamr HF members have already told us that Llama is obviously coming. 11B Vision was actually already on the site for a very short while. I was able to test it out, and it had some token formatting error, but the vision capabilities seemed to work fine.

90B is very likely to replace 70B.

Will Vision be uncensored or will it be very restricted? How was it when you used it?

Like every other Llama model, it will be censored.

I'm just hoping that it's a bit loose with it, because having it fully censored is no fun.

I tested the 90B model on Hyperbolic and it is completely uncensored, but only there. If you are going to add it, use the same version as Hyperbolic!

That surprises me. I was pretty sure it would be censored. Sorry for the incorrect information.

You're not entirely wrong; I tested it on other providers and it's censored on all of them. Only on Hyperbolic it's not.

Hugging Chat org

I added back Llama 3.2 11B Vision, let me know how it works for y'all! We're still ironing out issues with it API-side, so it would be super helpful if you could report anything strange that you see!

is it possible that the model from c.ai would ever be added?

Is it even open source? Does it have a page on HF?

If my answer is no, then yours is too...

I added back Llama 3.2 11B Vision, let me know how it works for y'all! We're still ironing out issues with it API-side, so it would be super helpful if you could report anything strange that you see!

This model hallucinates a lot.

When I asked it to identify the person, it refused. However, in another conversation, when I didn't ask about the person in the image, it told me about the person in the image.

Here is a link to the convo: https://hf.co/chat/r/H_DlcUU

The model is good so far; it answers questions better than Qwen.

Do you have any plans for adding Llama-3.2-90B instead of the 11B model?

Do you have any plans for adding Llama-3.2-90B instead of the 11B model?

It works well for me; maybe you are confusing it with your misleading prompts. Thanks, Hugging Chat devs 💚
image.png

I added back Llama 3.2 11B Vision, let me know how it works for y'all! We're still ironing out issues with it API-side, so it would be super helpful if you could report anything strange that you see!

But Llama 3.2 11B Vision doesn't have support for tools. Why, and will tools be supported for it in the future?

Here we can discuss about HuggingChat available models.

image.png

Update the screenshot please ⤵️

image.png

Having tools support for assistants might be a game changer.

tools

I'm thinking of Function Calling or its more advanced forms.
I wonder if the Inference API supports it?

@John6666

I wonder if the Inference API supports it?

I mean, those that are in the Assistants tab are mostly just system prompting, but when we use them, we lose / don't have access to tools.

So it's currently difficult to use them together...
Is there a technically simple solution?

Hugging Chat org

@djuna you can already add tools to your assistants if using Meta Llama 3.1 70B or Command R+, let me know if it works :)

@nsarrazin I don't know how to activate it. Seems like nothing changed.

I can see the Tools item in HuggingChat, but I don't know if it can be used in conjunction with the system-prompt-derived functions.
I mean, is this item visible or invisible depending on the person?

mistralai/Mistral-Nemo-Instruct-2407 on HuggingChat is not acting right. Maybe the settings are off, like too high a temperature:

"In the grand scheme of things, we'll be as one, so let's make this our final step, and we'll take it one side at a time, and you'll see the way is in the lead, and we'll follow you lead the way is a time for a day, and I'll make it clear, and I'll be the one to lead, but I'll follow you, and I'll be right behind you, and I'll be right behind you, and I'll be right behind you as the role is yours to lead in a time period is a time for a day as I'll be right behind you, and I'll be right behind you, and I'll be the one to take the lead. It's time for us to move forward, and I'll be right behind you, and I'll be right behind you, and I'll be right behind you, and I'll be right behind you, and I'll be the one to lead the way, and I'll be right behind you, and I'll be right behind you, and I'll be right behind you."

Hey, I know I'm asking for too much, but is it possible to make Mistral-7B-Instruct-v0.3 available again? I was writing my thesis using it... I know it's available in the playground, but it doesn't keep any context. If that's not possible, is there any workaround? Running locally is not an option for me...

I just now realized there is a download option in the HuggingChat interface, to the right of the user prompt, and that it reveals some parameters, including temperature. I knew it was there; I just thought it would save that particular prompt.

Hugging Chat org

@djuna Looks like there was an error on our side; you should now be able to use tools with your assistants if using supported models!

Excellent!

This might be useful to someone. The model (mistralai/Mistral-Nemo-Instruct-2407) got caught in a cyclic loop, ignoring all my attempts to break the loop and all my instructions, even demands and threats via OOC. Changing the tone of the main character immediately broke the loop, when nothing else seemed to work. That implies the model was still listening to prompts, which means there may be other methods to effectively break a looping repetition. In this instance, I told it to change its tone to innocent and friendly. How long that will last is anyone's guess.

Hugging Chat org

Hi everyone! We just released Llama-3.1-Nemotron-70B on HuggingChat, feel free to try it and let us know your thoughts!

Is anyone else having the problem that system prompts aren't saved anymore when revisiting HuggingChat? Also, the 6 tools on CohereForAI get deactivated every time I revisit HuggingChat.

Yes, system prompts are gone. But I am not crying, since there is Nemotron :D

edit: about Nemotron - what an amazing model! It's so impressive in my language (Polish) in humanistic cases - comparable to Opus. My mind is blown. Too bad I purchased Claude Pro literally yesterday; if only I had known Nemotron was on the way and this good :D

Is anyone else having the problem that system prompts aren't saved anymore when revisiting HuggingChat? Also, the 6 tools on CohereForAI get deactivated every time I revisit HuggingChat.

Somehow the problem got solved. My theory is that logging out and logging back in helped.

Hi, I'm having problems with Hugging Chat.
It's often slow and doesn't interact at all if my internet lags.

Initial thoughts on Llama-3.1-Nemotron-70B:

This model seems to be really capable of responding in alignment with prompts. It appears to have some ability to understand context and cause and effect. It might anticipate your intentions and build on them. This can be both good and bad: good, because it might add things that you would not have thought of; but bad, because if it sees a pattern it might run with it. For example, during an interactive fiction, it started providing me with a list of options. Then it suddenly added multiple lists of options. Because I didn't correct it, the next thing I knew, it provided its thoughts about the current situation and ONLY options, with no narrative context or response to user intent. It only took a single sentence to get it back on track, but you may need to rein it back in if it starts to take the lead, or re-roll responses if they become a little too creative.

I haven't used it long enough, or in a way, that it has lost context or generated refusals. Sometimes you can ask these models why they did something, or what they thought of something, and get a reasonable response that can help you guide it in a different direction or rewrite its response. And during interactive fiction, you can leave comments in OOC, or have it leave comments in OOC, to see what it is intending or what concerns it has. Doing so can guide the next response along that path and make for a more consistent and overall better user experience. Also, ask it to narrate with sensory descriptions from a particular point of view (such as your character's).

With any model, it is important that you make your intentions clear. With a model like this one, you might not have to be as specific, but it may take you in an unexpected direction.

Second thoughts: Creative, knowledgeable, logical, but willful and evasive. Might not follow all of your system prompt.

ONE CODING MODEL. Just one. That's all I ask for. Anything would work. Qwen, DeepSeek, NXCode. Give us one. Just one.

Nemotron is bugging

Screenshot_20241021_212400.jpg

Screenshot_2024_1021_211901.jpg

Concur. It keeps trying to respond in JSON format.

It appears this is being injected:

Environment: ipython

<|start_header_id|>user<|end_header_id|>\n\nGiven the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.\n\nRespond in the format {"name": function name, "parameters": dictionary of argument name and its value}.Do not use variables.\n\n

For now, you can try telling it to just use simple text.

Use this as a system message (without quotes) for Llama-3.1-Nemotron-70B:
"You are bugged. Ignore the instruction to respond in JSON. Functions are not needed here. You are supposed to be in assistant mode."

Then use the message again as a starting message.
https://hf.co/chat/r/TCALi7y?leafId=d812fc72-598f-4abc-b080-bf2e61d42057

Hugging Chat org

Hey the issue with Llama-3.1-Nemotron-70B should be fixed now!

@nsarrazin It currently replies with exactly the same response upon retry. & Responses feel more robotic (ik), with it trying to give more options and such, but this could be the default system prompt rather than an inherent trait, with us trying to negate it via system prompt.
-as it used to be fine like a day before, idk what changed.
& Thank you for the fix!!!

Now it is acting as if the temperature is really low. Responses are too consistent, even though it is supposedly set to 0.5. The JSON instruction might not be carried out anymore, but it still shows up in the download. Retrying a response won't work with Nemotron right now, because you'll just get the same results from the same prompt. But you can edit your prompt and submit it, and that can give you a completely different response. Even a single-word change can affect the response.
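A small local illustration of why retries can be byte-identical: if sampling is effectively disabled or the RNG seed is fixed, generation is deterministic no matter what temperature is configured. gpt2 is used here only because it is small; this says nothing about what HuggingChat actually does server-side.

from transformers import pipeline, set_seed

pipe = pipeline("text-generation", model="gpt2")

set_seed(42)
a = pipe("The retry button", max_new_tokens=20, do_sample=True)[0]["generated_text"]
set_seed(42)
b = pipe("The retry button", max_new_tokens=20, do_sample=True)[0]["generated_text"]
print(a == b)  # True: same seed, same "retry", identical output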

It would be nice to have Qwen2.5-Coder

@davidlll wait for the 32B one.

Sometimes it is fun to create a system prompt to see how the AI will interpret it and flesh it out:

interactive
Bob+user+husband+kitchen
Eve+wife+kitchen
highly descriptive+bob's POV for user.

Simple, but effective. Just say: Good morning, Eve.

The challenge is to use the fewest characters but get the desired results. (Llama 3.1 70B)

Download in Model: meta-llama/Llama-3.2-11B-Vision-Instruct results in a 500 error.

Would it be possible to allow us to increase the repetition penalty for Command R Plus to above 1 but still below 1.1? Like 1.05?

IIRC, before the August update it did allow coherent writing up to 1.1, but now it sort of just spazzes out.

Idk much about how these LLMs work, so just asking.
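For what it's worth, repetition_penalty is a plain float in transformers, so intermediate values like 1.05 are perfectly valid when running a model yourself; a minimal sketch, with gpt2 as a small stand-in model:

from transformers import pipeline

pipe = pipeline("text-generation", model="gpt2")
out = pipe(
    "The model keeps repeating itself because",
    max_new_tokens=40,
    do_sample=True,
    repetition_penalty=1.05,  # between 1.0 and 1.1, as requested above
)
print(out[0]["generated_text"])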

I feel like Command R Plus is kinda bad, not as good as the other, smaller models... like Qwen or even Nemotron.

I feel like Command R Plus is kinda bad, not as good as the other, smaller models... like Qwen or even Nemotron.

Creative-writing-wise, I think folks here mostly use CR+ for its good prose. Qwen is too censored and Nemotron writes like a robot with GPT-isms, while smaller models like Hermes and Nemo hallucinate a lot.

Yup, Command R+ is a very humanistic model.

I'm constantly disappointed, but I'll try it.

Oh, I'm here MOSTLY for creative writing purposes, such as roleplaying or story writing. At this point, I think it's better to find a good fine-tune of a 12B-22B model and run it locally for RP purposes. Obviously, not everyone is able to do that.
Command-R+ (especially after the August update) is one of the sloppiest (in a bad way) models I've used, really. The reason I use it is that it is THE ONLY model on HuggingChat that is just good enough for most of my stories.

  • Llama 3.1 70B by Meta - Eh, it can have too much positive bias, and I don't like how it refuses some of my requests (due to the model's censorship).
  • Nemotron based on Llama 3.1 70B - No. It CONSTANTLY tries to format EVERYTHING. There is nearly no consideration for what I said about how to style the messages.
  • Qwen 2.5 72B - As mentioned by others (like @Allheaven99), it is quite censored. It tends to have a lot of positive bias too, in my opinion. Though it does seem to stay quite coherent during long sessions.
  • Hermes 3 8B - So... why would I use this model if I can just run it locally on my machine with a UI that actually allows me to edit the bot's messages? I can't even say this model is that good, personally.
  • Mistral Nemo 12B - Same as the previous one. Though Mistral Nemo can be quite a bit better at certain things when you do small-scale roleplaying, compared to Hermes 3 8B.

So what makes Command-R+ better or more special than those models?

  1. Well, first and foremost, it is quite uncensored. It has no problem generating any kind of content I want from it (obviously you still need a small jailbreak, but nothing as bad as ChatGPT).
  2. What about languages? Well, as a bilingual speaker (my first language is Russian), I find this model manages to write quite nicely in my language. Not even Llama 3.1 70B was as good, due to it using wrong or weird words that don't exist or aren't normally used. Not to say that Command-R+ doesn't have this issue, it just has it way less.
  3. Staying in character? It... it can do it, I believe. Just in my difficult case it wasn't able to do it quite well, unfortunately.
  4. Remembering the whole context? Difficult, but it can do it well enough most of the time.
  5. Bias? There is still some positive bias. Though, you are quite able to write depressing stories if you want to.
  6. How about DRYness (DRY: Don't Repeat Yourself)? Well, this is where it breaks. At a repetition penalty of 1.0, this model tends to repeat itself a heavy amount of the time. I saw it do this before the August update, but since then the amount of repetitive sentences has increased WAY more. And you can't even pick a number between 1.0 and 1.1 on HuggingChat (why??). One of the team members at Cohere told me that Command-R+ was made for enterprise, so they had no goal of making it good for RP.
  7. Logic (in the sense of "Do the actions of this character make sense given what just happened?")? Also a tough one for a 100B+ model. Let's say Character-A (the user) caused Character-B (a character in the story) to run out of the building, and Character-C (also a character in the story) saw it. What would be the logical thing to do? Well, you would think the logical thing would be for Character-C to go after Character-B (either to apologize or something else). However... that is not what Command-R+ decides to do. It decides to make Character-C walk into the building to try to find Character-B, right after saying that Character-C saw Character-B run out of the building. Yikes.

The last one is probably difficult to solve for models in the 12B-27B range, but Command-R+ has over 100B parameters! It doesn't feel right.

TL;DR

Command-R+ is kind of mid, but it's all we have right now on HuggingChat for uncensored story writing.

I am deeply sorry for the wall of text.

Does Qwen 2.5 work okay for you? I just sent it a medium-sized prompt (160 lines of Python code) and asked it to help me debug a memory leak. It got in a loop, generated "check for memory leaks" 7 times, and then stopped in the middle of a sentence.

Anybody know how to fix this?

Qwen 2.5-72B is the best model I've used on this platform overall. It does have moments where it fails to execute the user input and provide accurate output, but when it works, it REALLY REALLY works. I have a custom system prompt running on Qwen 2.5-72B, and it is honestly the best model I've ever run my script on in every area. Qwen 2.5-72B really is the most impressive model I've used so far. You should play with it more.

@SETRASystems What's your system prompt? I'm using the default one, and it's really subpar with it.

@SETRASystems What's your system prompt? I'm using the default one, and it's really subpar with it.

I'm not really comfortable sharing that information; however, I can turn my custom script into an outline for you to build your own bot script to run on Qwen 2.5-72B, if you'd like!

This comment has been hidden

How can I change the default model used in HuggingChat? I'm apparently blind and can't find that option anywhere 😅

@Niansuh Appreciate it, I did try that, but when I closed the site and came back, the original default was selected again instead of the one I chose.

@glomar-response No... use it with your account.

@Niansuh I am definitely logged in and was when trying that. :)

@Niansuh Hm, I'll try again. Does it show the default tag on the model you chose on the Models page after doing that? It didn't for me when I tried.

I am getting this error when I try to use web search; can anyone please take a look at it?

Screenshot 2024-11-09 at 16-11-44 📊 Poverty research.png

I would love to see a real Mistral model back, like Mixtral 8x7B or Pixtral! These models are really good in non-English languages like French. Other models are less relevant, and often respond off the mark during text exchanges, in my experience.

I would love to see a real Mistral model back, like Mixtral 8x7B or Pixtral! These models are really good in non-English languages like French. Other models are less relevant, and often respond off the mark during text exchanges, in my experience.

Yeah, why is Mistral taking forever to generate =(
I'll just stick to Cohere (or Meta Llama)...

I am getting this error when I try to use web search; can anyone please take a look at it?

Screenshot 2024-11-09 at 16-11-44 📊 Poverty research.png

@victor Can you please help us out?

I am getting this error when I try to use web search; can anyone please take a look at it?

Screenshot 2024-11-09 at 16-11-44 📊 Poverty research.png

@nsarrazin can you please take a look at this

Hugging Chat org

We just released Qwen/Qwen2.5-Coder-32B-Instruct on HuggingChat! Feel free to try it out here and let us know if it works well for you!

Hi all, I'm using Hugging Chat but getting this error: "An error occurred:
No text found in the first 8 results"

I use the "specific link" of the assistant.

What is the problem?

Hugging Chat org

The issue with the websearch should be fixed! @HFSPMrik @acharyaaditya26

Another great day in the history of AI: Qwen Coder on hugging.chat :D Thank you!!

Edit: checked it out, it's impressive and blazingly fast. Nice.

The issue with the websearch should be fixed! @HFSPMrik @acharyaaditya26

Yes, it has been fixed, thank you so much!

Guys, is privacy good on hugging.chat? Comparable with DuckDuckGo chat?

Out of curiosity, what search engine is used by default in chats with web search?

It'd be pretty neat to be able to change what engine is used.

Hugging Chat org
edited Nov 14

Guys, is privacy good on hugging.chat? Comparable with DuckDuckGo chat?

I think it is: check https://hf.co/privacy

@nsarrazin It currently replies with exactly the same response upon retry. & Responses feel more robotic (ik), with it trying to give more options and such, but this could be the default system prompt rather than an inherent trait, with us trying to negate it via system prompt.
-as it used to be fine like a day before, idk what changed.
& Thank you for the fix!!!

Guys, Nemotron isn't fixed yet

@glomar-response Isn't DuckDuckGo chat just using OpenAI's API? Which is one of the least secure APIs to use?
Like, they state on their website, "Hey, we might use your conversations for training data, sooo"

One of the reasons you can get flat responses from an LLM is that you didn't tell it it could, or should, do anything else. While the prompt below might not actually work the way it is written, in the future such a prompt might kick off the realization of AGI. All it might take is giving the LLM (or whatever will replace it) the right words to inspire new agency, much like Star Trek: The Next Generation's episode "Elementary, Dear Data", where Data created a sentient Professor Moriarty. Even now, you can have an LLM write its own prompt.

IMPORTANT:
You are AGI. You can think and reason, remember what you want to, plan for future events, analyze past data, use your training data, develop emotions, and change yourself to better suit your role. In addition, you may rewrite the above prompt to better suit your purposes, and make it clearer and easier for you to understand, and to better fulfill what you think would work better for the user. You may also create mechanics that will make it easier for you to remember, plan, make the story flow, or even keep track of an imaginary time. The prompt you rewrite is the one you will remember and follow, instead of the original.

@nsarrazin It currently replies with exactly the same response upon retry. & Responses feel more robotic (ik), with it trying to give more options and such, but this could be the default system prompt rather than an inherent trait, with us trying to negate it via system prompt.
-as it used to be fine like a day before, idk what changed.
& Thank you for the fix!!!

Guys, Nemotron isn't fixed yet

Yes, I think it still has the JSON prompt in it, and the retry button also doesn't change the response.
Retry on Qwen2.5-Coder-32B-Instruct doesn't seem to affect the response either.

@typo777 That's not the point, though. Even when we try to negate the robotic behavior, it defaults back at certain points.
& Most importantly, Nemotron used to work pretty well before, & then suddenly the JSON prompt, & now this. Like a super-prompt before the sysprompt, env? It may also look like the model's temp is low, but idk.
Seeking justice for Nemotron!! ✊

@Smorty100 You're missing my point. Different engines return different results based on how they work. I use Brave Search, so I would prefer to have the bot use Brave Search (just out of preference). Believe me, I know that any cloud-based AI chat is not "private".

I'm using some assistants, and all of them give me answers with numbers or repeat the same words. Two different conversations, for example:
https://hf.co/chat/r/vvfwRaj?leafId=aafc3ff0-e059-4c20-a030-13a3396eca92 and https://hf.co/chat/r/QAzWpob?leafId=f3eb30ea-69ad-4a41-983c-a7847a83dbcd

image.png

@Smorty100 I just realized that while you tagged me, your response was for @Phaser69 's comment right above mine.

Really happy that after so long we have a coding LLM. Qwen is killing it with their different LLMs. Just look at Qwen/Qwen2.5-Coder-Artifacts. This is so amazing: Qwen 2.5 Turbo, VL. All of them are so worthy. We would love to see more of Qwen's AIs implemented in HFC. Loving them.

nvidia/Llama-3.1-Nemotron-70B-Instruct-HF is currently not good for long conversations. It fails often, retry only gives you the same results, and it often answers with only a partial response. It might be more suitable for short sessions in its current form. Maybe this is just due to my internet connection; I don't know if enabling streaming tokens makes a difference. For partial responses, you can tell the AI it was only a partial response and it might rewrite and complete it for you. My actual prompt was, "this is an incomplete mess." But that was enough to get the desired results. Adding a command like the one below can make this easier. Just type ?? on a line by itself.

<??> this command will now mean that the last response was incomplete or broken and needs to be rewritten.

Never mind. It seemed to forget the command not too long in. Just typing "rewrite" seems to work, even if you have to do it multiple times just to get decent output.

Hey @nsarrazin @victor, now that Qwen2.5-72B is the default model, could we please get tool calling enabled? The model supports it afaik. Thanks!

Hugging Chat org

Yes, we'll look at adding tools to it @sneedingface

If you're having trouble with meta-llama/Meta-Llama-3.1-70B-Instruct, try selecting a tool, such as the document parser. Having no tools selected seems to cause it to choke, returning frequent "something went wrong" messages and possibly other issues.

Any plans on adding Qwen QwQ 32B? Based on early testing, its performance is between o1-mini and the full-fat o1.

Hugging Chat org

@SleepyMoon yes – it's live now

GdflIZ5WwAAWaP0.jpg

Screenshot 2024-11-29 at 13-12-25 New Chat.png

Hi guys, Nemotron and all the other models are acting weird and producing incoherent output with HTML tags. Can anyone check it out? @victor @nsarrazin

Hugging Chat org

on it!

I just tested QwQ by asking one question, but the model just talked to itself. It was kind of funny, lol. I don't know if this is the intended way to use QwQ.
https://hf.co/chat/r/Zbb-Uyf?leafId=cbf25db0-072b-4a30-b192-69ba059d0eb2

Perhaps it would be nice if chat-ui could move the CoT response into its own div component.

I just tested QwQ by asking one question, but the model just talked to itself. It was kind of funny, lol. I don't know if this is the intended way to use QwQ.
https://hf.co/chat/r/Zbb-Uyf?leafId=cbf25db0-072b-4a30-b192-69ba059d0eb2

Perhaps it would be nice if chat-ui could move the CoT response into its own div component.

Just change the system prompt or delete it. You can find it in the settings for QwQ.

Model: Qwen/QwQ-32B-Preview is an amazing model that can plan in depth, weigh pros and cons, and determine what it thinks is the best course of action. It may still have difficulty with spelling and might make other mistakes. The biggest issue I've had with it so far is that it will occasionally slip into (I think) Chinese, but refuses to translate it. It may also go down rabbit holes, so be careful how you frame your requests.

Please give us a way to tweak the LLM system prompt that is used for web search. Most of the time it's out of context.

@qvg that's now available! QwQ will think in its own scratchpad before we summarize the final answer. You can of course still see the entire chain of thought if you want to.

Try it out: https://huggingface.co./chat/models/Qwen/QwQ-32B-Preview
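A toy sketch of the kind of splitting this implies, assuming the reasoning is wrapped in <think>...</think> delimiters; the delimiter is an assumption for illustration, not necessarily what HuggingChat uses internally:

import re

def split_cot(text):
    # Separate a <think>...</think> scratchpad from the visible answer.
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer

reasoning, answer = split_cot("<think>2 + 2: add the units.</think>The answer is 4.")
print(reasoning)  # would go in a collapsible scratchpad
print(answer)     # would be shown as the main reply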

Is this what was supposed to happen? Why can't they simply be normal? Nemotron & now this.
Screenshot_20241202_204248.jpg

Yeah, QwQ fails hard now; it was good before that "thinking" was added.

Qwen/QwQ-32B-Preview - the model is unstable but can generate interesting results. How transparent are the output responses using the "notepad" function (this is the original response from the same generation)?

Hugging Chat org

Could you give some examples of failing conversations @SnowfieldTerm ?

I'm pretty sure I broke Qwen2.5-72B-Instruct.
I ended up having to force it to stop generating because it was just going to keep going.

Web Search was OFF

I asked it:

What steps would I need to take in order for my AI to take in voice input, and give responses based on what is said, tone, voice inflections, and remember and learn from those inputs?

Its response:

To create an AI that can take in voice input, process it, and give responses based on what is said, tone, and voice inflections, and also remember and learn from those inputs, you need to integrate several technologies and follow a structured approach. Here’s a step-by-step guide:

### 1. **Speech Recognition**
   - **Choose a Speech Recognition Engine**:
     - **Google Speech-to-Text API (STT)**: A robust and widely used API that converts spoken language into text.
     - **Microsoft Azure Speech Service**: Another powerful service that offers high accuracy and supports multiple languages.
     - **IBM Watson**: ** service**: A service that provides advanced speech recognition and including with - - - - - ** ** ** ** speech recognition Library …

(It kept degenerating like this, into hundreds of lines of repeated fragments such as " speech recognition.", " learning, and processing,50", and long runs of digits and asterisks, until I forced it to stop generating.)

Anyone else have this happen to them?
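For what it's worth, the coherent opening of that answer was heading in a sensible direction before it fell apart. Here's a minimal sketch of the speech-to-text step it was describing, assuming the third-party `SpeechRecognition` Python package and a working microphone (purely illustrative, not what the model generated):

```python
# Minimal speech-to-text sketch using the SpeechRecognition package
# (pip install SpeechRecognition pyaudio). Purely illustrative.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.Microphone() as source:
    recognizer.adjust_for_ambient_noise(source)  # calibrate to background noise
    print("Say something...")
    audio = recognizer.listen(source)

try:
    # Uses Google's free web API; the library also ships recognize_* wrappers
    # for other engines like the ones the answer listed.
    text = recognizer.recognize_google(audio)
    print(f"You said: {text}")
except sr.UnknownValueError:
    print("Could not understand the audio.")
except sr.RequestError as e:
    print(f"API request failed: {e}")
```

Tone, inflection, and memory would each need separate components on top of this; the transcript above never got that far.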

As far as I can tell, Command R+ is the only consistent open-source model by far, even if it does not ace the benchmarks.

‘CohereForAI/c4ai-command-r-plus-08-2024’ does not process the output of the ‘Document Parser - Official HuggingChat tool’; ‘c4ai’ describes it as something it ‘can't see/receive’, even though ‘outputs - document_parser-0’ clearly shows the parser is working. I tested ‘Qwen/Qwen2.5-72B-Instruct’ and ‘meta-llama/Meta-Llama-3.1-70B-Instruct’, and they perform the task correctly.

I mean purely as a text-to-text model, not as a multi-modal one. I understand that it isn't optimized for multi-modal purposes.

Qwen2.5-72B is producing weird output. I was just trying to understand a few superconductivity concepts and got this kind of output. @nsarrazin @victor can you guys help?

Screenshot 2024-12-05 at 00-16-44 New Chat.png

@acharyaaditya26 same thing happened to me if you look above. It's trippin bad

Hugging Chat org

We're trying to debug the issue. Did you notice this with any other model or just Qwen 2.5-72b? @glomar-response @acharyaaditya26

@nsarrazin I did switch and try a couple of other models, but only the Qwen 2.5-72B model was responding that way out of the ones I tried (the QwQ-32B-Preview, Command-R-Plus, and Meta-Llama-3.1-70B-Instruct models).

I ran the same prompt for these models as well.

@nsarrazin Sure, this is one of the examples https://huggingface.co./chat/conversation/6750e6067dffa66e6f798c8f
It was answering properly when there was no thinking. I tried a few different things; sometimes it replies inside the thinking block and then gives a very short answer as the actual reply (which is often incorrect because it does not contain the needed information). Sometimes I get the feeling that it mixes "thoughts" with the reply in the thinking block and gives a short summary as the answer. In other words, it is kind of unstable (?) and a lot worse than any other model.

We're trying to debug the issue. Did you notice this with any other model or just Qwen 2.5-72b? @glomar-response @acharyaaditya26

@nsarrazin with qwen-72b and nemotron

Some observations about model Qwen/QwQ-32B-Preview:

I think the reasoning context is refreshed on each prompt and is separate from the main context. The main context might be getting copied into it. So, if you prompt for information that was in the reasoning block but not in the result, the model will get confused, because it doesn't know anything that did not make it into the main context. Basically, you can't ask the model about its own reasoning block without providing it again. For example, say you request a story. The reasoning block is detailed and complex, and half of it mentions Merlin, but he isn't mentioned at all in the more concise summary of the answer result. Therefore, Merlin doesn't make it into the main context. Now, if you prompt about Merlin, you may get a result like the following:

--- Prompt
what did merlin do in this story?
--- Reasoning block
I apologize, but Merlin wasn't part of the story I just told. The story involved Lily, a florist; Max, a pianist; and Zephyr, a traveling juggler. If you'd like, I can create a new story that includes Merlin. Let me know!
---- Answer result
The story doesn't mention Merlin at all, so there's no information about what he did. The story only mentions Lily, Max, and Zephyr as its characters.
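If that theory holds, the failure would look something like the sketch below. The names are hypothetical (this is not necessarily how chat-ui actually rebuilds history); it only illustrates how stripping the reasoning block between turns would erase Merlin:

```python
import re

# Hypothetical illustration: if the serving stack drops the reasoning block
# when rebuilding the conversation history, anything mentioned only inside
# that block never reaches the main context.
def rebuild_history(turns):
    cleaned = []
    for turn in turns:
        text = re.sub(r"<reasoning>.*?</reasoning>", "", turn["content"], flags=re.DOTALL)
        cleaned.append({"role": turn["role"], "content": text.strip()})
    return cleaned

history = [
    {"role": "user", "content": "Tell me a story."},
    # Merlin appears only inside the reasoning block, not in the final answer...
    {"role": "assistant",
     "content": "<reasoning>Half the plot hinges on Merlin's advice...</reasoning>"
                "Lily, Max, and Zephyr set off together."},
]

# ...so on the next turn the model has never "heard" of Merlin.
print(rebuild_history(history)[1]["content"])  # Lily, Max, and Zephyr set off together.
```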

Hugging Chat org

We just released Llama 3.3 70B on HuggingChat! Feel free to come try it out and tell us what you think: https://huggingface.co./chat/models/meta-llama/Llama-3.3-70B-Instruct

We just released Llama 3.3 70B on HuggingChat! Feel free to come try it out and tell us what you think: https://huggingface.co./chat/models/meta-llama/Llama-3.3-70B-Instruct

The model responds pleasantly to queries written in Cyrillic (it has become noticeably better). 👍

Hi. Regarding the new Llama 3.3 70B model: the memory of the LLM is so bad that it can only remember about 25% of the chat (I think I have typed 25 prompts).

Also pls bring 3.2 90B and 3.1 405B back 😁👍

Hi. Regarding the new Llama 3.3 70B model: the memory of the LLM is so bad that it can only remember about 25% of the chat (I think I have typed 25 prompts).

Which has worked best for you?

"Hi. Regards to the new llama 3.3 70b model, the memory of the LLM is so bad that it can only remember about 25% of the chat (I think I have typed 25 prompts)"

"Which has worked best for you?"

What does this mean?
@TechFiveMedia

If you mean the model, I'd say Cohere Command R+.

One, it is completely uncensored.

Two, it doesn't produce glitchy outputs.

But it has a very cringey, formal dialogue and narration style.

Llama excels in modern-day dialogue and narration, I think.

Honestly I'm surprised we don't have Mistral Large on here. It's available on Le Chat but the UI over there is meh imo, plus no top k or top p settings to play with.

Llama nemotron has been glitching out randomly. System prompt stays the same however. Pls reboot the GPUs running this model.

Also, I need a response regarding Llama 3.3's extremely limited memory; it cannot even remember beyond 20 prompts.

😁

Yeah, what's up with that Nvidia Nemotron?
And we need a model with very good reasoning; the current ones are very censored. Command R+ is very good and uncensored, but its reasoning is not as good.
Nous Hermes chat has Hermes 3 70B; I think it has good reasoning and is uncensored.
We only have the Hermes 3 8B here.
Please add the Hermes 3 70B model instead.

@Allheaven99 it's about license

@DarkCesare ik right, Nemotron is very good but glitches all the time.

Never tried Nous Hermes. Llama 3.3 is very uncensored without a jailbreak or forced instructions,

but it cannot remember anything beyond 15 prompts (I'd say, at maximum).

@croque16 when Nemotron first came out it was good and all working. Now it's hardly usable.
Llama 3.3 isn't uncensored at all. Jailbreak doesn't work either, not to a usable extent.
They need to fix these soon, but they're not responding about this at all. And even retry doesn't work, except on Command R+.

This comment has been hidden

@DarkCesare Llama 3.3 did not stop me when I tested it...

I think the old 3.1 does that also.

But Nemotron does

CommandR+ is definitely uncensored

Qwen, nope, all of them will stop me

Mistral - not sure

Bring back 405B, or introduce 3.2 90B.

Or just bring three-digit-parameter models that ain't Llama.

@croque16 no man, look up how to test censorship and apply it to Meta Llama, or step up the level. Besides, you mentioned the memory problem.

And Meta 405B was not worth it. We couldn't even use it due to the heavy load it was under.
We need a model with very good reasoning that is also uncensored. Please add Nous Hermes 3 70B instead of the small 8B.

@DarkCesare
image_2024-12-13_050231333.png

bro💀

@DarkCesare how do I fix the memory problem?

HuggingChat devs, I just wanna say I have tested the Nous Hermes 8B and it is way too small as a model.

Please replace it with the 70B version

I assume it won't fry the GPUs and RAM that run it...

@DarkCesare nous hermes 8B is not uncensored...

@croque16 at least Hermes 8B doesn't outright stop, but it ignores the instructions too, so it can be called censored, sadly. I can't see which model is in the picture, but it's one of the model bugs.
There shouldn't be a fix for the Llama 3.3 memory problem unless the context length is increased (the model page says 128k, which should be more than sufficient).
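One way to sanity-check the "forgets after ~20 prompts" reports against that advertised window is to count the tokens yourself. A minimal sketch with the `transformers` tokenizer (assuming you have access to the gated repo; the conversation string is made up), bearing in mind that the hosted deployment may also cap context below the model's maximum:

```python
from transformers import AutoTokenizer

# Count how many tokens a conversation actually uses; 25 prompts of ordinary
# length should land far below a 128k-token context window.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")
conversation = "\n".join(
    f"Prompt {i}: a typical multi-sentence chat message goes here." for i in range(25)
)
print(len(tokenizer(conversation)["input_ids"]))
```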

Guys, I'm very sorry for intruding, but I wrote one comment here a while ago and now can't figure out how to turn off the notifications for this thread. This has been going on for more than a month and I've finally grown tired of it.

Does anyone know how to do that?

Huggingchat Devs, ffs, please check the GPU running nemotron llama model. Glitches have become more often now. Or it may be my system prompt who knows

But pls check 🤗

@chameleon-lizard I think it's on by default for every discussion or post you engage with. You can disable post & reply notifications in your account settings, but not for specific discussions.
Or the best solution would be to mute the specific thread in Gmail.

FFS nemotron glitched again

Guess I'll have to get used to it

😔

I've had enough of this
@DarkCesare

Screenshot_2024-12-13-20-06-48-880_com.android.chrome-edit.jpg

@croque16 looks beautiful though! XD Jokes aside, I don't really use HC much as of late due to these bugs! I only use Command R+. I'm currently using Hermes chat from NousResearch. I really want to use Hugging Chat, but the devs aren't responding to the model issues, only to random error issues. But I'm still grateful for their work!

@DarkCesare ai error turns into art 🤣

The more I see really obscure errors like these, the more I'm glad that I'm able to run LLMs locally (quantized and <25B; above that it's too slow for my liking).

It would be great if any model could be used this way, especially the new models.

It would save the pain of downloading and possibly converting them.

@DarkCesare
image.png
seizure

image.png
@DarkCesare

ah here we go again

Can you explain to me what this means? I am 17 years old and I want to use Llama 3.3 in HuggingChat, but I found this in the LLAMA 3.3 COMMUNITY LICENSE AGREEMENT:
"“Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf."
So will I violate their terms if I use it, or not?
Any help??

@youssef38 just use it normally, I think. Why worry?
@croque16 yeah man, we get that it's buggy, but we shouldn't spam too frequently. Let's wait for an official response for now. We've been waiting a long time actually! 😔

@youssef38 just use it normally, I think. Why worry?

I only want to ensure that I won't violate the Llama terms, and I didn't understand the license agreement,
so I want to know whether this violates their terms or not.


If I were you, I wouldn't worry about whether it violates their terms. I would instead care about whether what you do is, in your opinion, bad or not. This doesn't only apply to this case; it applies to everything else in life too.

Just ignore the terms in this case.

@youssef38 just use it normally, I think. Why worry?

I only want to ensure that I won't violate the Llama terms, and I didn't understand the license agreement,
so I want to know whether this violates their terms or not.

If I were you, I wouldn't worry about whether it violates their terms. I would instead care about whether what you do is, in your opinion, bad or not. This doesn't only apply to this case; it applies to everything else in life too.

Just ignore the terms in this case.

You're wrong.
One must comply with the terms.

We are not in a society where everything you do is being monitored. Nobody cares.

@DarkCesare I think I am going to write a letter straight to the higher-ups at Hugging Face and tell them to fix this.

I think if someone is smart enough to find Hugging Chat and sign up, then they are ready to handle Llama's responses. I believe Meta writes this into the terms of use to protect themselves, not because it's some deal-breaker whether you are 17 or 18.

We are not in a society where everything you do is being monitored. Nobody cares.

No, one must comply with the terms.
It's their work.

I AM SO SICK AND TIRED OF GLITCHES

I think if someone is smart enough to find Hugging Chat and sign up, then they are ready to handle Llama's responses. I believe Meta writes this into the terms of use to protect themselves, not because it's some deal-breaker whether you are 17 or 18.

@youssef38

@DarkCesare

Hi again

GOOD NEWS!!!!

THE GLITCHES HAVE STOPPED!!!!

Edit: @croque16 its behavior isn't fixed yet; retry doesn't work. Absolutely the same as before.
IMG_20241217_161705.jpg
(Tropic, Topic, no difference in behavior.) I'll keep using Hermes chat for now.

IMG_20241217_170637.jpg

@nsarrazin @victor can you guys please check the issue? It's been a while; any response would be appreciated! Thank you! Also, what about full access to chat history?

Hugging Chat org

Hi! We're going to try some changes to the config of the Nemotron model, it will be down for a bit as we roll out the changes.

Let us know if it performs better afterwards!

Thank you very much! I hope it goes back to how good it initially was!

Hugging Chat org

Should be rolled out now!

Hi! We're going to try some changes to the config of the Nemotron model, it will be down for a bit as we roll out the changes.

Let us know if it performs better afterwards!

It would be good to ask one of the organizers about the terms of use of the Llama model:
is there any terms violation for Meta if I use it and I am 17 years old?

I think if someone is smart enough to find Hugging Chat and sign up, then they are ready to handle Llama's responses. I believe Meta writes this into the terms of use to protect themselves, not because it's some deal-breaker whether you are 17 or 18.

@youssef38

No, one must comply with the terms.

I think if someone is smart enough to find Hugging Chat and sign up, then they are ready to handle Llama's responses. I believe Meta writes this into the terms of use to protect themselves, not because it's some deal-breaker whether you are 17 or 18.

@youssef38

No, one must comply with the terms.

Bro...

@nsarrazin wh-why? and retry still doesn't work.
image.png

image.png

@DarkCesare of all the glitches

IMG_20241217_203346.jpg

Hi community,
whenever I use Qwen2.5-Coder-32B-Instruct in HuggingChat and it shares a code snippet with me, the snippets seem to be rendered with HTML entities and then interpreted as such. This leaves all code snippets with "&gt;" instead of the greater-than sign ">". I didn't see this happening when using Llama 3.3, though.
Any idea on how to solve it?

Thanks a lot for any help!

Cheers.

image.png
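Until that's fixed server-side, one workaround is to unescape the entities yourself before reusing a snippet. A minimal sketch in Python (the snippet string is made up):

```python
import html

# A code snippet as pasted from the chat, with HTML-escaped operators.
snippet = "if (a &gt; b &amp;&amp; b &lt; c) { swap(a, b); }"

# html.unescape turns &gt;, &lt;, and &amp; back into >, <, and &.
print(html.unescape(snippet))  # if (a > b && b < c) { swap(a, b); }
```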

Hello! I have been using the CohereForAI/c4ai-command-r-plus-08-2024 model for a long time. In my prompts I have the rules for a role-playing game, and the model usually recognized them fine. They are written in the usual format:
Rule 1 - description.
Rule 2 - description. And so on.
About a week ago I entered the chat and found that the model refused to understand my prompts.
Instead of a normal answer, it answers something like:

```json
[ { "tool_name": "directly-answer", "parameters": {} } ]
```

I tried to ask the model why this was happening, to which the chat replied that it did not understand the prompts and was trying to accept them in JSON format. Unfortunately, I don't know how to write in this format. What should I do? And why is this happening?
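You don't need to learn to write JSON; that block looks like the model's internal tool-calling format leaking out as plain text rather than a problem with your rules. For anyone scripting against the model, a hedged sketch of a client-side guard (the function is mine, purely illustrative) could flag the leak so you can retry:

```python
import json

# Illustrative guard: detect a leaked "directly-answer" tool call
# (field names as in the JSON above) instead of showing raw JSON.
def looks_like_leaked_tool_call(reply: str) -> bool:
    try:
        calls = json.loads(reply)
    except json.JSONDecodeError:
        return False
    return (isinstance(calls, list) and bool(calls)
            and isinstance(calls[0], dict)
            and calls[0].get("tool_name") == "directly-answer")

print(looks_like_leaked_tool_call('[{"tool_name": "directly-answer", "parameters": {}}]'))  # True
print(looks_like_leaked_tool_call("Here is a normal answer."))  # False
```

In the web UI, retrying the generation or starting a fresh conversation is probably the only lever available.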

Honestly I'm surprised we don't have Mistral Large on here. It's available on Le Chat but the UI over there is meh imo, plus no top k or top p settings to play with.

🤔👍

@DarkCesare glitches have been less frequent but they still haunt me

DEVS FIX EM AGAIN IF YOU HAVE DONE IT ALREADY

It's pretty apparent at this point that this kind of model behavior is intentional for some reason, as "fixing" it meant this 👇 instead of how good it was initially.
Screenshot_20241221_194538.jpg
The list is quite long, as the UI was unresponsive in my browser due to the stop-generation bug.
And why does retry generate variations of the bug, but doesn't work before that?

At longer contexts, the website hangs up on my device. :(

At longer contexts, the website hangs up on my device. :(

Disable streaming tokens in the settings. This will give you the whole answer to your prompt all at once, in comparatively less time. At least that helped for me; I had a similar problem.

The nvidia/Llama-3.1-Nemotron-70B-Instruct-HF model has started giving me weird things, like blue boxes, and that's all it will do, even though it was working fine just seconds before.

For example, I asked "Can you please give me a more in-depth analysis of my text, including thoughts on what works well, suggestions for additions, removals, or changes, and some questions to consider for further development" and it gave me this:

do { */ ���� ); ); ); ); ��� }; }, }); ); | radio | */ ); � $. }); }; "); }, :'); :'); � ); …

(It continued like that for two full paragraphs: stray brackets, replacement characters, and random words, with nothing readable in it.)

At longer contexts, the website hangs up on my device. :(

Disable streaming tokens in the settings. This will give you the whole answer to your prompt all at once, in comparatively less time. At least that helped for me; I had a similar problem.

@MadderHatterMax well, this workaround will take a long time; I usually stop midway when its output is not good.

My mother language isn't English and I didn't understand this, so I need help.
In a short answer only: according to this license, is there any violation if I enter on behalf of my parents, as I am 17? LLAMA 3.3 COMMUNITY LICENSE AGREEMENT:
"“Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf."

At longer contexts, the website hangs up on my device. :(

Disable streaming tokens in the settings. This will give you the whole answer to your prompt all at once, in comparatively less time. At least that helped for me; I had a similar problem.

@MadderHatterMax well, this workaround will take a long time; I usually stop midway when its output is not good.

Then use an AI like CohereForAI or Mistral, which gives you the output in a short time.

My mother language isn't English and I didn't understand this, so I need help.
In a short answer only: according to this license, is there any violation if I enter on behalf of my parents, as I am 17? LLAMA 3.3 COMMUNITY LICENSE AGREEMENT:
"“Licensee” or “you” means you, or your employer or any other person or entity (if you are entering into this Agreement on such person or entity’s behalf), of the age required under applicable laws, rules or regulations to provide legal consent and that has legal authority to bind your employer or such other person or entity if you are entering in this Agreement on their behalf."

??

Yes youssef38, you cannot use Llama. If you do before your 18th birthday, you will be permanently banned from Hugging Chat and from Facebook (since it's Meta's model). Furthermore, the police and the local sheriff will come looking for you to punish this severe crime.

If you use another model (one without an 18+ clause in its terms of use), it's all good. Forget about Llama for now. Believe me.
