Problem using the Mistral AI model API from Hugging Face
Hi, I am using the Mistral-7B-Instruct-v0.1 model through Hugging Face's API for question answering over a PDF. It works, but the response is short: it gets cut off halfway through, after one sentence. Please help.
I have the exact same problem: I can only get around 20-30 tokens in the response. I wonder if it's an internal limitation of the model.
Please let me know if you find a solution.
Can you attach an image of the code you are using for generation? It's working fine for me.
Can you also try with model.generate(**inputs, max_new_tokens=350)? The default is 20, which might explain what's happening.
You can make a POST request with your inputs (and your query) and add a "parameters" field. Increase the max_new_tokens value to get more text:
```
{
  "inputs": inputs,
  "parameters": {
    "max_new_tokens": 100,
    "temperature": 0.5,
    "top_k": 40,
    "top_p": 0.95,
    "repetition_penalty": 1.1
  }
}
```
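For reference, here is a minimal sketch of that POST request in Python using the `requests` library. The endpoint URL follows the usual Inference API pattern for this model, and the bearer token is a placeholder you'd replace with your own; treat the generation parameters as starting points, not definitive values.

```python
import requests

# Standard Inference API endpoint pattern for this model (assumption: the
# hosted inference endpoint is what you are calling).
API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1"
HEADERS = {"Authorization": "Bearer hf_..."}  # placeholder: use your own token


def build_payload(prompt: str, max_new_tokens: int = 350) -> dict:
    """Build the request body, raising max_new_tokens above the default of 20."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": 0.5,
            "top_k": 40,
            "top_p": 0.95,
            "repetition_penalty": 1.1,
        },
    }


def query(prompt: str) -> str:
    """Send the prompt to the Inference API and return the generated text."""
    response = requests.post(API_URL, headers=HEADERS, json=build_payload(prompt))
    response.raise_for_status()
    return response.json()[0]["generated_text"]
```

With a larger `max_new_tokens`, the answer should no longer stop after one sentence; responses may still end early if the model emits an end-of-sequence token on its own.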