Has a massive repetition problem
Even in the demo included on Hugging Face, it has a massive repetition problem. If it's left to write out a document or anything of any complexity, it usually tails off into repetition, even with a higher repetition penalty.
"My name is Teven and I am a 17 year old student from the Netherlands. I am currently in my last year of high school and I am planning to study computer science at the university. I have been interested in computers and technology for as long as I can remember. I have always been fascinated by the way computers work and how they can be used to solve problems.
I have been programming since I was 12 years old and I have always enjoyed it. I have used a variety of programming languages, including Python, Java, and C++. I have also used a variety of software development tools, including Eclipse, Visual Studio, and NetBeans.
I have always been interested in learning new things and I am always looking for new challenges. I am always looking for ways to improve my skills and I am always looking for new ways to use my skills. I am always looking for new ways to use my skills to help others.
I am always looking for new ways to use my skills to help others. I am always looking for new ways to use my skills to"
I noticed this too. Increasing the temperature helps.
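For anyone wondering what those two knobs actually do to the next-token scores, here is a minimal sketch in plain Python. The logits and the 4-token vocabulary are made up for illustration, and the penalty rule shown (divide positive scores, multiply negative ones) is the commonly used CTRL-style rule, which I am assuming matches what your sampling code applies.

```python
import math

def apply_repetition_penalty(logits, generated_ids, penalty):
    """Lower the score of every token that has already been generated."""
    out = list(logits)
    for tok in set(generated_ids):
        # Positive scores shrink; negative scores become more negative.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out

def softmax_with_temperature(logits, temperature):
    """Turn scores into probabilities; temperature > 1 flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0, -1.0]  # made-up scores for a 4-token vocabulary
generated = [0, 0, 0]           # token 0 has already been emitted three times

base = softmax_with_temperature(logits, 1.0)
penalized = softmax_with_temperature(
    apply_repetition_penalty(logits, generated, 1.3), 1.0
)
hot = softmax_with_temperature(logits, 1.5)

print(f"token 0 probability: base={base[0]:.3f} "
      f"penalized={penalized[0]:.3f} temperature 1.5={hot[0]:.3f}")
```

Both settings only make the repeated token less likely; neither makes the model emit an end-of-sequence token, which may be why they reduce the loop without always stopping it.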
Does it keep on generating text until it reaches the max_generated_tokens every time?
I have a problem where generation keeps going until it reaches max_generated_tokens on a Tesla V100S GPU, but it works fine on a Colab A100. Any idea what can be done about this?
You are running the exact same script, environment, and component versions, and you are getting different results on different GPUs?
OK, I double-checked this. The problem occurs with my QLoRA fine-tuned model. The base model works fine on both GPUs.
Yeah, I am also running into the same issue even after increasing the temperature and penalty. Is this the same problem with Mistral-7b-v0.2 or with the Mistral-7b-instruct model?
Hi, remember to set PAD token != EOS during finetuning.
Hi, could you explain more? Thanks!
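During fine-tuning, the model can 'forget' about the EOS token when we also use it as the pad token. This is because of masking: padding positions are excluded from the loss, so the model is never trained to predict them. If EOS shares the pad token's id, the EOS positions get masked as well, and the model never learns when to stop.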
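To make the masking effect concrete, here is a small sketch in plain Python. It assumes the labels are built by comparing each token against the pad id, which is how some data collators do it; the token ids are hypothetical and chosen only for this illustration.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default

def build_labels(input_ids, pad_token_id):
    """Copy input_ids to labels, masking every position that holds the pad id."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]

# Hypothetical ids, chosen only for this illustration.
UNK_ID, BOS_ID, EOS_ID = 0, 1, 2

def make_sequence(pad_id):
    """Left-padded training sequence: pad, pad, <s>, two text tokens, </s>."""
    return [pad_id, pad_id, BOS_ID, 733, 16289, EOS_ID]

labels_pad_is_eos = build_labels(make_sequence(EOS_ID), pad_token_id=EOS_ID)
labels_pad_is_unk = build_labels(make_sequence(UNK_ID), pad_token_id=UNK_ID)

print(labels_pad_is_eos)  # [-100, -100, 1, 733, 16289, -100]
print(labels_pad_is_unk)  # [-100, -100, 1, 733, 16289, 2]
```

In the first case the final EOS label is masked along with the padding, so no training example ever rewards the model for ending a sequence. In the second case the EOS label survives.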
I don't know what you use for fine-tuning, but this setting helped me with repetition:
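from transformers import AutoTokenizer

# base_model_id and model are defined elsewhere in the fine-tuning script
tokenizer = AutoTokenizer.from_pretrained(
    base_model_id,
    model_max_length=4096,
    padding_side="left",
    add_eos_token=True,  # add eos at the end of text
)
# tokenizer.pad_token = tokenizer.eos_token  # don't use this
tokenizer.pad_token = tokenizer.unk_token  # use this
# keep the model config in sync with the tokenizer's pad token
model.config.pad_token_id = tokenizer.pad_token_id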
My dummy fine-tuning sample:
https://github.com/atadria/llm_calculator/blob/main/mistral_finetune.ipynb
What value did you set for pad_token_id? Could you explain more about what pad_token_id does?
I also read this article: https://huggingface.co./NousResearch/Yarn-Mistral-7b-128k/discussions/3, but I am confused about what values to set for eos_token_id and pad_token_id. Has anyone played around with these two parameters and got the issue resolved? Please let me know.
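For reference, a toy sketch in plain Python of the different roles the two ids play, as I understand them: pad_token_id only fills shorter sequences in a batch (and is hidden by the attention mask), while eos_token_id is what stops generation. The ids, the `left_pad` and `generate` helpers, and the stand-in "models" are all hypothetical and not taken from any library.

```python
def left_pad(batch, pad_token_id):
    """Pad every sequence on the left to the longest length and build attention masks."""
    width = max(len(seq) for seq in batch)
    input_ids, attention_mask = [], []
    for seq in batch:
        n_pad = width - len(seq)
        input_ids.append([pad_token_id] * n_pad + list(seq))
        attention_mask.append([0] * n_pad + [1] * len(seq))  # 0 = ignore padding
    return input_ids, attention_mask

def generate(next_token_fn, prompt, eos_token_id, max_new_tokens):
    """Toy greedy loop: stop at eos, otherwise run until max_new_tokens."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token_fn(out)
        out.append(tok)
        if tok == eos_token_id:
            break
    return out

PAD_ID, EOS_ID = 0, 2  # hypothetical ids

ids, mask = left_pad([[5, 6, 7], [8]], PAD_ID)
print(ids)   # [[5, 6, 7], [0, 0, 8]]
print(mask)  # [[1, 1, 1], [0, 0, 1]]

# Stand-in "model" that emits eos once the sequence holds 5 tokens.
stops = generate(lambda seq: EOS_ID if len(seq) >= 5 else 9, [5, 6, 7], EOS_ID, 20)
# Stand-in "model" that never emits eos: generation runs to the token limit.
runs_on = generate(lambda seq: 9, [5, 6, 7], EOS_ID, 20)

print(stops)         # [5, 6, 7, 9, 9, 2]
print(len(runs_on))  # 23
```

The second case is what a fine-tuned model that has stopped predicting EOS looks like from the outside: every generation runs to the maximum length.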
Hi @Rmote6603, did you manage to solve this issue? I'm facing the same problem myself. Thanks!