llama.cpp-b2234 generates poor output

#1
by wyklq - opened

It looks like the GGUF file is not compatible with the llama.cpp b2234 release.
I tried "gemma-7b-it-Q4_K_M.gguf" with the prompt "write a python program to calculate pi with monte carlo method".
Its output is worse than that of "gemma-2b-it-Q4_K_M.gguf" from another repository.
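For reference, here is a minimal sketch of one way to reproduce such a test. It assumes the llama-cpp-python bindings and a local copy of the GGUF file; the original test may well have used the llama.cpp CLI directly, and the path is hypothetical:

```python
# Reproduction sketch using llama-cpp-python (an assumption; the model
# path is hypothetical and should point at the downloaded GGUF file).
from llama_cpp import Llama

llm = Llama(model_path="./gemma-7b-it-Q4_K_M.gguf")

# Gemma instruction-tuned models expect the turn-based chat template.
prompt = (
    "<start_of_turn>user\n"
    "write a python program to calculate pi with monte carlo method"
    "<end_of_turn>\n"
    "<start_of_turn>model\n"
)

out = llm(prompt, max_tokens=512)
print(out["choices"][0]["text"])
```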

Second State org

@wyklq The GGUF models were generated with b2230 and also tested against b2230. Some changes were introduced into llama.cpp after b2230, so we are not sure whether the models are compatible with b2234. In any case, we'll track the changes in llama.cpp and update the models in the near future.
In addition, in my personal experience, 2b-it-Q8_0 performs better. You can try it.

OK, it turns out to be an issue with the original model.
I found this discussion: https://huggingface.co./google/gemma-7b-it/discussions/38
The workaround described there works, i.e. setting "Presence penalty" to 1.
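As a sketch, the workaround applied through the llama-cpp-python call from the reproduction snippet above (the parameter there is named presence_penalty; llama.cpp's main example exposes a similar --presence-penalty option):

```python
# Same call as in the reproduction sketch, with the presence penalty
# raised to 1.0 per the linked workaround (default is 0.0, disabled).
out = llm(prompt, max_tokens=512, presence_penalty=1.0)
print(out["choices"][0]["text"])
```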

wyklq changed discussion status to closed
