Thanks and questions

by Kukedlc - opened

Hello @bartowski, first of all I must thank you for being so fast and effective at uploading models in GGUF. I mentioned before that my company runs Gemma 9b in production, specifically your quantized version. Thank you! Your work saves me time. The reason for this post is that I had problems with this model: it generates incoherent text. I tried the 8-bit version and the 4-bit K_M version, and both give me an error. My modelfile is below; I thought it could be something with the chat template.

FROM /home/eugenio/Descargas/LongWriter-llama3.1-8b-Q4_K_M.gguf

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""

PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER stop "<|reserved_special_token"

I am using the latest version of Ubuntu and the latest version of Ollama.

Well, progress: I realized the model actually uses the Mistral [INST] template. Now it generates coherent text, but it still doesn't produce the long-awaited 1000-line story. Did something similar happen to you? I'm using the official Mistral chat template (taken from the Ollama modelfile).
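
For anyone else hitting this, here's a minimal sketch of what an [INST]-style modelfile can look like (a simplified single-turn version, not the full official Mistral template, which appears further down in this thread):

```
FROM /home/eugenio/Descargas/LongWriter-llama3.1-8b-Q4_K_M.gguf

# Simplified Mistral-style prompt: the system text is folded into the [INST] block
TEMPLATE """[INST] {{ if .System }}{{ .System }}

{{ end }}{{ .Prompt }} [/INST] {{ .Response }}"""

PARAMETER stop "[INST]"
PARAMETER stop "[/INST]"
```

Folding the system prompt into the first [INST] block is how Mistral-style templates usually handle system messages, since there is no dedicated system token.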

Well, I'll finish my monologue here; maybe it will help someone. The models work well. You need to update Ollama, otherwise it throws an error when running the model (though not when loading it). Second, I couldn't get the 4-bit version to generate long text, and when it did, the output would cut off halfway. The chat template I used is Mistral's (taken from the Ollama modelfile). The 8-bit version is a spectacle: I created a fairly acceptable alien story in Spanish, 8 pages in Word (ready to send to a seedy publisher). Thank you very much once again for your contributions.

Results: [three screenshots of the generated output attached]

Ah yes, getting the template right is definitely important, especially since this model uses a bit of a weird one with the <<SYS>> tags.

Glad you figured it out and got it working nicely :D

I'd be curious: did you try Q4_K_L? I wonder what part of the Q8 made it so much better; Q4_K_L keeps the embedding and output weights at Q8_0, so if that alone recovers some of the improvement, that would be interesting and useful information!

I tried Q4_K_M, changed the temperature, tried different chat templates and num_ctx values, and nothing. The 8-bit works well from the start. The 4-bit generated short text for me, and when I demanded long outputs, the inference cut off halfway.

Hey Kukedlc,
I built the Ollama model for LongWriter-llama3.1-8b-Q8_0-GGUF with the modelfile below, and it doesn't output answers when run in Ollama. Please help me fix this or point out what I am doing wrong. I have very little coding knowledge, and I will mostly be using these models for academic work. Thanks in advance!

Modelfile contents:

FROM C:/Users/Prajw/.ollama/models/modelsfiles/longwriter/longwriter-llama3.1-8b-q8_0.gguf
PARAMETER temperature 0.5
PARAMETER top_p 0.8
PARAMETER top_k 50
PARAMETER num_ctx 32768
PARAMETER repeat_penalty 1.15
PARAMETER stop USER:
PARAMETER stop ASSISTANT:

TEMPLATE """{{ if .System }}
<<SYS>>
{{ .System }}
<</SYS>>
{{ end }}

[INST]{{ index .Messages 0 "Content" }}[/INST]{{ index .Messages 0 "Response" }}
{{ range $i, $msg := .Messages }}
{{- if ne $i 0 -}}
[INST]{{ $msg.Content }}[/INST]{{ $msg.Response }}
{{- end -}}
{{ end }}"""

FROM C:/Users/Prajw/.ollama/models/modelsfiles/longwriter/longwriter-llama3.1-8b-q8_0.gguf

PARAMETER temperature 0.5
PARAMETER top_p 0.8
PARAMETER top_k 50
PARAMETER num_ctx 32768
PARAMETER repeat_penalty 1.15
PARAMETER stop [INST]
PARAMETER stop [/INST]

TEMPLATE """ {{- if .Messages }}
{{- range $index, $_ := .Messages }}
{{- if eq .Role "user" }}
{{- if and (eq (len (slice $.Messages $index)) 1) $.Tools }}[AVAILABLE_TOOLS] {{ $.Tools }}[/AVAILABLE_TOOLS]
{{- end }}[INST] {{ if and $.System (eq (len (slice $.Messages $index)) 1) }}{{ $.System }}

{{ end }}{{ .Content }}[/INST]
{{- else if eq .Role "assistant" }}
{{- if .Content }} {{ .Content }}
{{- else if .ToolCalls }}[TOOL_CALLS] [
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{- end }}]
{{- end }}
{{- else if eq .Role "tool" }}[TOOL_RESULTS] {"content": {{ .Content }}} [/TOOL_RESULTS]
{{- end }}
{{- end }}
{{- else }}[INST] {{ if .System }}{{ .System }}

{{ end }}{{ .Prompt }}[/INST]
{{- end }} {{ .Response }}
{{- if .Response }}
{{- end }}"""

If it doesn't work for you, check it over and I'll take a look at the modelfile; that one should be fine. @deadice97
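
In case it helps: after editing the modelfile, rebuild with `ollama create longwriter -f Modelfile` and test with `ollama run longwriter` (replace `longwriter` with whatever name you like).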

Many thanks @Kukedlc, this modelfile works perfectly fine!
