Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'

#4
by fearofthedank - opened

Wanted to call attention to this error, which now appears when running dolphin-mixtral:

Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'

According to this thread: https://github.com/ollama/ollama/issues/8147, this is related to a recent change in how Ollama handles MoE models. Flagging it here in case the GGUF version also needs to be reviewed and updated once the "main" version gets its fix.
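For anyone trying to tell whether a given GGUF file uses the old per-expert MoE tensor layout or the newer merged one, here is a rough sketch. The helper name `moe_layout` is made up for illustration; the tensor naming patterns (`blk.N.ffn_down.E.weight` per expert vs. a single stacked `blk.N.ffn_down_exps.weight`) are what the error message above is complaining about. A loader that expects the merged names will fail with "missing tensor 'blk.0.ffn_down_exps.weight'" on an older file.

```python
# Hypothetical helper (not part of ollama or llama.cpp): classify a GGUF's
# MoE tensor layout from its tensor names.
def moe_layout(tensor_names):
    """Return 'merged', 'per-expert', or 'unknown' for a list of tensor names."""
    if any(n.endswith("ffn_down_exps.weight") for n in tensor_names):
        return "merged"      # new layout: all experts stacked in one 3D tensor
    if any(".ffn_down." in n and n.endswith(".weight") for n in tensor_names):
        return "per-expert"  # old layout: one ffn_down tensor per expert
    return "unknown"
```

To get the actual tensor names out of a file, something like the `gguf` Python package's `GGUFReader` should work (assumption; check its docs): `moe_layout(t.name for t in GGUFReader("model.gguf").tensors)`.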

Someone over at llama.cpp thought it would be a great idea to break all the Mixtral-8x7B quants: https://github.com/ggerganov/llama.cpp/issues/10244

Here is a version that currently (as of December 2024) works: https://huggingface.co./mradermacher/dolphin-2.7-mixtral-8x7b-GGUF
