Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'
Wanted to call attention to this error, which now appears when running dolphin-mixtral:
Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.ffn_down_exps.weight'
According to this thread: https://github.com/ollama/ollama/issues/8147, the error stems from a recent change in how Ollama handles MoE models. Flagging it here in case the GGUF version also needs to be reviewed and updated once the "main" version gets its fix.
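The missing tensor name suggests a mismatch between how the quant stores MoE expert weights and the layout the loader now expects: older Mixtral GGUFs carry one tensor per expert, while newer conversions stack the experts into a single `*_exps` tensor. The sketch below illustrates that distinction on plain tensor-name lists; the helper `moe_layout` and the sample name lists are illustrative, not part of Ollama or llama.cpp (for a real file, llama.cpp's `gguf` Python package can list tensor names via `GGUFReader`).

```python
# Sketch: classify a GGUF's MoE FFN layout from its tensor names.
# "merged" = new stacked-experts layout, "per-expert" = old layout.
# The helper and sample lists are hypothetical, for illustration only.

def moe_layout(tensor_names):
    """Return 'merged', 'per-expert', or 'unknown' for layer 0's FFN down tensors."""
    if "blk.0.ffn_down_exps.weight" in tensor_names:
        return "merged"
    if any(name.startswith("blk.0.ffn_down.") for name in tensor_names):
        return "per-expert"
    return "unknown"

# An old Mixtral-8x7B quant lists one ffn_down tensor per expert:
old_quant = [f"blk.0.ffn_down.{i}.weight" for i in range(8)]
# A re-converted quant carries a single stacked tensor instead:
new_quant = ["blk.0.ffn_down_exps.weight"]

print(moe_layout(old_quant))  # per-expert
print(moe_layout(new_quant))  # merged
```

A loader that only looks for the merged name will report exactly this kind of "missing tensor" error on an old quant, which is why re-downloading a freshly converted GGUF fixes it.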
Someone over at llama.cpp thought it would be a great idea to break all the Mixtral-8x7B quants: https://github.com/ggerganov/llama.cpp/issues/10244
Here is a version that currently (December 2024) works: https://huggingface.co./mradermacher/dolphin-2.7-mixtral-8x7b-GGUF