error when loading this model

#1 by InspiredBubbles - opened

I'm getting the following error when trying to load this model in LM Studio. In Kobold it simply crashes when loading the model.

{
  "title": "Failed to load model",
  "cause": "llama.cpp error: 'done_getting_tensors: wrong number of tensors; expected 292, got 291'",
  "errorData": {
    "n_ctx": 8192,
    "n_batch": 512,
    "n_gpu_layers": 33
  },
  "data": {
    "memory": {
      "ram_capacity": "63.68 GB",
      "ram_unused": "41.33 GB"
    },
    "gpu": {
      "gpu_names": [
        "NVIDIA GeForce RTX 3090"
      ],
      "vram_recommended_capacity": "24.00 GB",
      "vram_unused": "22.76 GB"
    },
    "os": {
      "platform": "win32",
      "version": "10.0.19045"
    },
    "app": {
      "version": "0.2.28",
      "downloadsDir": "F:\\Vortex_mods_extra\\Herika\\LLMs"
    },
    "model": {}
  }
}

oobabooga also gave an error for the Q4_K_M model. :/

Owner • edited Jul 25

Llama.cpp hasn't merged the fix into the main branch yet, so you need to wait for:

  • Llama.cpp to merge the fix into main
  • then for Koboldcpp, Oobabooga, or LM Studio to update the version of llama.cpp they use to load GGUF files.

If you want to run them right now, you can use this PR: https://github.com/ggerganov/llama.cpp/pull/8676 and apply it to llama.cpp yourself.
The GGUF was made to be usable once the fix is applied!
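
For reference, pulling that PR into a local llama.cpp checkout looks roughly like this (a minimal sketch assuming git and the standard CMake build from the llama.cpp README; the local branch name pr-8676 is arbitrary):

  # clone llama.cpp and fetch the PR branch via GitHub's pull refs
  git clone https://github.com/ggerganov/llama.cpp.git
  cd llama.cpp
  git fetch origin pull/8676/head:pr-8676
  git checkout pr-8676

  # rebuild with CMake
  cmake -B build
  cmake --build build --config Release

After that, loading the GGUF with the freshly built binaries should pick up the tensor-count fix.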

