imatrix

#2 by Bakanayatsu - opened

Hello, I did imatrix quants for this model! I like this one a lot.

https://huggingface.co./Bakanayatsu/Fimbulvetr-Kuro-Lotus-10.7B-GGUF-imatrix
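
For anyone curious how these are made: imatrix quants come out of llama.cpp's command-line tools, where you first compute an importance matrix from some calibration text and then feed it to the quantizer so the more important weights keep more precision. Below is a minimal sketch driving those tools from Python; the binary names, flags, and file names are assumptions based on llama.cpp around early 2024, not the exact commands used for this repo.

```python
# Minimal sketch of a typical llama.cpp imatrix quantization workflow.
# Binary names, flags, and file names are assumptions (llama.cpp, early 2024).
import subprocess

FP16_GGUF = "Fimbulvetr-Kuro-Lotus-10.7B-f16.gguf"  # full-precision GGUF export (assumed name)
CALIB_TEXT = "calibration.txt"                       # any representative text corpus
IMATRIX_OUT = "imatrix.dat"

# 1) Run the calibration text through the model to collect the importance matrix.
subprocess.run(
    ["./imatrix", "-m", FP16_GGUF, "-f", CALIB_TEXT, "-o", IMATRIX_OUT],
    check=True,
)

# 2) Quantize, letting the importance matrix decide which weights keep more precision.
subprocess.run(
    ["./quantize", "--imatrix", IMATRIX_OUT,
     FP16_GGUF, "Fimbulvetr-Kuro-Lotus-10.7B-IQ4_XS.gguf", "IQ4_XS"],
    check=True,
)
```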

Owner

Thank you! I'll add it to the top of the card :3

saishf pinned discussion

I have tried the imatrix versions from Bakanayatsu, but they seem to be corrupted: when I run them in LM Studio, the program crashes. On the other hand, this model is excellent. In fact, it is the best in its category of all the ones I have tried. A marvel.

Owner

> I have tried the imatrix versions from Bakanayatsu, but they seem to be corrupted: when I run them in LM Studio, the program crashes. On the other hand, this model is excellent. In fact, it is the best in its category of all the ones I have tried. A marvel.

Are you trying the IQ XS variants?
The IQ XS quants are pretty new and may not be supported by LM Studio yet.
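
If you want to rule out a corrupted download, a quick sanity check is to load the file with an up-to-date llama.cpp build, e.g. via llama-cpp-python. A minimal sketch, assuming the IQ4_XS filename (adjust to whichever file you downloaded):

```python
# Sanity-check an IQ-XS quant with a current llama.cpp build (via llama-cpp-python).
# The file name is an assumption; point it at whichever quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="Fimbulvetr-Kuro-Lotus-10.7B-IQ4_XS.gguf",
    n_ctx=4096,
    n_gpu_layers=-1,  # offload all layers if VRAM allows; use 0 for CPU-only
)

out = llm("### Instruction:\nSay hello.\n### Response:\n", max_tokens=32)
print(out["choices"][0]["text"])
```

If this loads and generates fine but LM Studio still crashes, the problem is most likely the app's bundled llama.cpp being too old for the IQ formats rather than the files themselves.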

Owner

[Screenshot: XS version running in koboldcpp]

Can I run this model with TensorRT-LLM?

Owner

I don't believe GGUF is supported by TensorRT-LLM, but I'm not completely sure.

FP16
FP8
INT8 & INT4 Weight-Only
SmoothQuant
Groupwise quantization (AWQ/GPTQ)
FP8 KV CACHE
INT8 KV CACHE (+ AWQ/per-channel weight-only)
Tensor Parallel
STRONGLY TYPED

This is all that's listed on TensorRT-LLM's GitHub page under the support matrix for LLaMA; GGUF isn't among them.
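
If you do want to try the TensorRT-LLM route, its conversion tooling works from a regular Hugging Face checkpoint rather than a GGUF file, so you'd start from the original repo instead of these quants. A rough sketch of that starting point (the repo id is the original model; everything else is illustrative):

```python
# Starting point for a TensorRT-LLM build: pull the original HF checkpoint,
# not the GGUF quants. Requires transformers (and accelerate for device_map).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "saishf/Fimbulvetr-Kuro-Lotus-10.7B"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto", device_map="auto")

# From here, TensorRT-LLM's example scripts for LLaMA-style models
# (checkpoint conversion + engine build) would take over; the exact
# scripts and flags vary by TensorRT-LLM release, so check its docs.
```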

Gotcha.. but this one should still be supported by TensorRT-LLM, right?

https://huggingface.co./saishf/Fimbulvetr-Kuro-Lotus-10.7B
