regarding the accuracy

Opened by rakesh2222:

How did you measure the accuracy (metric)?

It's the perplexity of a given quant, as described here: https://huggingface.co./docs/transformers/perplexity
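
To make the metric concrete, here is a minimal sketch of computing perplexity with transformers along the lines of that doc; the model and text are placeholders, and the actual numbers below come from llama.cpp rather than this script:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model and text; swap in the model/dataset you actually want to score.
model_id = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_id)

text = "The quick brown fox jumps over the lazy dog. " * 200
input_ids = tokenizer(text, return_tensors="pt").input_ids

window = model.config.n_positions  # model context length (1024 for GPT-2)
total_nll = 0.0
total_tokens = 0

# Score the text in non-overlapping windows; each window contributes its
# summed negative log-likelihood over the tokens it predicts.
with torch.no_grad():
    for begin in range(0, input_ids.size(1), window):
        chunk = input_ids[:, begin:begin + window]
        if chunk.size(1) < 2:
            break  # nothing left to predict
        out = model(chunk, labels=chunk)
        n_predicted = chunk.size(1) - 1  # loss is averaged over these tokens
        total_nll += out.loss.item() * n_predicted
        total_tokens += n_predicted

ppl = torch.exp(torch.tensor(total_nll / total_tokens))
print(f"perplexity: {ppl.item():.2f}")
```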

I used llama-perplexity from here: https://github.com/ggml-org/llama.cpp/tree/master/examples/perplexity
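
A typical invocation looks roughly like this (a sketch only: the paths are placeholders, and the `-m`/`-f` flags for the GGUF model and the evaluation text file should be checked against your llama.cpp build):

```python
import subprocess

# Sketch of driving llama-perplexity from Python; paths are placeholders.
# -m points at the GGUF model, -f at the plain-text file to evaluate.
cmd = [
    "./llama-perplexity",
    "-m", "model-Q4_K_M.gguf",
    "-f", "calibration_data.txt",
]
result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
print(result.stderr)  # the final PPL estimate appears in the tool's output
```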

The calibration data is https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8
Ideally it should differ from the dataset used for the imatrix computation, but I haven't changed it yet. Since all measurements use the same imatrix and calibration data, they can still be compared to each other fairly. Just keep in mind that accuracy in % is far from real-world performance; it is only an indicator, and I generally recommend quants in the 98.5-99% accuracy range.
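
One common way to turn perplexities into such a percentage is the ratio of the full-precision model's perplexity to the quant's perplexity; treat the formula below as an illustrative assumption rather than the exact definition used here, and the numbers are made up:

```python
def quant_accuracy(ppl_full: float, ppl_quant: float) -> float:
    """Assumed formula: accuracy (%) = 100 * PPL(full precision) / PPL(quant).
    A quant whose perplexity stays close to the baseline scores close to 100%."""
    return 100.0 * ppl_full / ppl_quant

# Made-up perplexities for illustration only, not measured results.
print(f"{quant_accuracy(6.20, 6.28):.2f}%")  # ~98.7%, inside the 98.5-99% band
```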
