regarding the accuracy
#1 by rakesh2222 - opened
How did you measure the accuracy (metric)?
It's the perplexity measure of a given quant: https://huggingface.co/docs/transformers/perplexity
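For readers unfamiliar with the metric, here is a minimal sketch of what perplexity means: the exponential of the average negative log-likelihood per token. The function name and the sample log-probabilities are made up for illustration; this is not the code llama-perplexity runs, only the textbook definition.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) over the given tokens.

    token_logprobs: natural-log probabilities the model assigned to each
    actual next token in the evaluation text.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical values: the model gave these probabilities to four tokens.
example = [math.log(0.5), math.log(0.25), math.log(0.5), math.log(0.25)]
print(perplexity(example))
```

Lower is better: a model that assigns probability 1 to every token would score exactly 1.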
I used llama-perplexity from here: https://github.com/ggml-org/llama.cpp/tree/master/examples/perplexity
The calibration data is https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8
Ideally it would differ from the data used for the imatrix computations, but I haven't changed it yet. Since all measurements use the same imatrix and calibration data, they can still be legitimately compared to each other. Just keep in mind that accuracy in % is far from real-world performance. It is simply an indicator, and I like to recommend quants with 98.5-99% accuracy.
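The thread does not spell out how the percentage is derived from perplexity. One plausible reading, shown here purely as an assumption, is the ratio of the unquantized model's perplexity to the quant's perplexity. The function name and the numbers below are hypothetical, not measurements from this repo.

```python
def relative_accuracy_pct(baseline_ppl, quant_ppl):
    """Assumed formula: 100 * baseline perplexity / quant perplexity.

    Both values must come from the same evaluation text so that they
    are comparable. This is a guess at the metric, not the author's
    confirmed method.
    """
    if baseline_ppl <= 0 or quant_ppl <= 0:
        raise ValueError("perplexities must be positive")
    return 100.0 * baseline_ppl / quant_ppl


# Hypothetical perplexities for a full-precision model and one quant.
baseline = 6.00
quant = 6.06
print(f"{relative_accuracy_pct(baseline, quant):.2f}%")
```

Under this reading, a quant whose perplexity is about 1% higher than the baseline lands near 99%, which is the range recommended above.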