ExLlamaV2 quant (exl2, 8.0 bpw), made with ExLlamaV2 v0.1.3

Other EXL2 quants:

| Quant (bpw) | Model Size | lm_head (bits) |
|-------------|------------|----------------|
| 2.2         | 18625 MB   | 6              |
| 2.5         | 20645 MB   | 6              |
| 3.0         | 24211 MB   | 6              |
| 3.5         | 27784 MB   | 6              |
| 3.75        | 29572 MB   | 6              |
| 4.0         | 31359 MB   | 6              |
| 4.25        | 33139 MB   | 6              |
| 5.0         | 38500 MB   | 6              |
| 6.0         | 45805 MB   | 8              |
| 6.5         | 49410 MB   | 8              |
| 8.0         | 54655 MB   | 8              |