Quantization parameters

#1
by rastegar - opened

Hi, can you share your quantization parameters? I have a finetuned model that I'm trying to quantize with AWQ and EXL2, and I need the best-performing config for quantizing my 32B model.
Thanks in advance.

Also, could you please share the minimum hardware requirements for quantizing this 32B model into AWQ?

TBH, this model was made a long time ago. As far as I can recall, my parameters are listed below:
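For reference, a minimal AutoAWQ quantization sketch is shown below. The values are AutoAWQ's commonly used defaults, not necessarily the exact parameters used for this model, so treat them as an assumed starting point:

```python
# Hypothetical sketch of a typical AutoAWQ setup — these are AutoAWQ's
# common defaults, NOT confirmed to be the exact parameters used here.
quant_config = {
    "zero_point": True,   # asymmetric quantization (per-group zero points)
    "q_group_size": 128,  # weights are quantized in groups of 128
    "w_bit": 4,           # 4-bit weights
    "version": "GEMM",    # GEMM kernels; "GEMV" can be faster at batch size 1
}

# Outline of the quantization calls (requires the `autoawq` package and the
# full-precision model on disk; paths here are placeholders):
# from awq import AutoAWQForCausalLM
# from transformers import AutoTokenizer
#
# model = AutoAWQForCausalLM.from_pretrained("path/to/your-32b-model")
# tokenizer = AutoTokenizer.from_pretrained("path/to/your-32b-model")
# model.quantize(tokenizer, quant_config=quant_config)
# model.save_quantized("path/to/output-awq")
```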

Hardware requirements:

AutoAWQ first loads the model into system memory and then quantizes it layer by layer in VRAM, which means the whole model (fp16/bf16 weights) must fit into your RAM. You will probably need 64 GB+ of system memory, given that the original model occupies approximately 64 GB and your system itself consumes some as well. As for VRAM, I didn't pay much attention to it; at least 16 GB, I'd guess.
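The "approximately 64 GB" figure follows directly from the parameter count; a quick back-of-the-envelope check (assuming bf16 weights and a nominal 32B parameters):

```python
# Memory estimate for quantizing a 32B-parameter model.
params = 32e9

# Full-precision copy held in RAM: 2 bytes per bf16/fp16 weight.
fp16_gib = params * 2 / 1024**3   # ~59.6 GiB, i.e. ~64 GB in decimal units

# Rough size of the 4-bit AWQ output: ~0.5 byte per weight,
# plus a little extra for per-group scales and zero points.
awq_gib = params * 0.5 / 1024**3  # ~15 GiB before overhead

print(f"fp16 weights: {fp16_gib:.1f} GiB, 4-bit AWQ: ~{awq_gib:.1f} GiB")
```

This is why the whole quantization job needs 64 GB+ of RAM even though the finished AWQ checkpoint is only around a quarter of that size.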
