radm committed on
Commit 53a8a43 · verified · 1 Parent(s): 414547b

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -18,6 +18,7 @@ base_model:
 
  - AWQ 4bit version of [Nexusflow/Athene-V2-Chat](https://huggingface.co/Nexusflow/Athene-V2-Chat)
  - [Quantization code](https://docs.vllm.ai/en/latest/quantization/auto_awq.html)
+ - This model [only fits to 1 gpu](https://huggingface.co/radm/Athene-V2-Chat-AWQ/discussions/2). Use [kosbu/Athene-V2-Chat-AWQ](kosbu/Athene-V2-Chat-AWQ) for multi-gpu support
 
  ## Eval AWQ version
 