Update README.md
README.md
CHANGED
@@ -18,6 +18,7 @@ base_model:
 
 - AWQ 4bit version of [Nexusflow/Athene-V2-Chat](https://huggingface.co/Nexusflow/Athene-V2-Chat)
 - [Quantization code](https://docs.vllm.ai/en/latest/quantization/auto_awq.html)
+- This model [only fits to 1 gpu](https://huggingface.co/radm/Athene-V2-Chat-AWQ/discussions/2). Use [kosbu/Athene-V2-Chat-AWQ](kosbu/Athene-V2-Chat-AWQ) for multi-gpu support
 
 ## Eval AWQ version
 