Commit 4550fe1 (verified), committed by v2ray
1 Parent(s): 8db1ef8

Create README.md

Files changed (1)
  1. README.md +16 -0
README.md ADDED
@@ -0,0 +1,16 @@
+ ---
+ license: mit
+ language:
+ - en
+ - zh
+ base_model:
+ - deepseek-ai/DeepSeek-V3
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+ # DeepSeek V3 AWQ
+ AWQ quantization of the DeepSeek V3 chat model.
+
+ This quant includes modifications to the model code that fix an overflow issue when running in float16.
+
+ Tested on vLLM with 8x H100: inference speed is 5 tokens/s with batch size 1 and short prompts.
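
Note on the float16 overflow mentioned in the README: the commit does not show which model code was changed, so the following is only a minimal sketch of the underlying limit, not the author's fix. It uses Python's standard `struct` module, whose `"e"` format code packs IEEE 754 half-precision values; the helper names `to_fp16` and `fits_in_fp16` are hypothetical and exist only for this illustration.

```python
import struct

# Largest finite IEEE 754 half-precision (float16) value.
FP16_MAX = 65504.0


def to_fp16(x: float) -> float:
    """Round-trip x through float16 storage.

    Raises OverflowError if x is finite but too large for float16.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fits_in_fp16(x: float) -> bool:
    """Return True if x can be stored as a finite float16 value."""
    try:
        to_fp16(x)
        return True
    except OverflowError:
        return False


# The largest finite value survives the round trip unchanged.
print(to_fp16(FP16_MAX))          # 65504.0

# A product of two moderately large values already exceeds the range
# (300 * 300 = 90000), which is the kind of intermediate result that
# overflows when a model runs in float16.
print(fits_in_fp16(300.0 * 300.0))  # False
```

Any intermediate value above 65504 cannot be represented in float16, which is why running a model trained in a wider format can require changes to the model code.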