Serving with TGI or vLLM?
1
#3 opened about 1 year ago by kno10
only use one gpu?
2
#2 opened about 1 year ago by jgbrblmd
persist dequantized model
1
#1 opened about 1 year ago by nudelbrot