Serving with TGI or vLLM?

1
#3 opened about 1 year ago by kno10

only use one gpu?

2
#2 opened about 1 year ago by jgbrblmd

persist dequantized model

1
#1 opened about 1 year ago by nudelbrot