
Streaming

#2
by Oz-SIT - opened

Can this model be used in a streaming mode? Thanks in advance.

Hello. Could you elaborate on what you mean by a streaming mode?

Thank you for your response. By 'a streaming mode' I mean that the model sends its response to the client token by token as it is generated, rather than waiting until the whole output is ready. ChatGPT offers this via the option `stream=True`, as shown in https://til.simonwillison.net/gpt3/python-chatgpt-streaming-api.

Thank you for your help.

I think this is a matter of the frontend UI rather than of the model itself.

Though I personally haven't implemented a streaming UI, Hugging Face has a streaming feature under development in Transformers:
https://huggingface.co./docs/transformers/main/en/generation_strategies#streaming
Hope it helps.
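The linked page documents a `TextStreamer` that `generate()` pushes tokens to as they are produced. A minimal sketch, assuming a recent `transformers` version and using this repo's model id (`rinna/japanese-gpt-neox-3.6b` here is an assumption based on context):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

def stream_generate(prompt: str,
                    model_name: str = "rinna/japanese-gpt-neox-3.6b") -> None:
    """Generate text, printing each token to stdout as soon as it is decoded."""
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    # TextStreamer prints decoded tokens incrementally; skip_prompt=True
    # suppresses echoing the input prompt itself.
    streamer = TextStreamer(tokenizer, skip_prompt=True)
    inputs = tokenizer(prompt, return_tensors="pt")
    model.generate(**inputs, streamer=streamer, max_new_tokens=64)

# stream_generate("こんにちは")  # downloads the model, then streams the reply
```

For serving over HTTP rather than printing to a terminal, the same page documents `TextIteratorStreamer`, which yields the decoded pieces from a Python iterator instead.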

Thank you very much, Tianyuz! I will read that page and implement a streaming function for Rinna.
Thanks again,

Oz

tianyuz changed discussion status to closed
