Limited (truncated) response with inference API

#23
by RobertTaylor - opened

I am getting a limited (truncated) output when using the Inference API. The same thing happens in the on-page example widget. Is HF's rate limiting per-token?

Mistral AI_ org


Hi there, did you set max_new_tokens?

Gosh, thanks for that. Sorry, I'm an idiot.
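
For anyone who lands here with the same problem: below is a minimal sketch of passing max_new_tokens through huggingface_hub's InferenceClient. The model ID and prompt are placeholders, not taken from this thread.

```python
from huggingface_hub import InferenceClient

# Placeholder model ID -- substitute the model this discussion belongs to.
client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.1")

# If max_new_tokens is omitted, the API falls back to a small default,
# which is why responses can look truncated.
output = client.text_generation(
    "Explain what max_new_tokens controls.",
    max_new_tokens=512,  # raise this to allow longer completions
)
print(output)
```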

What are the other parameters?
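
For reference, a sketch of the other common text-generation parameters the Inference API accepts alongside max_new_tokens; the values below are illustrative, not recommended defaults, and exact defaults vary by model deployment.

```python
from huggingface_hub import InferenceClient

client = InferenceClient(model="mistralai/Mistral-7B-Instruct-v0.1")  # placeholder model ID

output = client.text_generation(
    "Write a short note about rate limits.",
    max_new_tokens=256,       # upper bound on generated tokens
    do_sample=True,           # sample instead of greedy decoding
    temperature=0.7,          # sampling randomness; lower is more deterministic
    top_p=0.95,               # nucleus sampling cutoff
    top_k=50,                 # sample only from the k most likely tokens
    repetition_penalty=1.1,   # discourage repeating earlier tokens
    stop_sequences=["\n\n"],  # stop generation when one of these appears
)
print(output)
```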

