After training LlamaGuard-7b inference is slower
#23
by
jamesoneill12
- opened
Hi there,
I am having trouble matching the original model's inference latency after fine-tuning it in bfloat16 (and I assume other dtypes would be affected too).
Does LlamaGuard-7b use any tricks to make inference faster that could potentially be lost after uploading a fine-tuned version of it to the Hub?
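For context, my current suspicion is the usual dtype pitfall: a checkpoint saved in bf16 can silently come back as fp32 on reload, which roughly doubles memory traffic per matmul and slows generation. A minimal sketch of what I mean (the tensors here are placeholders, not the actual checkpoint):

```python
import torch

# A stand-in for a state dict saved in bfloat16 after fine-tuning.
state_dict = {"weight": torch.randn(8, 8, dtype=torch.bfloat16)}

# Simulate reloading WITHOUT an explicit dtype: weights get upcast to
# fp32, so each parameter takes twice the bytes of its bf16 original.
loaded_fp32 = {k: v.to(torch.float32) for k, v in state_dict.items()}
print(loaded_fp32["weight"].dtype)   # torch.float32

# Keeping bf16 on reload avoids the upcast.
loaded_bf16 = {k: v.clone() for k, v in state_dict.items()}
print(loaded_bf16["weight"].dtype)   # torch.bfloat16
```

In `transformers` terms, I believe the equivalent fix would be passing `torch_dtype=torch.bfloat16` to `from_pretrained` and then sanity-checking `next(model.parameters()).dtype`, but I may be missing something specific to this model.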
Thanks,
James