Model inference speed (#2)
opened by halsayed
@halsayed Thanks for using Jais. You may get better inference speed with 2 x A100 80GB GPUs: the model has roughly 30B parameters, so its fp32 weights take about 30 × 4 ≈ 120 GB, and all layers fit across two such GPUs.
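A minimal sketch of loading the model across both GPUs with `device_map="auto"` in `transformers`, which shards the layers automatically. The checkpoint id below is an assumption for illustration; use the exact id from the model card you are working with.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inceptionai/jais-30b-chat-v1"  # hypothetical id, check the model card

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # the ~120 GB figure above assumes fp32 weights
    device_map="auto",          # shard layers across the 2 x A100 80GB
    trust_remote_code=True,     # Jais ships custom modeling code
)

prompt = "What is the capital of the UAE?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading in `torch.bfloat16` instead of fp32 would already halve the footprint to roughly 60 GB, which is why dtype is worth checking before reaching for quantization.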
@samta-kamboj Thanks, adding a second GPU solved the problem. Has there been any attempt to quantize the model to reduce its VRAM footprint?
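The thread does not confirm an official quantized release, but as an illustration, one common route is 4-bit loading via `bitsandbytes` through the standard `transformers` config. The model id is the same assumption as above, and whether the custom Jais modeling code supports this path is itself an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "inceptionai/jais-30b-chat-v1"  # hypothetical id, check the model card

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # ~0.5 byte/param: ~15 GB of weights
    bnb_4bit_quant_type="nf4",              # NormalFloat4, a common default
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for speed/stability
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```

At ~15 GB of 4-bit weights the model would fit on a single A100 80GB with room for activations and KV cache, at some cost in output quality that would need to be evaluated.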