Looks like someone else submitted sambanovasystems/SambaLingo-Arabic-Base with wrong precision

#2
by zolicsaki - opened

Pictured below is sambanovasystems/SambaLingo-Arabic-Base submitted with FP32 precision. I re-submitted it with the correct bf16 precision, and now both submissions are in the queue.

image.png

zolicsaki changed discussion title from Looks like someone else submitted sambanovasystems/SambaLingo-Arabic-Chat with wrong precision to Looks like someone else submitted sambanovasystems/SambaLingo-Arabic-Base with wrong precision
Open Arabic LLM Leaderboard org

@zolicsaki
Thank you for keeping an eye on the leaderboard 🤗
I see you are a member of SambaNovaSystems, glad to have you here. As for SambaLingo-Arabic-Base, I believe the correct precision is indeed float32; I simply checked the config here. So I will remove the new submission made with bf16 precision from the requests. That said, I saw that you merged my (automated) PR for safetensors. Did that PR change the 70B version from float32 to bf16? It now shows as bf16, but I remember it being float32. In any case, please feel free to add these models to the queue with the correct precision and I'll make sure to delete the wrong one 🤗
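
For anyone who wants to repeat the config check mentioned above, here is a minimal sketch. It assumes the model's `config.json` declares its precision in a `torch_dtype` field, which is the usual convention for Transformers checkpoints but is not confirmed in this thread. The helper name `declared_precision` and the sample JSON are made up for illustration.

```python
import json


def declared_precision(config_text: str) -> str:
    """Return the dtype a config.json declares, or "unknown" if it is absent."""
    config = json.loads(config_text)
    # Assumption: the precision is stored under the "torch_dtype" key.
    return config.get("torch_dtype") or "unknown"


# Illustrative config fragment, not the real SambaLingo config.
sample = '{"model_type": "llama", "torch_dtype": "float32"}'
print(declared_precision(sample))
```

Note that the declared dtype can differ from the dtype of the weights actually stored in the repository, so a weight-format change (such as a safetensors conversion) is worth checking separately.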

@Ali-C137 Thank you so much - all the models in the queue look correct now

@Ali-C137 Hey, I just checked back in and it looks like the queue has completed, but the SambaLingo models' evaluation results are not there. Any idea why? Thank you so much!

Also, just curious: are chat templates applied to chat models when running the evaluation?

Open Arabic LLM Leaderboard org

Dear @zolicsaki, unfortunately about 50 models failed to be evaluated. We are investigating and will fix it on our side if we can; otherwise we will contact the model authors with insights into anything that needs to be fixed on their side.

@Ali-C137 Thank you! I am the author of these SambaLingo models - please let me know if you need anything

Open Arabic LLM Leaderboard org

@zolicsaki SambaLingo-Arabic-Chat is on the leaderboard 🔥
The base model is still under maintenance and will join the queue soon 🤗

@Ali-C137 Thank you so much! Are the 70B parameter versions also going to make it on there?

Open Arabic LLM Leaderboard org

@zolicsaki We are trying to make every model land on the leaderboard. I will personally contact you if we hit an issue with one of your models that we can't resolve.

Hi @Ali-C137 any updates on SambaLingo 70B?

image.png

Open Arabic LLM Leaderboard org

Dear @zolicsaki

You can always check the status here: https://huggingface.co./datasets/OALL/requests/blob/main/sambanovasystems/SambaLingo-Arabic-Base-70B_eval_request_False_float32_Original.json
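
The request file name above appears to encode the submission details, including the precision. A minimal sketch of pulling them apart, assuming the pattern `<model>_eval_request_<flag>_<precision>_<weight type>.json` inferred from this one file name. The field names `flag` and `weight_type` are guesses, and `parse_request_filename` is a hypothetical helper, not part of the leaderboard code.

```python
from typing import NamedTuple


class EvalRequest(NamedTuple):
    model: str
    flag: str         # meaning not stated in the thread ("False" in the example)
    precision: str
    weight_type: str  # assumed label for the last field ("Original" in the example)


def parse_request_filename(filename: str) -> EvalRequest:
    """Split a request file name into its underscore-separated fields."""
    stem = filename[:-5] if filename.endswith(".json") else filename
    # Everything before the marker is the model name (which may contain hyphens).
    model, _, rest = stem.partition("_eval_request_")
    parts = rest.split("_")
    if not model or len(parts) != 3:
        raise ValueError(f"unexpected request file name: {filename!r}")
    return EvalRequest(model, *parts)


name = "SambaLingo-Arabic-Base-70B_eval_request_False_float32_Original.json"
request = parse_request_filename(name)
print(request.model, request.precision)
```

This makes it easy to spot two requests for the same model that differ only in precision, which is the situation this discussion started with.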

It is running, and we expect it to land by tomorrow since bigger models have succeeded in the last couple of days. We can't guarantee anything yet, though, since we have encountered some weird errors with other Llama 2-based models.

Open Arabic LLM Leaderboard org

Hi dear @zolicsaki
Apparently the 70B models with float32 precision require far more time than allowed! We will therefore need you to provide a float16 or bfloat16 version of the model so that we can evaluate it in time. We could cast it ourselves, but we are afraid this might confuse users of your model, so it would be better if you provided a half-precision version.
Please let us know what works better for you and we would be happy to help.
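
To give a rough sense of what half precision changes, here is a back-of-the-envelope sketch of the weight size at each precision. It assumes a nominal 70 billion parameters and counts only the raw weights (4 bytes per float32 value, 2 bytes per float16 or bfloat16 value). It says nothing about evaluation time directly, and `weight_size_gb` is a hypothetical helper.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_size_gb(num_params: float, precision: str) -> float:
    """Approximate size of the raw weights in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Assumption: "70B" is treated as exactly 70e9 parameters.
for precision in ("float32", "bfloat16"):
    print(precision, weight_size_gb(70e9, precision), "GB")
```

Under these assumptions the float32 weights come to about 280 GB versus about 140 GB in half precision, so the float32 version has twice as much data to load and move around.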

alielfilali01 changed discussion status to closed
