Failed model evaluation
My first attempt at trying to evaluate a merged model:
https://huggingface.co./datasets/open-llm-leaderboard/requests/blob/main/abacusai/Slerp-CM-mist-dpo_eval_request_False_float16_Original.json
I did run the full suite locally and there were no errors.
I am wondering if I did something wrong in the submission for a merged model. For example, I realized the merge tag was not set, which I have now added. Perhaps there are other issues?
Any info would be appreciated. Thank you.
Hi!
Thanks for the complete issue! :)
Your model was actually evaluated properly; the results are here. It seems you are the second person to encounter this issue, where a run succeeds but the model is marked as failed. We'll investigate ASAP, and in the meantime I fixed your request file manually, so your model's results will be uploaded to the leaderboard the next time it rebuilds.
I'm going to close this issue for now; feel free to reopen it if needed.