Please submit this model to the Open LLM Leaderboard

by grimjim - opened 5 days ago

The leaderboard is located here.
https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/

I merged two o1 models, including yours, and the merge reached an unusually high MATH benchmark score of 33.99%.
I posit that your model may be highly capable in mathematical reasoning despite its focus on medical reasoning.

T145

5 days ago

There's an issue with submitting it to the leaderboard; see here: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard/discussions/1055

My own merge tests with this model showed very high BBH and MMLU-PRO scores, so I'd expect Skywork to have hidden math performance.

T145

1 day ago

@grimjim The issue should be fixed on their end! Now we just need to upvote the model: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/vote

If some votes could be thrown at my models too, I'd appreciate it! 🌴
