Please submit this model to the Open LLM Leaderboard
The leaderboard is located here.
https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/
I performed a merge of two o1 models, including yours, and hit an unusually high MATH benchmark score of 33.99%.
I posit that your model may be highly capable in mathematical reasoning despite the focus being on medical reasoning.
There's an issue with submitting it to the leaderboard, see here: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard/discussions/1055
Personal merge tests with this model showed very high BBH and MMLU-PRO benchmarks, so I'd expect Skywork has hidden math performance.
@grimjim The issue should be fixed on their end! Now we just need to upvote the model: https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard#/vote
If some votes could be thrown at my models too, I'd appreciate it!