Spaces: openlifescienceai/open_medical_llm_leaderboard
aaditya committed on Jan 29

Commit 60b3fb3 (verified) · 1 Parent(s): cd5e43b

Update src/about.py

Files changed (1)

src/about.py CHANGED

@@ -58,6 +58,7 @@ The backend of the Open Medical LLM Leaderboard uses the Eleuther AI Language Mo
 The <a href="https://arxiv.org/abs/2303.13375">GPT-4</a>, and <a href="https://arxiv.org/abs/2305.09617">Med-PaLM-2</a> results are taken from their official papers. Since Med-PaLM doesn't provide zero-shot accuracy, we are using 5-shot accuracy from their paper for comparison. All results presented are in the zero-shot setting, except for Med-PaLM-2 which use 5-shot accuracy. Gemini results are taken from recent Clinical-NLP <a href="https://arxiv.org/abs/2402.07023">(NAACL 24) Paper</a>
+Model Availability Requirement: To maintain the integrity of the leaderboard, only models that are actively accessible will be included. Submissions must be available either via an API or a public Hugging Face repository to allow validation of the reported results. If a model's repository is empty or its API is inaccessible, the submission will be removed from the leaderboard, as the primary goal is to ensure that models listed here remain accessible for evaluation and comparison.
 """
 LLM_BENCHMARKS_TEXT = f"""
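
The availability rule added in this commit (a submission needs a non-empty public repository or a reachable API) could be checked along the following lines. This is only a minimal sketch, not the leaderboard's actual validation code: the function names, the `PLACEHOLDER_FILES` set, and the choice to treat any error as "inaccessible" are assumptions made for illustration.

```python
from typing import Callable, Iterable, Optional

# Assumption for this sketch: a repo holding only these files counts as empty.
PLACEHOLDER_FILES = {".gitattributes", "README.md"}


def repo_has_content(files: Iterable[str]) -> bool:
    """Return True if the file listing contains anything beyond placeholders."""
    return any(name not in PLACEHOLDER_FILES for name in files)


def is_submission_accessible(
    list_files: Optional[Callable[[], Iterable[str]]] = None,
    api_reachable: Optional[Callable[[], bool]] = None,
) -> bool:
    """Pass if the repo is listable and non-empty, or the API probe succeeds.

    Both arguments are hypothetical callables supplied by the caller, so the
    check itself stays independent of any particular client library.
    """
    if list_files is not None:
        try:
            if repo_has_content(list_files()):
                return True
        except Exception:
            # Private, deleted, or unreachable repo: treat as inaccessible.
            pass
    if api_reachable is not None:
        try:
            if api_reachable():
                return True
        except Exception:
            # Timeouts and auth failures also count as inaccessible.
            pass
    return False
```

With the `huggingface_hub` client installed, `list_files` could be something like `lambda: HfApi().list_repo_files("some-org/some-model")`, where the repository id is a placeholder. The probe passed as `api_reachable` depends entirely on the API in question and is left to the caller.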