leaderboard-pr-bot committed on
Commit
c821bc3
1 Parent(s): c1b5ce7

Adding Evaluation Results


This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +106 -0
README.md CHANGED
@@ -118,6 +118,98 @@ model-index:
     source:
       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-4x7b-v5
       name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: IFEval (0-Shot)
+      type: HuggingFaceH4/ifeval
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: inst_level_strict_acc and prompt_level_strict_acc
+      value: 41.99
+      name: strict accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-4x7b-v5
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BBH (3-Shot)
+      type: BBH
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc_norm
+      value: 32.83
+      name: normalized accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-4x7b-v5
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MATH Lvl 5 (4-Shot)
+      type: hendrycks/competition_math
+      args:
+        num_few_shot: 4
+    metrics:
+    - type: exact_match
+      value: 7.1
+      name: exact match
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-4x7b-v5
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: GPQA (0-shot)
+      type: Idavidrein/gpqa
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 4.59
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-4x7b-v5
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MuSR (0-shot)
+      type: TAUR-Lab/MuSR
+      args:
+        num_few_shot: 0
+    metrics:
+    - type: acc_norm
+      value: 12.34
+      name: acc_norm
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-4x7b-v5
+      name: Open LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: MMLU-PRO (5-shot)
+      type: TIGER-Lab/MMLU-Pro
+      config: main
+      split: test
+      args:
+        num_few_shot: 5
+    metrics:
+    - type: acc
+      value: 23.31
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=jsfs11/MixtureofMerges-MoE-4x7b-v5
+      name: Open LLM Leaderboard
 ---
 
 # MixtureofMerges-MoE-4x7b-v5
@@ -235,3 +327,17 @@ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-le
 |Winogrande (5-shot) |85.08|
 |GSM8k (5-shot) |69.75|
 
+
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_jsfs11__MixtureofMerges-MoE-4x7b-v5)
+
+|      Metric       |Value|
+|-------------------|----:|
+|Avg.               |20.36|
+|IFEval (0-Shot)    |41.99|
+|BBH (3-Shot)       |32.83|
+|MATH Lvl 5 (4-Shot)| 7.10|
+|GPQA (0-shot)      | 4.59|
+|MuSR (0-shot)      |12.34|
+|MMLU-PRO (5-shot)  |23.31|
+
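
The `Avg.` row in the added table appears to be the unweighted mean of the six benchmark scores. That is an assumption about how the leaderboard aggregates, not something this PR states. A minimal Python sketch to check it against the values copied from the table (the names `scores` and `mean_score` are illustrative only):

```python
# Scores copied from the results table added by this PR.
scores = {
    "IFEval (0-Shot)": 41.99,
    "BBH (3-Shot)": 32.83,
    "MATH Lvl 5 (4-Shot)": 7.10,
    "GPQA (0-shot)": 4.59,
    "MuSR (0-shot)": 12.34,
    "MMLU-PRO (5-shot)": 23.31,
}


def mean_score(values):
    """Return the unweighted arithmetic mean, rounded to two decimals.

    Assumption: the leaderboard's "Avg." is a plain mean over all six tasks.
    """
    values = list(values)
    return round(sum(values) / len(values), 2)


average = mean_score(scores.values())
print(f"Avg. {average:.2f}")
```

If the printed value differs from the `Avg.` row, either a score was transcribed incorrectly or the aggregation is not a plain mean.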