Update README.md
README.md
CHANGED
@@ -123,6 +123,8 @@ Thanks [Argilla](https://huggingface.co/argilla) for providing the dataset and t
 
 ## 🏆 Evaluation
 
+### Nous
+
 The evaluation was performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoeval) on Nous suite.
 
 | Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
@@ -136,6 +138,20 @@ The evaluation was performed using [LLM AutoEval](https://github.com/mlabonne/ll
 
 You can find the complete benchmark on [YALL - Yet Another LLM Leaderboard](https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard).
 
+# [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_mlabonne__NeuralDaredevil-7B)
+
+| Metric                          |Value|
+|---------------------------------|----:|
+|Avg.                             |74.12|
+|AI2 Reasoning Challenge (25-Shot)|69.88|
+|HellaSwag (10-Shot)              |87.62|
+|MMLU (5-Shot)                    |65.12|
+|TruthfulQA (0-shot)              |66.85|
+|Winogrande (5-shot)              |82.08|
+|GSM8k (5-shot)                   |73.16|
+
 ## 💻 Usage
 
 ```python
@@ -166,16 +182,3 @@ print(outputs[0]["generated_text"])
 <img src="https://raw.githubusercontent.com/argilla-io/distilabel/main/docs/assets/distilabel-badge-light.png" alt="Built with Distilabel" width="200" height="32"/>
 </a>
 </p>
-# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
-Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_mlabonne__NeuralDaredevil-7B)
-
-| Metric                          |Value|
-|---------------------------------|----:|
-|Avg.                             |74.12|
-|AI2 Reasoning Challenge (25-Shot)|69.88|
-|HellaSwag (10-Shot)              |87.62|
-|MMLU (5-Shot)                    |65.12|
-|TruthfulQA (0-shot)              |66.85|
-|Winogrande (5-shot)              |82.08|
-|GSM8k (5-shot)                   |73.16|
-
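Note: the moved leaderboard table links to a per-model details dataset. A minimal sketch for querying it with the 🤗 `datasets` library; the config name `harness_winogrande_5` and the `latest` split follow the usual naming convention of `open-llm-leaderboard` details repos and are assumptions, not taken from this card:

```python
# Sketch: pull one benchmark's detailed results from the linked dataset.
# Config name and split are assumed from the open-llm-leaderboard convention.
from datasets import load_dataset

details = load_dataset(
    "open-llm-leaderboard/details_mlabonne__NeuralDaredevil-7B",
    "harness_winogrande_5",  # assumed config: Winogrande, 5-shot
    split="latest",
)
print(details)
```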
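Note: the `## 💻 Usage` code block itself is truncated by the diff; only `print(outputs[0]["generated_text"])` survives as hunk context. A hedged sketch of a standard `transformers` text-generation pipeline consistent with that closing line; the prompt contents, chat templating, and sampling parameters are assumptions:

```python
# Sketch of the truncated usage block. Only the final print() is confirmed
# by the hunk context above; everything else is an assumed standard setup.
import torch
from transformers import AutoTokenizer, pipeline

model_id = "mlabonne/NeuralDaredevil-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a chat-formatted prompt with the model's own chat template.
messages = [{"role": "user", "content": "What is a large language model?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
outputs = generator(
    prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95
)
print(outputs[0]["generated_text"])
```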