FPHam committed on
Commit
fae6b3c
1 Parent(s): 8bdf7c4

Update README.md

  1. README.md +14 -1
README.md CHANGED
@@ -136,4 +136,17 @@ After probably 10 different versions with subsequent changes, I can now say that
 
  The goal was to create a model that wouldn't change the style of the text. Often, LLM models, when asked to edit text, will attempt to rewrite the text even if the text is already fine. This proved to be quite challenging for such a small model where the main task was to determine the right balance between fixing the text (and not changing its style) and copying it verbatim.
 
- The strict model assumes that you're already a good writer that doesn't need hand-holding and that every word you've written you've meant.
+ The strict model assumes that you're already a good writer that doesn't need hand-holding and that every word you've written you've meant.
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_FPHam__Karen_TheEditor_V2_STRICT_Mistral_7B)
+
+ | Metric                          |Value|
+ |---------------------------------|----:|
+ |Avg.                             |59.13|
+ |AI2 Reasoning Challenge (25-Shot)|59.56|
+ |HellaSwag (10-Shot)              |81.79|
+ |MMLU (5-Shot)                    |59.56|
+ |TruthfulQA (0-shot)              |49.36|
+ |Winogrande (5-shot)              |74.35|
+ |GSM8k (5-shot)                   |30.17|
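
As a quick sanity check on the added table, the `Avg.` row appears to be the plain arithmetic mean of the six benchmark scores. The sketch below recomputes it under that assumption; the `scores` dictionary and the `mean_score` helper are hypothetical names introduced here for illustration and are not part of the README or of any leaderboard tooling.

```python
# Benchmark scores copied from the table added in this commit.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 59.56,
    "HellaSwag (10-Shot)": 81.79,
    "MMLU (5-Shot)": 59.56,
    "TruthfulQA (0-shot)": 49.36,
    "Winogrande (5-shot)": 74.35,
    "GSM8k (5-shot)": 30.17,
}


def mean_score(values: dict[str, float]) -> float:
    """Return the unweighted mean of the scores, rounded to two decimals.

    Assumption: the reported average is an unweighted mean over all rows.
    """
    return round(sum(values.values()) / len(values), 2)


average = mean_score(scores)
print(average)  # 59.13, matching the Avg. row
```

The recomputed value agrees with the reported 59.13, so the table is internally consistent under the unweighted-mean assumption.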