Committed by leaderboard-pr-bot
Commit 5ea0a40
1 Parent(s): d78d709

Adding Evaluation Results

This is an automated PR created with https://huggingface.co./spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co./spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
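
Each result this PR adds to the README front matter follows the `model-index` layout visible in the diff: a `task`, a `dataset` with its few-shot setting, a `metrics` list, and a `source`. As a minimal sketch of that layout, the Python below assembles one such entry as a plain dictionary. The helper name `make_result` and its parameters are hypothetical and are not taken from the bot's own code; the field names and values are copied from the diff.

```python
from typing import Any, Optional

# Source URL as it appears in the diff for every result entry.
LEADERBOARD_URL = (
    "https://huggingface.co/spaces/open-llm-leaderboard/"
    "open_llm_leaderboard?query=01-ai/Yi-9B"
)


def make_result(
    dataset_name: str,
    dataset_type: str,
    split: str,
    num_few_shot: int,
    metric_type: str,
    value: float,
    config: Optional[str] = None,
    metric_name: Optional[str] = None,
) -> dict[str, Any]:
    """Build one model-index result entry (hypothetical helper)."""
    dataset: dict[str, Any] = {"name": dataset_name, "type": dataset_type}
    if config is not None:  # HellaSwag, for example, has no config in the diff
        dataset["config"] = config
    dataset["split"] = split
    dataset["args"] = {"num_few_shot": num_few_shot}

    metric: dict[str, Any] = {"type": metric_type, "value": value}
    if metric_name is not None:  # the TruthfulQA metric has no name in the diff
        metric["name"] = metric_name

    return {
        "task": {"type": "text-generation", "name": "Text Generation"},
        "dataset": dataset,
        "metrics": [metric],
        "source": {"url": LEADERBOARD_URL, "name": "Open LLM Leaderboard"},
    }


arc = make_result(
    "AI2 Reasoning Challenge (25-Shot)",
    "ai2_arc",
    "test",
    25,
    "acc_norm",
    61.18,
    config="ARC-Challenge",
    metric_name="normalized accuracy",
)
model_index = [{"name": "Yi-9B", "results": [arc]}]
```

Dumping `model_index` under a `model-index` key with a YAML serializer should give a structure equivalent to the first added entry in the diff, although the exact key order and quoting depend on the serializer.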

Files changed (1)
  1. README.md +140 -8
README.md CHANGED
@@ -1,15 +1,133 @@
  ---
  license: apache-2.0
  widget:
-   - example_title: "Yi-34B-Chat"
-     text: "hi"
-     output:
-       text: " Hello! How can I assist you today?"
-   - example_title: "Yi-34B"
-     text: "There's a place where time stands still. A place of breath taking wonder, but also"
-     output:
-       text: " an eerie sense that something is just not right…\nBetween the two worlds lies The Forgotten Kingdom - home to creatures long since thought extinct and ancient magic so strong it defies belief! Only here can you find what has been lost for centuries: An Elixir Of Life which will restore youth and vitality if only those who seek its power are brave enough to face up against all manner of dangers lurking in this mysterious land! But beware; some say there may even exist powerful entities beyond our comprehension whose intentions towards humanity remain unclear at best ---- they might want nothing more than destruction itself rather then anything else from their quest after immortality (and maybe someone should tell them about modern medicine)? In any event though – one thing remains true regardless : whether or not success comes easy depends entirely upon how much effort we put into conquering whatever challenges lie ahead along with having faith deep down inside ourselves too ;) So let’s get started now shall We?"
  pipeline_tag: text-generation
  ---

  <div align="center">
@@ -1410,3 +1528,17 @@ The code and weights of the Yi series models are distributed under the [Apache 2
  <p align="right"> [
  <a href="#top">Back to top ⬆️ </a> ]
  </p>
  ---
  license: apache-2.0
  widget:
+ - example_title: Yi-34B-Chat
+   text: hi
+   output:
+     text: ' Hello! How can I assist you today?'
+ - example_title: Yi-34B
+   text: There's a place where time stands still. A place of breath taking wonder,
+     but also
+   output:
+     text: ' an eerie sense that something is just not right…
+
+       Between the two worlds lies The Forgotten Kingdom - home to creatures long since
+       thought extinct and ancient magic so strong it defies belief! Only here can
+       you find what has been lost for centuries: An Elixir Of Life which will restore
+       youth and vitality if only those who seek its power are brave enough to face
+       up against all manner of dangers lurking in this mysterious land! But beware;
+       some say there may even exist powerful entities beyond our comprehension whose
+       intentions towards humanity remain unclear at best ---- they might want nothing
+       more than destruction itself rather then anything else from their quest after
+       immortality (and maybe someone should tell them about modern medicine)? In any
+       event though – one thing remains true regardless : whether or not success comes
+       easy depends entirely upon how much effort we put into conquering whatever challenges
+       lie ahead along with having faith deep down inside ourselves too ;) So let’s
+       get started now shall We?'
  pipeline_tag: text-generation
+ model-index:
+ - name: Yi-9B
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 61.18
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=01-ai/Yi-9B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 78.82
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=01-ai/Yi-9B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 70.06
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=01-ai/Yi-9B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 42.45
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=01-ai/Yi-9B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 77.51
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=01-ai/Yi-9B
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 48.98
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=01-ai/Yi-9B
+       name: Open LLM Leaderboard
  ---

  <div align="center">
 
  <p align="right"> [
  <a href="#top">Back to top ⬆️ </a> ]
  </p>
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_01-ai__Yi-9B)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |63.17|
+ |AI2 Reasoning Challenge (25-Shot)|61.18|
+ |HellaSwag (10-Shot)              |78.82|
+ |MMLU (5-Shot)                    |70.06|
+ |TruthfulQA (0-shot)              |42.45|
+ |Winogrande (5-shot)              |77.51|
+ |GSM8k (5-shot)                   |48.98|
+
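
The `Avg.` row in the table added above matches the unweighted mean of the six benchmark scores. The short check below reproduces that figure from the values in the diff. It is only a sanity check under the assumption that the leaderboard uses a simple arithmetic mean; the variable names are illustrative.

```python
# Benchmark scores copied from the table added to README.md.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 61.18,
    "HellaSwag (10-Shot)": 78.82,
    "MMLU (5-Shot)": 70.06,
    "TruthfulQA (0-shot)": 42.45,
    "Winogrande (5-shot)": 77.51,
    "GSM8k (5-shot)": 48.98,
}

# Unweighted mean over the six benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(f"Avg. {average:.2f}")  # prints "Avg. 63.17"
```

The six scores sum to 379.00, and 379.00 / 6 ≈ 63.17, which agrees with the reported average.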