piuzha committed
Commit 64e4874
1 Parent(s): 6430c0f

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -87,7 +87,7 @@ print(decoded[0])

## Evaluation

- We test the performance of our model with [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). The evaluation results on common datasets are shown below. We test on AI2 Reasoning Challenge (25-shot), HellaSwag (10-shot), MMLU (5-shot), and Winogrande (5-shot).
+ We test the performance of our model with [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). The evaluation results on common datasets are shown below. We test on AI2 Reasoning Challenge (25-shot), HellaSwag (10-shot), MMLU (5-shot), and Winogrande (5-shot). We release Moxin-7B-finetuned as our base model. We further finetune our base model on Tulu v2 to obtain our chat model.

| Models | ARC-C | HellaSwag | MMLU | Winogrande | Avg |
|:----------------------:|:-----:|:---------:|:-----:|:----------:|:-----:|
@@ -122,7 +122,16 @@ We also test the zero shot performance on AI2 Reasoning Challenge (0-shot), AI2
| Moxin-7B-finetune | 80.03 | 75.17 | 82.24 | 81.12 | 58.64 | 75.44 |


+ ## Citation

+ ```
+ @article{zhao2024fully,
+   title={Fully Open Source Moxin-7B Technical Report},
+   author={Zhao, Pu and Shen, Xuan and Kong, Zhenglun and Shen, Yixin and Chang, Sung-En and Rupprecht, Timothy and Lu, Lei and Nan, Enfu and Yang, Changdi and He, Yumei and others},
+   journal={arXiv preprint arXiv:2412.06845},
+   year={2024}
+ }
+ ```
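A note on reproducing the few-shot scores referenced above: the sketch below drives [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) from its Python API with the same shot counts the README states. This is a minimal sketch under stated assumptions, not the authors' evaluation script; the model ID `moxin-org/moxin-llm-7b`, the dtype, and the batch size are placeholders to be swapped for the real values.

```python
# Minimal sketch (assumed setup, not the authors' script): run the README's
# few-shot settings with lm-evaluation-harness (pip install lm-eval, v0.4+).
import lm_eval

# Placeholder repo ID and dtype; substitute the model's actual HF repository.
MODEL_ARGS = "pretrained=moxin-org/moxin-llm-7b,dtype=bfloat16"

# (task, num_fewshot) pairs matching the shot counts stated in the README:
# ARC-Challenge 25-shot, HellaSwag 10-shot, MMLU 5-shot, Winogrande 5-shot.
SETTINGS = [
    ("arc_challenge", 25),
    ("hellaswag", 10),
    ("mmlu", 5),
    ("winogrande", 5),
]

for task, shots in SETTINGS:
    # simple_evaluate loads the model through the HF transformers backend
    # and evaluates one task at the requested few-shot setting.
    results = lm_eval.simple_evaluate(
        model="hf",
        model_args=MODEL_ARGS,
        tasks=[task],
        num_fewshot=shots,
        batch_size=8,  # placeholder; tune to available memory
    )
    print(task, results["results"][task])
```

Each task gets its own `simple_evaluate` call because the shot counts differ per benchmark (25/10/5/5); a single run with one shared `num_fewshot` would not match the settings quoted in the README.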