# Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24)
## News
- 9/20/2024 - Our paper is accepted to EMNLP'24.
- 3/12/2024 - We release Qwen2idae-16x14B-v1.0 on 🤗 [HuggingFace](https://huggingface.co/hywu/Qwen2idae-16x14B-v1.0), which shows strong performance on math and code with 15B activated params.
- 2/7/2024 - [Serp-ai](https://github.com/serp-ai/Parameter-Efficient-MoE) adds [unsloth](https://github.com/serp-ai/unsloth) support for faster and more memory-efficient training with our Parameter-Efficient Sparsity Crafting and releases new [sparsetral](https://huggingface.co/serpdotai/sparsetral-16x7B-v2) models based on Mistral-7B.
- 1/10/2024 - Camelidae models are now available on 🤗 [HuggingFace](https://huggingface.co/hywu).
- 1/4/2024 - We release the paper, [Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks](https://arxiv.org/abs/2401.02731).
- 12/22/2023 - We release the training [repo](https://github.com/wuhy68/Parameter-Efficient-MoE), which crafts dense models with the LLaMA architecture into MoE models.
## Introduction
Camelidae and Qwen2idae models are trained utilizing Parameter-Efficient Sparsity Crafting techniques. Specifically, Parameter-Efficient Sparsity Crafting utilizes parameter-efficient techniques to craft a dense pretrained model into a Mixture-of-Experts (MoE) model during instruction tuning on general tasks.
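To make the idea concrete, below is a minimal PyTorch sketch of the crafting step under a common adapter-based recipe: every expert reuses the dense model's frozen FFN and trains only a small bottleneck adapter plus a router. The names (`AdapterExpert`, `SparseCraftedFFN`), adapter size, and top-2 routing are illustrative assumptions, not the released implementation; see the training [repo](https://github.com/wuhy68/Parameter-Efficient-MoE) for the authors' code.

```python
# Illustrative sketch only: expert/adapter/router names and sizes are
# assumptions, not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdapterExpert(nn.Module):
    """One expert: the frozen shared FFN plus a small trainable bottleneck adapter."""

    def __init__(self, shared_ffn: nn.Module, hidden_size: int, adapter_dim: int = 64):
        super().__init__()
        self.shared_ffn = shared_ffn                     # weights shared across experts
        self.down = nn.Linear(hidden_size, adapter_dim)  # trainable down-projection
        self.up = nn.Linear(adapter_dim, hidden_size)    # trainable up-projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.shared_ffn(x)
        return h + self.up(F.relu(self.down(h)))         # residual adapter on top


class SparseCraftedFFN(nn.Module):
    """Swaps a dense FFN for top-k routing over parameter-efficient experts."""

    def __init__(self, shared_ffn: nn.Module, hidden_size: int,
                 num_experts: int = 8, top_k: int = 2):
        super().__init__()
        for p in shared_ffn.parameters():
            p.requires_grad = False                      # dense weights stay frozen
        self.experts = nn.ModuleList(
            AdapterExpert(shared_ffn, hidden_size) for _ in range(num_experts)
        )
        self.router = nn.Linear(hidden_size, num_experts)  # trainable gate
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_size); send each token to its top-k experts
        gate = F.softmax(self.router(x), dim=-1)
        top_w, top_i = gate.topk(self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)  # renormalize the k weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_i[:, slot] == e
                if mask.any():
                    out[mask] += top_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


# Tiny smoke test with a stand-in dense FFN (hidden size 512).
dense_ffn = nn.Sequential(nn.Linear(512, 2048), nn.SiLU(), nn.Linear(2048, 512))
moe_ffn = SparseCraftedFFN(dense_ffn, hidden_size=512)
print(moe_ffn(torch.randn(4, 512)).shape)  # torch.Size([4, 512])
```

In this sketch only the adapters and the router carry gradients, which is what makes the dense-to-MoE crafting parameter-efficient: the dense FFN weights are reused by every expert instead of being copied and retrained.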
## Model Lists

| Camelidae Series | Download |
|---|---|
| Camelidae-8x7B | 🤗 [HuggingFace](https://huggingface.co/hywu/Camelidae-8x7B) |
| Camelidae-8x13B | 🤗 [HuggingFace](https://huggingface.co/hywu/Camelidae-8x13B) |
| Camelidae-8x34B | 🤗 [HuggingFace](https://huggingface.co/hywu/Camelidae-8x34B) |

| Qwen2idae Series | Download |
|---|---|
| Qwen2idae-16x14B-v1.0 | 🤗 [HuggingFace](https://huggingface.co/hywu/Qwen2idae-16x14B-v1.0) |
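Below is a hedged loading sketch for the released checkpoints, assuming they work through the standard `transformers` Auto classes; the `trust_remote_code=True` flag and the prompt format are assumptions drawn from typical custom-architecture model cards, so check the individual model pages before use.

```python
# Hedged usage sketch: model id, dtype, and prompt format are illustrative;
# trust_remote_code=True assumes the repo ships custom MoE modeling code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hywu/Camelidae-8x7B"  # any released checkpoint from the tables above

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps memory manageable
    device_map="auto",           # spread layers across available devices
    trust_remote_code=True,
)

prompt = "### Human:\nExplain mixture-of-experts in one sentence.\n### Assistant:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```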
## Performance
| Model | Activated Params | MMLU (5-shot) | GSM8k (5-shot) | MATH (4-shot) | HumanEval (0-shot) | MBPP (4-shot) | HellaSwag (10-shot) |