hywu committed
Commit 1880e70 · verified · Parent: 32f1e62

Update README.md

Files changed (1): README.md (+6 -7)
README.md CHANGED
@@ -13,14 +13,16 @@ license: apache-2.0
 ---
 
 
-# Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
+# Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (EMNLP'24)
 
 ## News
-- 3/12/2024 - We released Qwen2idae-16x14B-v1.0 on 🤗 [HuggingFace](https://huggingface.co/hywu/Qwen2idae-16x14B-v1.0), which has strong performance in Math and Code with 15B activated params.
+- 9/20/2024 - Our paper is accepted by EMNLP'24.
+- 3/12/2024 - We release Qwen2idae-16x14B-v1.0 on 🤗 [HuggingFace](https://huggingface.co/hywu/Qwen2idae-16x14B-v1.0), which has strong performance in Math and Code with 15B activated params.
 - 2/7/2024 - [Serp-ai](https://github.com/serp-ai/Parameter-Efficient-MoE) adds [unsloth](https://github.com/serp-ai/unsloth) support for faster and more memory-efficient training of our Parameter-Efficient Sparsity Crafting and releases new [sparsetral](https://huggingface.co/serpdotai/sparsetral-16x7B-v2) models based on mistral-7B.
 - 1/10/2024 - Camelidae models are now available on 🤗 [HuggingFace](https://huggingface.co/hywu).
-- 1/4/2024 - We released the paper, [Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks](https://arxiv.org/abs/2401.02731).
-- 12/22/2023 - We released the training [repo](https://github.com/wuhy68/Parameter-Efficient-MoE) that crafts a dense model with the LLaMA architecture into an MoE model.
+- 1/4/2024 - We release the paper, [Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks](https://arxiv.org/abs/2401.02731).
+- 12/22/2023 - We release the training [repo](https://github.com/wuhy68/Parameter-Efficient-MoE) that crafts a dense model with the LLaMA architecture into an MoE model.
+
 ## Introduction
 Camelidae and Qwen2idae models are trained utilizing Parameter-Efficient Sparsity Crafting techniques
 
@@ -34,13 +36,10 @@ Specifically, Parameter-Efficient Sparsity Crafting utilizes parameter-efficient
 Camelidae-8x7B | 🤗 [HuggingFace](https://huggingface.co/hywu/Camelidae-8x7B)
 Camelidae-8x13B | 🤗 [HuggingFace](https://huggingface.co/hywu/Camelidae-8x13B)
 Camelidae-8x34B | 🤗 [HuggingFace](https://huggingface.co/hywu/Camelidae-8x34B)
-Camelidae-8x34B-pro | 🤗 Coming Soon
 
 | Qwen2idae Series | Download
 |---|---
 Qwen2idae-16x14B-v1.0 | 🤗 [HuggingFace](https://huggingface.co/hywu/Qwen2idae-16x14B-v1.0)
-Qwen2idae-16x7B-v1.0 | 🤗 Coming Soon
-Qwen2idae-16x1.8B-v1.0 | 🤗 Coming Soon
 
 ## Performance
 | Model | Activated Params | MMLU (5shot) | GSM8k (5shot) | MATH (4shot) | HumanEval (0shot) | MBPP (4shot) | HellaSwag (10shot) |
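
The dense-to-MoE recipe named in the title, as described in the linked paper, keeps the pretrained dense weights frozen and trains only lightweight additions: small per-expert adapters plus a router. Below is a minimal toy sketch of that general pattern, not the authors' implementation; the layer sizes, adapter placement, and top-k routing details are illustrative assumptions, and the real code lives in the training repo linked in the News.

```python
import torch
import torch.nn as nn

class AdapterExpertMoE(nn.Module):
    """Toy sketch (assumed details): one frozen dense FFN shared by all experts;
    each expert contributes only a small trainable bottleneck adapter, and a
    trainable router mixes the top-k adapters per token."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 8, r: int = 16, top_k: int = 2):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
        for p in self.ffn.parameters():
            p.requires_grad = False  # the original dense weights stay frozen
        self.adapters = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, r), nn.GELU(), nn.Linear(r, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, d_model)
        shared = self.ffn(x)                              # frozen shared FFN output
        gate = self.router(x).softmax(dim=-1)             # (tokens, n_experts)
        w, idx = gate.topk(self.top_k, dim=-1)
        w = w / w.sum(dim=-1, keepdim=True)               # renormalize top-k weights
        out = shared.clone()
        for k in range(self.top_k):                       # add each token's routed adapters
            for e in range(len(self.adapters)):
                m = idx[:, k] == e
                if m.any():
                    out[m] += w[m, k].unsqueeze(-1) * self.adapters[e](x[m])
        return out

moe = AdapterExpertMoE(d_model=64, d_ff=256)
tokens = torch.randn(10, 64)
print(moe(tokens).shape)  # torch.Size([10, 64])
# Only adapters + router are trainable, which is where the parameter efficiency comes from:
print(sum(p.numel() for p in moe.parameters() if p.requires_grad))
```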
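Since the download tables point at ordinary 🤗 Hub repositories, the checkpoints should load through the standard `transformers` path. A minimal usage sketch, assuming the crafted MoE architecture ships custom modeling code (hence `trust_remote_code=True`); the dtype and prompt template below are placeholders, so check the model card for the actual ones.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "hywu/Camelidae-8x7B"  # any repo from the download tables above

# Assumption: the crafted MoE is not a stock transformers architecture,
# so custom modeling code must be trusted to instantiate it.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits the checkpoint
    device_map="auto",           # requires the accelerate package
    trust_remote_code=True,
).eval()

# Placeholder instruction-style prompt; the real template is on the model card.
inputs = tokenizer("### Human:\nWhat is 17 * 24?\n### Assistant:\n", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```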