---
language:
- en
- zh
license: apache-2.0
library_name: transformers
datasets:
- EleutherAI/pile
- togethercomputer/RedPajama-Data-1T
- p208p2002/wudao
widget:
- text: 4 + 3 =
---

## MiniLoong-3B

📑 [arXiv](https://arxiv.org/abs/2311.07052) | 👻 [GitHub](https://github.com/GeneZC/MiniMA) | 🤗 [HuggingFace-MiniMA-3B](https://huggingface.co./GeneZC/MiniMA-3B) | 🤗 [HuggingFace-MiniChat-3B](https://huggingface.co./GeneZC/MiniChat-3B) | 🤖 [ModelScope-MiniMA-3B](https://modelscope.cn/models/GeneZC/MiniMA-3B) | 🤖 [ModelScope-MiniChat-3B](https://modelscope.cn/models/GeneZC/MiniChat-3B) | 🤗 [HuggingFace-MiniChat-1.5-3B](https://huggingface.co./GeneZC/MiniChat-1.5-3B) | 🤗 [HuggingFace-MiniMA-2-3B](https://huggingface.co./GeneZC/MiniMA-2-3B) | 🤗 [HuggingFace-MiniChat-2-3B](https://huggingface.co./GeneZC/MiniChat-2-3B) | 🤗 [HuggingFace-MiniMA-2-1B](https://huggingface.co./GeneZC/MiniMA-2-1B) | 🤗 [HuggingFace-MiniLoong-3B](https://huggingface.co./GeneZC/MiniLoong-3B) | 🤗 [HuggingFace-MiniMix-2/4x3B](https://huggingface.co./GeneZC/MiniMix-2_4x3B)

❗ Must comply with the LICENSE of LLaMA-2, since this model is derived from LLaMA-2.

## Bibtex

```bibtex
@article{zhang2023law,
    title={Towards the Law of Capacity Gap in Distilling Language Models},
    author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
    year={2023},
    url={https://arxiv.org/abs/2311.07052}
}
```