language:
  - en
  - zh
license: apache-2.0
library_name: transformers
datasets:
  - EleutherAI/pile
  - togethercomputer/RedPajama-Data-1T
  - p208p2002/wudao
widget:
  - text: <s> 4 + 3 =

MiniLoong-3B

📑 arXiv | 👻 GitHub | 🤗 HuggingFace-MiniMA-3B | 🤗 HuggingFace-MiniChat-3B | 🤖 ModelScope-MiniMA-3B | 🤖 ModelScope-MiniChat-3B | 🤗 HuggingFace-MiniChat-1.5-3B | 🤗 HuggingFace-MiniMA-2-3B | 🤗 HuggingFace-MiniChat-2-3B | 🤗 HuggingFace-MiniMA-2-1B | 🤗 HuggingFace-MiniLoong-3B | 🤗 HuggingFace-MiniMix-2/4x3B

❗ Use of this model must comply with the LLaMA-2 license, since the model is derived from LLaMA-2.


BibTeX

@article{zhang2023law,
    title={Towards the Law of Capacity Gap in Distilling Language Models},
    author={Zhang, Chen and Song, Dawei and Ye, Zheyu and Gao, Yan},
    year={2023},
    url={https://arxiv.org/abs/2311.07052}
}