# Pico-OpenLAiNN-250M-GGUF 🤗
Hey there fellow researchers, developers, and AI enthusiasts! Today I'm releasing a new, slightly less smol open LLM. This model was trained on the full 32B tokens that the entire Open-PicoLAiNN family is trained on.
These are the GGUF quants of the models. You can find the original (unquantized) models here.
## Models Overview
- Pico-OpenLAiNN-100: The smallest of the bunch, this 100M parameter model is perfect for quick experiments and applications where computational resources are extremely limited.
- Pico-OpenLAiNN-250: The middle child of the PicoLAiNN family. It's still tiny at 250M parameters, but more capable than the 100M parameter model.
- Pico-OpenLAiNN-500: My current "heavyweight" model. At 500M parameters, it's the most capable of the Pico-OpenLAiNN models.
## Pretraining Details
This specific version of Pico LAiNN was trained on just 32B tokens of the FineWeb dataset.
Other information:
- Compatibility: Built to be compatible with existing projects that use LLAMA 2's tokenizer and architecture.
- Ease of Use: No need to reinvent the wheel. These models are ready to be plugged into your applications.
- Open Source: Fully open source, so you can tweak, tune, and twist them to your heart's content.
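Because the quants follow the standard GGUF format, they should load with any llama.cpp-based runtime. A minimal sketch using llama-cpp-python, assuming a hypothetical quant filename (check the repository's file list for the actual names):

```python
from pathlib import Path

# Hypothetical filename -- check this repo's file list for the real quant names.
MODEL_PATH = Path("Pico-OpenLAiNN-250M.Q8_0.gguf")

def load_model(path: Path):
    """Load a GGUF quant with llama-cpp-python (pip install llama-cpp-python)."""
    from llama_cpp import Llama  # imported lazily so the sketch runs without the package
    return Llama(model_path=str(path), n_ctx=2048)

if MODEL_PATH.exists():
    llm = load_model(MODEL_PATH)
    out = llm("Once upon a time", max_tokens=32)
    print(out["choices"][0]["text"])
else:
    print(f"Download {MODEL_PATH.name} from this repo first.")
```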
## Benchy :3
| Task | Value | Stderr |
|---|---|---|
| arc_challenge | 0.1988 | ± 0.0117 |
| arc_easy | 0.4503 | ± 0.0102 |
| boolq | 0.5907 | ± 0.0086 |
| hellaswag | 0.3215 | ± 0.0047 |
| lambada_openai | 0.3280 | ± 0.0065 |
| piqa | 0.6594 | ± 0.0111 |
| winogrande | 0.5028 | ± 0.0141 |
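To get a rough sense of the uncertainty in these scores, the reported standard errors can be turned into approximate 95% confidence intervals (value ± 1.96 × stderr). A small sketch using the numbers from the table above:

```python
# Benchmark scores from the table above: task -> (value, stderr)
SCORES = {
    "arc_challenge": (0.1988, 0.0117),
    "arc_easy": (0.4503, 0.0102),
    "boolq": (0.5907, 0.0086),
    "hellaswag": (0.3215, 0.0047),
    "lambada_openai": (0.3280, 0.0065),
    "piqa": (0.6594, 0.0111),
    "winogrande": (0.5028, 0.0141),
}

def ci95(value: float, stderr: float) -> tuple[float, float]:
    """Approximate 95% confidence interval: value +/- 1.96 * stderr."""
    half = 1.96 * stderr
    return (round(value - half, 4), round(value + half, 4))

for task, (value, stderr) in SCORES.items():
    lo, hi = ci95(value, stderr)
    print(f"{task:>15}: {value:.4f}  [{lo:.4f}, {hi:.4f}]")
```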
## Future Plans
- More Models: I'm currently training the bigger siblings of this model, including a 1B parameter version and beyond. 2-4 billion parameter versions are planned. These will be released as OpenLAiNN.
- New architecture: This is still up in the air and in development, and I'll release it if it turns out to be actually useful, so stay tuned. It will likely be named FLaRE-LAiNN.
- Paper: A detailed paper and the full source code will be made available for those interested in the details.
## Credit Where Credit's Due
If you find these models useful and decide to use them, a link to this repository would be highly appreciated. I am a one-man show running this. Thanks 🤗
## Contact
If you have questions, please reach out to me at [email protected]