
Pico-OpenLAiNN-100M-GGUF 🤗

Hey there fellow researchers, developers, and AI enthusiasts! Today I'm releasing the full version of Pico-OpenLAiNN-100M. I had previously released an earlier version trained on only 8B tokens; this one was trained on the full 32B tokens that the entire Pico-OpenLAiNN family is trained on.

These are the GGUF quants of the model, provided in 4-bit, 8-bit, and 16-bit variants. For the original full-precision models, you can find them here.
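
If you'd rather fetch a quant programmatically, a minimal sketch with huggingface_hub follows; note that the repo id and filename here are assumptions, so check this repo's file list for the exact names.

```python
# Sketch: downloading a GGUF quant from the Hub with huggingface_hub
# (pip install huggingface_hub). The repo_id and filename are assumptions --
# check the "Files" tab of this repo for the real names.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="UUFO-Aigis/Pico-OpenLAiNN-100M-GGUF",  # assumed repo id
    filename="pico-openlainn-100m.Q8_0.gguf",       # assumed filename
)
print(gguf_path)  # local path to the downloaded quant
```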

Models Overview

  • Pico-OpenLAiNN-100M: The smallest of the bunch, this 100M parameter model is perfect for quick experiments and applications where computational resources are extremely limited.
  • Pico-OpenLAiNN-250M: The middle child of the Pico-OpenLAiNN family; it's still tiny at 250M parameters but is more capable than the 100M model.
  • Pico-OpenLAiNN-500M: My current "heavyweight" model; at 500M parameters, it is the most capable of the Pico-OpenLAiNN models.

Pretraining Details

This specific version of Pico-OpenLAiNN was trained on just 32B tokens of the FineWeb dataset.

Other information:

  • Compatibility: Built to be compatible with existing projects that use Llama 2's tokenizer and architecture (see the sketch after this list).
  • Ease of Use: No need to reinvent the wheel. These models are ready to be plugged into your applications.
  • Open Source: Fully open source, so you can tweak, tune, and twist them to your heart's content.
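
As a hedged example, here's a minimal sketch of running one of these GGUF quants locally with llama-cpp-python; the filename is a placeholder for whichever quant you downloaded, not a confirmed file in this repo.

```python
# Minimal sketch: running a Pico-OpenLAiNN GGUF quant with llama-cpp-python
# (pip install llama-cpp-python). The filename below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="pico-openlainn-100m.Q8_0.gguf",  # placeholder filename
    n_ctx=2048,      # context window
    verbose=False,
)

out = llm("Once upon a time", max_tokens=64, temperature=0.8)
print(out["choices"][0]["text"])
```

Because the model follows Llama 2's architecture, it should drop into any other llama.cpp-compatible runtime the same way.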

Benchy :3

| Task           | Value  | Stderr   |
|----------------|--------|----------|
| arc_challenge  | 0.1826 | ± 0.0113 |
| arc_easy       | 0.4007 | ± 0.0101 |
| boolq          | 0.6012 | ± 0.0086 |
| hellaswag      | 0.2936 | ± 0.0045 |
| lambada_openai | 0.2701 | ± 0.0062 |
| piqa           | 0.6338 | ± 0.0112 |
| winogrande     | 0.5099 | ± 0.0140 |
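
Scores in this shape (value ± stderr per task) typically come from EleutherAI's lm-evaluation-harness. As a hedged sketch, and not a confirmed record of how the table above was produced, evaluating the original full-precision model on the same tasks might look like this; the repo id is an assumption.

```python
# Sketch: scoring the full-precision model on the tasks above with
# EleutherAI's lm-evaluation-harness (pip install lm-eval). The repo id is
# an assumption, and the exact settings behind the table are not documented.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=UUFO-Aigis/Pico-OpenLAiNN-100M",  # assumed repo id
    tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
           "lambada_openai", "piqa", "winogrande"],
)
for task, metrics in results["results"].items():
    print(task, metrics)
```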

Future Plans

  • More Models: I'm currently training the bigger siblings of this model, including a 1B parameter version and beyond; 2-4 billion parameter versions are planned. These will be released as OpenLAiNN.
  • New architecture: This is still up in the air and under development; I'll release it if it turns out to be genuinely useful, so stay tuned. It will likely be named FLaRE-LAiNN.
  • Paper: A detailed paper will be made available for those interested in the details.

Credit Where Credit's Due

If you find these models useful, a link to this repository would be highly appreciated. I am a one-man show running this. Thanks 🤗

Contact

If you have questions, please reach out to me at [email protected]

