Pico-OpenLAiNN-250M 🤗

Hey there fellow researchers, developers, and AI enthusiasts! Today I'm releasing a new, slightly less smol open LLM. This LLM was trained on the full 32B tokens that the entire Pico-OpenLAiNN family is trained on.

You can find the GGUF quants of this model here.

Models Overview

Pico-OpenLAiNN-100: The smallest of the bunch, this 100M parameter model is perfect for quick experiments and applications where computational resources are extremely limited.
Pico-OpenLAiNN-250: This is the middle child of the Pico-OpenLAiNN family. It's still tiny at 250M parameters but is more capable than the 100M parameter model.
Pico-OpenLAiNN-500: My current "heavyweight" model. It has 500M parameters and is the most capable of the Pico-OpenLAiNN models.

Pretraining Details

This specific version of Pico-OpenLAiNN was trained on just 32B tokens of the FineWeb dataset.
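To put the 32B token budget in perspective, here is a quick back-of-the-envelope sketch of how many training tokens each model in the family sees per parameter. The parameter counts are the nominal sizes from the model names, which is an assumption; the exact counts may differ slightly.

```python
# Training tokens shared by the whole family (from this model card).
TOKENS_TRAINED = 32e9

# Nominal parameter counts taken from the model names (assumption:
# the true counts may differ slightly from these round numbers).
NOMINAL_PARAMS = {
    "Pico-OpenLAiNN-100M": 100e6,
    "Pico-OpenLAiNN-250M": 250e6,
    "Pico-OpenLAiNN-500M": 500e6,
}


def tokens_per_parameter(tokens, params):
    """Return how many training tokens were seen per model parameter."""
    return tokens / params


ratios = {
    name: tokens_per_parameter(TOKENS_TRAINED, params)
    for name, params in NOMINAL_PARAMS.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f} tokens per parameter")
```

Since every model shares the same token budget, the smaller models see proportionally more data per parameter.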

Other information:

Compatibility: Built to be compatible with existing projects that use Llama 2's tokenizer and architecture.
Ease of Use: No need to reinvent the wheel. These models are ready to be plugged into your applications.
Open Source: Fully open source, so you can tweak, tune, and twist them to your heart's content.

Getting Started

To start using these models, you can simply load them via the Hugging Face transformers library:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


MODEL_NAME = "UUFO-Aigis/Pico-OpenLAiNN-250M"  # Replace 250M with 100M or 500M if you prefer those models.

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

def generate_text(prompt, model, tokenizer, max_length=512, temperature=1, top_k=50, top_p=0.95):
    inputs = tokenizer.encode(prompt, return_tensors="pt")

    outputs = model.generate(
        inputs,
        max_length=max_length,
        temperature=temperature,
        top_k=top_k,
        top_p=top_p,
        do_sample=True
    )


    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return generated_text

def main():
    # Define your prompt
    prompt = "According to all known laws of aviation, there is no way a bee should be able to fly."

    generated_text = generate_text(prompt, model, tokenizer)

    print(generated_text)

if __name__ == "__main__":
    main()

Benchy :3

Tasks	Value		Stderr
arc_challenge	0.1988	±	0.0117
arc_easy	0.4503	±	0.0102
boolq	0.5907	±	0.0086
hellaswag	0.3215	±	0.0047
lambada_openai	0.3280	±	0.0065
piqa	0.6594	±	0.0111
winogrande	0.5028	±	0.0141
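
One way to read the table above is to compare each score, plus or minus its standard error, against random guessing. The sketch below does that under a few assumptions that are not stated in this card: the values are accuracies, the chance baselines are 0.25 for the (mostly) four-choice tasks and 0.5 for the two-choice tasks, and a normal-approximation 95% interval (z = 1.96) is good enough. lambada_openai is open-ended, so it gets no chance baseline.

```python
# Scores and standard errors copied from the table above.
RESULTS = {
    "arc_challenge": (0.1988, 0.0117),
    "arc_easy": (0.4503, 0.0102),
    "boolq": (0.5907, 0.0086),
    "hellaswag": (0.3215, 0.0047),
    "lambada_openai": (0.3280, 0.0065),
    "piqa": (0.6594, 0.0111),
    "winogrande": (0.5028, 0.0141),
}

# Assumed random-guess baselines (not from the model card).
CHANCE = {
    "arc_challenge": 0.25,
    "arc_easy": 0.25,
    "boolq": 0.5,
    "hellaswag": 0.25,
    "piqa": 0.5,
    "winogrande": 0.5,
}


def interval(value, stderr, z=1.96):
    """Approximate 95% interval using a normal approximation."""
    return value - z * stderr, value + z * stderr


def compare_to_chance(task):
    """Classify a task's score relative to its assumed chance baseline."""
    value, stderr = RESULTS[task]
    chance = CHANCE.get(task)
    if chance is None:
        return "no chance baseline"
    low, high = interval(value, stderr)
    if low > chance:
        return "above chance"
    if high < chance:
        return "below chance"
    return "within noise of chance"


for task in RESULTS:
    print(f"{task}: {compare_to_chance(task)}")
```

Under these assumptions, a task whose interval straddles the baseline should not be read as evidence of real capability on that task.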

Future Plans

More Models: I'm currently training the bigger siblings of these models, including a 1B parameter version, and 2-4 billion parameter versions are planned. These will be released as OpenLAiNN.
New architecture: This is still in development and up in the air. I'll release it if I find it to be actually useful, so stay tuned. It will likely be named FLaRE-LAiNN.
Paper: A detailed paper and the full source code will be made available for those interested in the details.

Credit Where Credit's Due

If you find these models useful and decide to use them, a link to this repository would be highly appreciated. I am a one man show running this. Thanks 🤗

Contact

If you have questions, please reach out to me at [email protected]
