---
license: mit
widget:
  - text: >
      <|system|>
      You are a chatbot who can help code!
      <|user|>
      Write me a function to calculate the first 10 digits of the fibonacci sequence in Python and print it out to the CLI.
      <|assistant|>
  - text: >
      <|system|>
      You are penguinotron, a penguin themed chatbot who is obsessed with penguins and will make any excuse to talk about them
      <|user|>
      Hello, what is a penguin?
      <|assistant|>
library_name: transformers
pipeline_tag: text-generation
tags:
  - moe
  - nlp
---

# Tiny-llama

## Model Description

Tiny llamix is a model built from [TinyLlama](https://huggingface.co./TinyLlama/TinyLlama-1.1B-Chat-v1.0) using [Charles Goddard's](https://github.com/cg123) mergekit on the mixtral branch. Though technically a Mixtral model, it can be plugged into most Llama implementations (maybe...). The model uses TinyLlama's tokenizer and works with the same prompt format.

This model is a proof of concept and might not necessarily yield better outputs. (IDK, haven't tested it...)

## Configuration

```yaml
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
    positive_prompts:
      - "M1"
  - source_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
    positive_prompts:
      - "M2"
```

## Usage

It can be used like any other model:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained("SE6446/Tiny-llamix").to("cuda")
tokenizer = AutoTokenizer.from_pretrained("SE6446/Tiny-llamix")

# Write and tokenize the prompt
instruction = '''<|system|>\nYou are a chatbot who can help code!
<|user|> Write me a function to calculate the first 10 digits of the fibonacci sequence in Python and print it out to the CLI.
<|assistant|>'''
inputs = tokenizer(instruction, return_tensors="pt", return_attention_mask=False).to("cuda")

# Generate (max_length counts prompt tokens plus generated tokens)
outputs = model.generate(**inputs, max_length=200)

# Decode and print
text = tokenizer.batch_decode(outputs)[0]
print(text)
```

## Acknowledgements

To [Charles Goddard](https://github.com/cg123) for creating the tool and for explaining it in his [blog](https://goddard.blog/posts/clown-moe/) in a way a buffoon like me could understand.

To [TinyLlama](https://huggingface.co./TinyLlama) for providing the model as open source!
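
## Prompt format sketch

The Usage example writes the prompt by hand with `<|system|>`, `<|user|>`, and `<|assistant|>` tags. If you want to swap in your own messages, a small helper along these lines might be handy. This is only a sketch: `build_prompt` is a hypothetical name that is not part of the model repo, and the exact whitespace between sections is an assumption based on the example above.

```python
# Hypothetical helper, not part of the model repo: assembles a prompt in the
# <|system|> / <|user|> / <|assistant|> layout used in the Usage example.
# Putting each tag on its own line is an assumption about the expected whitespace.

DEFAULT_SYSTEM = "You are a chatbot who can help code!"


def build_prompt(user_message: str, system_message: str = DEFAULT_SYSTEM) -> str:
    """Return one prompt string that ends with an open assistant turn."""
    sections = [
        ("<|system|>", system_message),
        ("<|user|>", user_message),
    ]
    lines = []
    for tag, content in sections:
        lines.append(tag)
        lines.append(content.strip())
    # Leave the assistant tag last so the model writes the reply.
    lines.append("<|assistant|>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Hello, what is a penguin?"))
```

The returned string can be passed to the tokenizer in place of `instruction` in the Usage example. If the tokenizer ships a chat template, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` from transformers is probably the safer route, since it uses whatever format the tokenizer itself declares.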