SE6446/Tiny-llamix_2x1B

Text Generation

Mixture of Experts

text-generation-inference

Inference Endpoints

SE6446 committed on Jan 11, 2024

Commit 8ca2f4a · verified · 1 Parent(s): 7a8b105

Create README.md

Files changed (1)

README.md +55 -0

README.md ADDED

	@@ -0,0 +1,55 @@

+---
+license: mit
+widget:
+- text: >
+    <|system|>
+    You are a chatbot who can help code!</s>
+    <|user|>
+    Write me a function to calculate the first 10 digits of the fibonacci
+    sequence in Python and print it out to the CLI.</s>
+    <|assistant|>
+library_name: transformers
+pipeline_tag: text-generation
+---
+# Tiny-llamix
+## Model Description
+Tiny-llamix is a 2x1B mixture-of-experts model built from [TinyLlama](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) using [Charles Goddard's](https://github.com/cg123) mergekit (`mixtral` branch).
+## Configuration
+```yaml
+base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+gate_mode: hidden
+dtype: bfloat16
+experts:
+  - source_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+    positive_prompts:
+      - "M1"
+  - source_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
+    positive_prompts:
+      - "M2"
+```
+## Usage
+The model loads like any other `transformers` causal language model:
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Load the model and tokenizer (this snippet assumes a CUDA-capable GPU)
+model = AutoModelForCausalLM.from_pretrained("SE6446/Tiny-llamix").to("cuda")
+tokenizer = AutoTokenizer.from_pretrained("SE6446/Tiny-llamix")
+# Write and tokenize the prompt: each role tag sits on its own line, each turn ends with </s>
+instruction = '''<|system|>\nYou are a chatbot who can help code!</s>
+<|user|>\nWrite me a function to calculate the first 10 digits of the fibonacci sequence in Python and print it out to the CLI.</s>
+<|assistant|>'''
+inputs = tokenizer(instruction, return_tensors="pt").to("cuda")
+# Generate; max_length counts the prompt tokens plus the newly generated tokens
+outputs = model.generate(**inputs, max_length=200)
+# Decode the first (and only) sequence in the batch and print it
+text = tokenizer.batch_decode(outputs)[0]
+print(text)
+```
+## Performance (coming soon!)
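
The hand-written prompt in the Usage snippet appears to follow the Zephyr-style chat layout used by TinyLlama-Chat: a role tag, the message, a closing `</s>`, and a trailing `<|assistant|>` tag to cue the reply. A minimal sketch of assembling such a prompt programmatically is given below. The helper `build_prompt` and its `(role, content)` argument shape are hypothetical names chosen for illustration; they are not part of `transformers` or of this repository, and the newline placement is an assumption based on that layout.

```python
def build_prompt(messages, eos="</s>"):
    """Assemble a Zephyr-style chat prompt from (role, content) pairs.

    Hypothetical helper for illustration only.
    """
    parts = []
    for role, content in messages:
        # Role tag on its own line, then the message, closed by the EOS marker.
        parts.append(f"<|{role}|>\n{content}{eos}\n")
    # A trailing assistant tag cues the model to produce its reply.
    parts.append("<|assistant|>")
    return "".join(parts)


prompt = build_prompt([
    ("system", "You are a chatbot who can help code!"),
    ("user", "Write me a function to calculate the first 10 digits of the "
             "fibonacci sequence in Python and print it out to the CLI."),
])
print(prompt)
```

The resulting string could be passed to the tokenizer in place of `instruction`. If the checkpoint's tokenizer ships a chat template, `tokenizer.apply_chat_template(...)` with a list of role/content dicts and `add_generation_prompt=True` is the usual alternative, though that has not been verified for this model.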