RonanMcGovern committed on
Commit
6246127
1 Parent(s): db2580c

Create README.md

  1. README.md +29 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ pipeline_tag: text-generation
+ inference: true
+ tags:
+ - mamba
+ - bf16
+ - 16bit
+ datasets:
+ - cerebras/SlimPajama-627B
+ ---
+ # Mamba 2.8b SlimPajama - bf16 (16-bit)
+
+ This is a 16-bit (bf16) version of [Mamba-2.8b-slimpj](https://huggingface.co/state-spaces/mamba-2.8b-slimpj/).
+
+ Mamba-2.8b-slimpj is a 2.8B-parameter model that uses the [Mamba](https://arxiv.org/abs/2312.00752) architecture and was trained on 600B tokens of the SlimPajama dataset.
+
+ Model code: https://github.com/state-spaces/mamba/tree/main
+
+ To load the model, follow the installation instructions in the code repo, then run:
+ ```
+ from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
+ # Loads the original state-spaces checkpoint; substitute this repository's ID to load the bf16 weights.
+ model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b-slimpj")
+ ```
+
+ ## Inference Notebook (Colab)
+ - [Notebook here](https://colab.research.google.com/drive/1GsDbbkDTDpia_Dc8s-7bwEn_GrpkBVO4?usp=sharing)
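
The README describes this as a 16-bit (bf16) version of a 2.8B-parameter model. As a rough illustration of what that means for weight storage, here is a minimal sketch that compares 2 bytes per parameter (bf16) with 4 bytes per parameter (fp32). It assumes the nominal 2.8B parameter count and assumes the original checkpoint is stored as 32-bit floats, which the README does not state. The function name `weight_memory_gb` is hypothetical, and the estimate ignores activations, recurrent state, and framework overhead.

```python
# Rough weight-memory estimate for a model stored at different precisions.
# Assumptions (not from the README): fp32 as the comparison baseline,
# and a parameter count of exactly 2.8e9.

BYTES_PER_PARAM = {"fp32": 4, "bf16": 2}  # storage width of each dtype in bytes


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 2.8e9  # nominal parameter count; the exact count may differ

for dtype in ("fp32", "bf16"):
    print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

Under these assumptions the bf16 weights need about half the storage of an fp32 copy, which is the main practical reason to publish a 16-bit variant.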