Trelis/mamba-2.8b-slimpj-bf16


RonanMcGovern committed on Jan 29

Commit 6246127 • 1 Parent(s): db2580c

Create README.md

Files changed (1)

README.md +29 -0

README.md ADDED

	@@ -0,0 +1,29 @@

+---
+license: apache-2.0
+language:
+- en
+pipeline_tag: text-generation
+inference: true
+tags:
+- mamba
+- bf16
+- 16bit
+datasets:
+- cerebras/SlimPajama-627B
+---
+# Mamba 2.8b SlimPajama - bf16 (16-bit)
+This is a 16-bit (bfloat16) version of [Mamba-2.8b-slimpj](https://huggingface.co/state-spaces/mamba-2.8b-slimpj/).
+Mamba-2.8b-slimpj is a model using the [Mamba](https://arxiv.org/abs/2312.00752) architecture, with 2.8B parameters, trained for 600B tokens on the SlimPajama dataset.
+Model code: https://github.com/state-spaces/mamba/tree/main
+To load the model, follow the installation instructions in the code repo, and then run:
+```
+from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
+model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b-slimpj")
+```
+## Inference Notebook (Colab)
+- [Notebook here](https://colab.research.google.com/drive/1GsDbbkDTDpia_Dc8s-7bwEn_GrpkBVO4?usp=sharing)
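
The loading snippet in the README points at the original `state-spaces/mamba-2.8b-slimpj` checkpoint. For readers who want the 16-bit weights, the following is a minimal sketch and not part of the commit. It assumes that the repo id `Trelis/mamba-2.8b-slimpj-bf16` loads through the same `from_pretrained` call, that `torch` and `mamba_ssm` are installed, and that a CUDA GPU is available. The helper names `weight_memory_gb` and `load_bf16` are hypothetical, and 2.8e9 is the nominal parameter count from the model card rather than an exact figure.

```python
# Assumption: this 16-bit upload loads the same way as the original checkpoint.
BF16_REPO = "Trelis/mamba-2.8b-slimpj-bf16"


def weight_memory_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return n_params * bytes_per_param / 1e9


def load_bf16(repo_id: str = BF16_REPO, device: str = "cuda"):
    """Load the model with bfloat16 weights (needs torch, mamba_ssm, and a GPU)."""
    # Imported lazily so weight_memory_gb can be used without a GPU environment.
    import torch
    from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

    return MambaLMHeadModel.from_pretrained(
        repo_id, device=device, dtype=torch.bfloat16
    )


if __name__ == "__main__":
    # Rough footprint of the weights only; activations and state are extra.
    print(f"fp32 weights: ~{weight_memory_gb(2.8e9, 4):.1f} GB")
    print(f"bf16 weights: ~{weight_memory_gb(2.8e9, 2):.1f} GB")
```

Storing each parameter in 2 bytes instead of 4 halves the weight footprint, which is the main practical reason to prefer a bf16 copy on a memory-constrained GPU. Calling `load_bf16()` returns the model object, which can then be used as in the linked notebook.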