---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
inference: true
tags:
- mamba
- bf16
- 16bit
datasets:
- cerebras/SlimPajama-627B
---

# Mamba 2.8b SlimPajama - bf16 (16-bit)

This is a 16-bit (bf16) version of [Mamba-2.8b-slimpj](https://huggingface.co./state-spaces/mamba-2.8b-slimpj/).

Mamba-2.8b-slimpj is a model using the [Mamba](https://arxiv.org/abs/2312.00752) architecture, with 2.8B parameters, trained for 600B tokens on the SlimPajama dataset.

Model code: https://github.com/state-spaces/mamba/tree/main

To load the model, follow the installation instructions in the code repo, and then:

```
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b-slimpj")
```

## Inference Notebook (Colab)

- [Notebook here](https://colab.research.google.com/drive/1GsDbbkDTDpia_Dc8s-7bwEn_GrpkBVO4?usp=sharing)
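
## Example generation (sketch)

The snippet below is a minimal sketch of running generation in bf16, not an official example from this card. It assumes the `mamba_ssm` generation utilities and the GPT-NeoX tokenizer that the repo's benchmark script pairs with Mamba checkpoints, and it loads the upstream `state-spaces/mamba-2.8b-slimpj` weights as in the snippet above.

```python
# Minimal bf16 generation sketch (assumptions: mamba_ssm generation utilities,
# EleutherAI/gpt-neox-20b tokenizer as used in the repo's benchmark script).
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda"

# Load the weights directly in bfloat16 on the GPU.
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-2.8b-slimpj", device=device, dtype=torch.bfloat16
)
model.eval()

# Mamba checkpoints ship without a tokenizer; the repo's examples use GPT-NeoX's.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

prompt = "The Mamba architecture is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# mamba_ssm's generate takes the total output length (prompt + new tokens),
# not max_new_tokens; top_k > 1 enables sampling instead of greedy decoding.
out = model.generate(
    input_ids=input_ids,
    max_length=input_ids.shape[1] + 64,
    temperature=0.9,
    top_k=50,
    top_p=0.9,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```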