---
license: apache-2.0
language:
  - en
pipeline_tag: text-generation
inference: true
tags:
  - mamba
  - bf16
  - 16bit
datasets:
  - cerebras/SlimPajama-627B
---

# Mamba 2.8b SlimPajama - bf16 (16-bit)

This is a 16-bit (bf16) version of Mamba-2.8b-slimpj.

Mamba-2.8b-slimpj is a 2.8B-parameter model using the Mamba architecture, trained on 600B tokens of the SlimPajama dataset.

Model code: https://github.com/state-spaces/mamba/tree/main

To load the model, follow the installation instructions in the code repo, and then:

```python
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b-slimpj")
```
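For end-to-end text generation, a minimal sketch along the lines below should work. It assumes a CUDA GPU, the `transformers` package for the GPT-NeoX-20B tokenizer used to train the Mamba models, and the generation utilities bundled with `mamba_ssm`; argument names follow the repo's example scripts and may differ across versions.

```python
# Minimal generation sketch; assumes a CUDA GPU plus the mamba_ssm and transformers packages.
import torch
from transformers import AutoTokenizer
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

device = "cuda"

# Mamba models were trained with the GPT-NeoX-20B tokenizer.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# Load the weights in bf16 to match this repo's precision.
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-2.8b-slimpj", device=device, dtype=torch.bfloat16
)

prompt = "The Mamba architecture is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Sampling arguments mirror the repo's benchmark_generation_mamba_simple.py example.
out = model.generate(
    input_ids=input_ids,
    max_length=64,
    temperature=0.9,
    top_k=50,
    top_p=0.9,
    return_dict_in_generate=True,
)
print(tokenizer.decode(out.sequences[0], skip_special_tokens=True))
```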

Inference Notebook (Colab)