Inference on finetuned Mamba model


Using the [draft script](https://huggingface.co./docs/transformers/main/en/model_doc/mamba2) shared in the Mamba2 docs, I have finetuned the Mamba-Codestral-7B model on custom data.
After saving the model with HF's `save_pretrained` method, I am unable to run inference through mistral-inference's `generate_mamba` path.
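
For reference, the save step looked roughly like this (a sketch; `model` and `tokenizer` are the finetuned objects from the training script, and `<path_to_model>` is the output directory used below):

```python
# Sketch of the save step. save_pretrained writes config.json plus
# model.safetensors (possibly sharded) into the output directory --
# it does not produce params.json or consolidated.safetensors.
model.save_pretrained("<path_to_model>")
tokenizer.save_pretrained("<path_to_model>")
```

The inference call then fails with the following error: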

```python
from mistral_inference.mamba import Mamba
from mistral_inference.generate import generate

# tokens: the encoded prompt (a list of token ids)
mamba_output = generate([tokens], model=Mamba.from_folder("<path_to_model>"), max_tokens=200, temperature=0.1)
```

```
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[12], line 1
----> 1 mamba_output = generate([tokens], model=Mamba.from_folder(model_id), max_tokens=200, temperature=0.1)

File ~/.local/lib/python3.10/site-packages/mistral_inference/mamba.py:71, in Mamba.from_folder(folder, max_batch_size, num_pipeline_ranks, device, dtype)
     63 @staticmethod
     64 def from_folder(
     65     folder: Union[Path, str],
   (...)
     69     dtype: Optional[torch.dtype] = None,
     70 ) -> "Mamba":
---> 71     with open(Path(folder) / "params.json", "r") as f:
     72         model_args = MambaArgs.from_dict(json.load(f))
     74     with torch.device("meta"):

FileNotFoundError: [Errno 2] No such file or directory: '<path_to_model>/params.json'
```

If I copy the params.json from the base model into the finetuned model directory, I get a different error:

```
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[24], line 1
----> 1 mamba_output = generate([tokens], model=Mamba.from_folder("<path_to_model>"), max_tokens=200, temperature=0.1)

File ~/.local/lib/python3.10/site-packages/mistral_inference/mamba.py:79, in Mamba.from_folder(folder, max_batch_size, num_pipeline_ranks, device, dtype)
     75     model = Mamba(model_args)
     77 model_file = Path(folder) / "consolidated.safetensors"
---> 79 assert model_file.exists(), f"Make sure {model_file} exists."
     80 loaded = safetensors.torch.load_file(str(model_file))
     82 model.load_state_dict(loaded, assign=True, strict=True)

AssertionError: Make sure <path_to_model>/consolidated.safetensors exists.
```

Having followed the draft script, is there a way to load the trained model for inference, either with mistral-inference or with transformers? The only difference from the script is that I did full finetuning of the model instead of PEFT. (My suspicion: `save_pretrained` writes `config.json` and `model.safetensors`, not the `params.json` + `consolidated.safetensors` layout that `Mamba.from_folder` expects, so copying `params.json` alone isn't enough.)
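
On the transformers side, this is what I would expect to work (a minimal sketch, assuming the checkpoint at `<path_to_model>` was saved with `save_pretrained` as above; Mamba-Codestral uses the Mamba2 architecture, so `Mamba2ForCausalLM` from the linked docs should apply. The prompt, dtype, and device here are placeholders):

```python
import torch
from transformers import AutoTokenizer, Mamba2ForCausalLM

# "<path_to_model>" is the directory written by save_pretrained above.
model = Mamba2ForCausalLM.from_pretrained("<path_to_model>", torch_dtype=torch.bfloat16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("<path_to_model>")

inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```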
