---
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
- mistralai/Mistral-7B-Instruct-v0.1
tags:
- mergekit
- merge
- moe
license: apache-2.0
---

# Mistral Instruct MoE experimental

This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit), using its `mixtral` branch.

**This is an experimental model and has nothing to do with Mixtral. Mixtral is not a merge of models per se, but a transformer with MoE layers learned during training.**

This merge uses a random gate, so I don't expect great results. We'll see!

## Merge Details

### Merge Method

This model was merged using the MoE merge method.

### Models Merged

The following models were included in the merge:
* [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co./mistralai/Mistral-7B-Instruct-v0.2)
* [mistralai/Mistral-7B-Instruct-v0.1](https://huggingface.co./mistralai/Mistral-7B-Instruct-v0.1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: mistralai/Mistral-7B-Instruct-v0.2
gate_mode: random
dtype: bfloat16
experts:
  - source_model: mistralai/Mistral-7B-Instruct-v0.2
    positive_prompts: [""]
  - source_model: mistralai/Mistral-7B-Instruct-v0.1
    positive_prompts: [""]
```
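
### Usage

Assuming the merged weights are published as a standard `transformers` causal LM (mergekit's MoE output loads as a Mixtral-style model), a minimal usage sketch might look like the following. The repository id below is a placeholder, not the actual location of this model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- replace with wherever this merge is actually hosted.
model_id = "your-username/mistral-instruct-moe-experimental"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used in the merge config
    device_map="auto",           # requires the `accelerate` package
)

# Both source models are instruct-tuned, so use the chat template.
messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```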