---
license: apache-2.0
tags:
- moe
- frankenmoe
- merge
- mergekit
- lazymergekit
- cognitivecomputations/dolphin-2_6-phi-2
- rhysjones/phi-2-orange
base_model:
- cognitivecomputations/dolphin-2_6-phi-2
- rhysjones/phi-2-orange
---

# PhiMiX-2x2B-raw

## Code is work in progress

# This is a RAW MoE meant to be finetuned

PhiMiX-2x2B is a Mixture of Experts (MoE) made with the following models using mergekit:
* [cognitivecomputations/dolphin-2_6-phi-2](https://huggingface.co./cognitivecomputations/dolphin-2_6-phi-2)
* [rhysjones/phi-2-orange](https://huggingface.co./rhysjones/phi-2-orange)

## ©️ Credits
* [mlabonne's phixtral](https://huggingface.co./mlabonne/phixtral-4x2_8) for the PhiConfig and inference code.
* [mergekit](https://github.com/cg123/mergekit) code, which I tweaked mainly by adding the PhiConfig (you can find it [here](https://github.com/cg123/mergekit/blob/508348ae34be17ea0a95d0a288a6e34491a2558a/mergekit/architecture.py#L289)) to the `moe_mixtral.py` script from the `mixtral` branch.

## 🧩 Configuration

```yaml
base_model: rhysjones/phi-2-orange
gate_mode: random
dtype: float16
experts:
  - source_model: cognitivecomputations/dolphin-2_6-phi-2
    positive_prompts: [""]
  - source_model: rhysjones/phi-2-orange
    positive_prompts: [""]
```

## 💻 Usage

```python
# Notebook cell: install dependencies (drop the leading "!" in a shell)
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "paulilioaica/PhiMiX-2x2B-raw"

torch.set_default_device("cuda")

config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)

# Load the merged weights. `from_config` would only build the architecture
# with randomly initialized parameters, so `from_pretrained` is used here.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    trust_remote_code=True
)

instruction = '''
    def print_prime(n):
        """
        Print all primes between 1 and n
        """
'''

tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    trust_remote_code=True
)

# Tokenize the input string
inputs = tokenizer(
    instruction,
    return_tensors="pt",
    return_attention_mask=False
)

# Generate text using the model
outputs = model.generate(**inputs, max_length=200)

# Decode and print the output
text = tokenizer.batch_decode(outputs)[0]
print(text)
```
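
A note on `gate_mode: random` from the configuration: with this setting the router (gate) weights are initialized randomly rather than derived from the `positive_prompts`, which is why the model is described as raw and meant to be finetuned. The toy sketch below illustrates the idea in plain Python. It is not the model's actual inference code: the names `make_random_gate`, `route`, and `moe_forward`, the top-k value, the tiny hidden size, and the stand-in experts are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_random_gate(hidden_size, num_experts, seed=0):
    """Hypothetical router: one randomly initialized weight vector per expert."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 0.02) for _ in range(hidden_size)]
        for _ in range(num_experts)
    ]


def route(hidden_state, gate, top_k=2):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    logits = [sum(w * h for w, h in zip(row, hidden_state)) for row in gate]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    weights = softmax([logits[i] for i in top])  # renormalize over the chosen experts
    return list(zip(top, weights))


def moe_forward(hidden_state, gate, experts, top_k=2):
    """Weighted sum of the selected experts' outputs for a single token."""
    out = [0.0] * len(hidden_state)
    for idx, weight in route(hidden_state, gate, top_k):
        expert_out = experts[idx](hidden_state)
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out


# Stand-in "experts": in the real model these are the MLP blocks of the two source models.
experts = [
    lambda h: [2.0 * x for x in h],
    lambda h: [x + 1.0 for x in h],
]
gate = make_random_gate(hidden_size=4, num_experts=2)
token = [0.5, -1.0, 0.25, 2.0]

print(route(token, gate, top_k=2))
print(moe_forward(token, gate, experts, top_k=2))
```

Because the gate is random, the mixing weights carry no information about which expert suits a given token. Fine-tuning is what would turn the router into a meaningful selector.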