THIS MODEL IS EXPERIMENTAL AND MIGHT BE BUGGY; I HAVEN'T PERFECTED THE STRENGTH OF THE DPO AND SFT STAGES YET.
Submitting to the Open LLM Leaderboard with yi-34b-200k-llamafied as the base model, to see whether merging a LoRA over another LoRA makes a difference when both have the same lora_r, or whether it doesn't matter.

Another AEZAKMI v2 finetune, this time over Yi-34B-200K-rawrr-r3. I was able to squeeze in a sequence length of 2200 using Unsloth; the script I used is in this repo. Training took around 18 hours on a local RTX 3090 Ti. I will be uploading fp16 and exl2 versions soon. So far it seems like de-contaminating Yi worked nicely. This LoRA goes over the Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3 LoRA, so first get the llamafied Yi-34B-200K, merge in Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3, then merge in this LoRA (a rough sketch of the merge order follows below).
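A minimal sketch of that merge order using transformers + peft; the repo paths below are placeholders (assumptions), so point them at wherever you keep the llamafied base and the two LoRAs:

```python
# Sketch of the two-step LoRA merge described above, using transformers + peft.
# All three paths are placeholders — substitute the llamafied Yi-34B-200K base,
# the rawrr DPO LoRA, and this AEZAKMI v2 LoRA.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "path/to/yi-34b-200k-llamafied"                            # llamafied base (placeholder)
rawrr_lora = "path/to/Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3"   # DPO LoRA (placeholder)
aezakmi_lora = "path/to/this-aezakmi-v2-lora"                        # this repo's LoRA (placeholder)

# Load the llamafied base in fp16.
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)

# Step 1: merge the rawrr DPO LoRA into the base weights.
model = PeftModel.from_pretrained(model, rawrr_lora)
model = model.merge_and_unload()

# Step 2: merge this AEZAKMI v2 LoRA on top of the result.
model = PeftModel.from_pretrained(model, aezakmi_lora)
model = model.merge_and_unload()

# Save the fully merged fp16 model plus tokenizer.
out_dir = "yi-34b-200k-rawrr-aezakmi-v2-merged"
model.save_pretrained(out_dir)
AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
```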

Credits to mlabonne (I used pieces of his Mistral fine-tuning script for dataset preparation), and to Daniel Han and Michael Han (the Unsloth AI team).

made with Unsloth
