Barcenas Tiny 1.1b DPO
This model is based on the well-known TinyLlama/TinyLlama-1.1B-Chat-v1.0 and was trained with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset.
With this preference-based training, we hope to substantially improve on the base Tiny model and get better responses from a model that stays small and accessible to most people.
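For readers new to DPO, the objective can be sketched for a single preference pair as below. This is only a minimal illustration of the standard DPO loss, not the training script used for this model: the function name `dpo_loss`, the `beta=0.1` default, and the example log-probabilities are all assumptions chosen for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under the frozen reference model.
    beta (assumed value) controls how far the policy may drift from the reference.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Hypothetical log-probabilities, for illustration only.
untrained = dpo_loss(-12.0, -12.0, -12.0, -12.0)  # policy equals reference
improved = dpo_loss(-10.0, -15.0, -12.0, -12.0)   # policy now prefers "chosen"
print(untrained, improved)
```

When the policy matches the reference the loss equals log 2, and it decreases as the policy assigns relatively more probability to the chosen response than to the rejected one.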
Many thanks to Maxime Labonne (mlabonne) for his tutorial on how to train an LLM using DPO. Without his tutorial, this model would not have been possible.
Made with ❤️ in Guadalupe, Nuevo Leon, Mexico 🇲🇽