Barcenas Tiny 1.1b DPO

Barcenas Tiny 1.1b DPO is based on the well-known TinyLlama/TinyLlama-1.1B-Chat-v1.0 and fine-tuned with DPO (Direct Preference Optimization) on the Intel/orca_dpo_pairs dataset.
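As a quick way to try it, here is a minimal inference sketch using standard transformers chat-template usage (the prompt and generation parameters are illustrative assumptions, not an official example from this card):

```python
# Minimal inference sketch: assumes a recent transformers version and the
# Zephyr-style chat template inherited from TinyLlama-Chat.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="Danielbrdz/Barcenas-Tiny-1.1b-DPO",
    torch_dtype=torch.float16,  # the weights are published in FP16
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a friendly assistant."},
    {"role": "user", "content": "Explain DPO in one sentence."},
]
prompt = pipe.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
outputs = pipe(prompt, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.95)
print(outputs[0]["generated_text"])
```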

With this preference-based training, we hope to substantially improve the Tiny model, giving it better responses while keeping a small size that remains accessible to most people.

Many thanks to Maxime Labonne (mlabonne) for his tutorial on training an LLM with DPO; without it, this model would not have been possible.
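In the spirit of that tutorial, the sketch below shows what a DPO fine-tune of the base model on Intel/orca_dpo_pairs might look like with the TRL library. The hyperparameters and column mapping are assumptions for illustration, not the exact recipe used for this model:

```python
# Minimal DPO training sketch with TRL (assumes a recent TRL version where
# DPOConfig carries beta and DPOTrainer takes processing_class).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def to_dpo_format(row):
    # Intel/orca_dpo_pairs has "system", "question", "chosen", "rejected";
    # wrap the prompt in the chat template, keep the completions as-is.
    messages = [
        {"role": "system", "content": row["system"]},
        {"role": "user", "content": row["question"]},
    ]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    return {"prompt": prompt, "chosen": row["chosen"], "rejected": row["rejected"]}

dataset = load_dataset("Intel/orca_dpo_pairs", split="train").map(to_dpo_format)

config = DPOConfig(
    output_dir="barcenas-tiny-dpo",
    beta=0.1,  # strength of the KL penalty against the frozen reference model
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=5e-5,
    num_train_epochs=1,  # illustrative values, not the actual training config
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```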

Made with ❤️ in Guadalupe, Nuevo Leon, Mexico 🇲🇽
