"Built with Meta Llama 3".
LLaMAntino-3-ANITA-8B-Inst-DPO-ITA is a model of the LLaMAntino - Large Language Models family. The model is an instruction-tuned version of Meta-Llama-3-8b-instruct (a fine-tuned LLaMA 3 model). This model version aims to be the a Multilingual Model ๐ (EN ๐บ๐ธ + ITA๐ฎ๐น) to further fine-tuning on Specific Tasks in Italian.
The ๐ANITA project๐ *(Advanced Natural-based interaction for the ITAlian language)* wants to provide Italian NLP researchers with an improved model for the Italian Language ๐ฎ๐น use cases.
Model Details
https://github.com/marcopoli/LLaMAntino-3-ANITA
- Full Model: LaMAntino-3-ANITA-8B-Inst-DPO-ITA
- ExLlamaV2 - 3.0bpw model
- ExLlamaV2 - 4.0bpw model
- ExLlamaV2 - 4.5bpw model
- ExLlamaV2 - measurement.json
Specifications
- Model developers:
Ph.D. Marco Polignano - University of Bari Aldo Moro, Italy
SWAP Research Group - Variations: The model release has been supervised fine-tuning (SFT) using QLoRA 4bit, on instruction-based datasets. DPO approach over the mlabonne/orpo-dpo-mix-40k dataset is used to align with human preferences for helpfulness and safety.
- Input: Models input text only.
- Language: Multilingual ๐ + Italian ๐ฎ๐น
- Output: Models generate text and code only.
- Model Architecture: Llama 3 architecture.
- Context length: 8K, 8192.
- Library Used: LLaMA.cpp
Prompt Template
<|start_header_id|>system<|end_header_id|>
{ SYS Prompt }<|eot_id|><|start_header_id|>user<|end_header_id|>
{ USER Prompt }<|eot_id|><|start_header_id|>assistant<|end_header_id|>
{ ASSIST Prompt }<|eot_id|>
ExLlamaV2
ExLlamaV2, a great tool that helps us easily Quantize your model in EXL2 format.
Citation instructions
@misc{polignano2024advanced,
title={Advanced Natural-based interaction for the ITAlian language: LLaMAntino-3-ANITA},
author={Marco Polignano and Pierpaolo Basile and Giovanni Semeraro},
year={2024},
eprint={2405.07101},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@misc{basile2023llamantino,
title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language},
author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
year={2023},
eprint={2312.09993},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@article{llama3modelcard,
title={Llama 3 Model Card},
author={AI@Meta},
year={2024},
url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.
Model tree for swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA_EXL2
Base model
meta-llama/Meta-Llama-3-8B-Instruct