Quantization made by Richard Erkhov.

[GitHub](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)

LLaMAntino-3-ANITA-8B-Inst-DPO-ITA - GGUF
- Model creator: https://huggingface.co/swap-uniba/
- Original model: https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q2_K.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q2_K.gguf) | Q2_K | 2.96GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ3_XS.gguf) | IQ3_XS | 3.28GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ3_S.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ3_S.gguf) | IQ3_S | 3.43GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q3_K_S.gguf) | Q3_K_S | 3.41GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ3_M.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ3_M.gguf) | IQ3_M | 3.52GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q3_K.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q3_K.gguf) | Q3_K | 3.74GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q3_K_M.gguf) | Q3_K_M | 3.74GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q3_K_L.gguf) | Q3_K_L | 4.03GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ4_XS.gguf) | IQ4_XS | 4.18GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_0.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_0.gguf) | Q4_0 | 4.34GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.IQ4_NL.gguf) | IQ4_NL | 4.38GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_K_S.gguf) | Q4_K_S | 4.37GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_K.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_K.gguf) | Q4_K | 4.58GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_K_M.gguf) | Q4_K_M | 4.58GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_1.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q4_1.gguf) | Q4_1 | 4.78GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_0.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_0.gguf) | Q5_0 | 5.21GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_K_S.gguf) | Q5_K_S | 5.21GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_K.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_K.gguf) | Q5_K | 5.34GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_K_M.gguf) | Q5_K_M | 5.34GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_1.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q5_1.gguf) | Q5_1 | 5.65GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q6_K.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q6_K.gguf) | Q6_K | 6.14GB |
| [LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q8_0.gguf](https://huggingface.co/RichardErkhov/swap-uniba_-_LLaMAntino-3-ANITA-8B-Inst-DPO-ITA-gguf/blob/main/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA.Q8_0.gguf) | Q8_0 | 7.95GB |

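The sizes above map roughly onto effective bits per weight: Llama 3 8B has roughly 8.03 billion parameters, so dividing a file's size by the parameter count estimates the average precision of its weights. The helper below is our own back-of-the-envelope illustration, not part of the release:

```python
# Rough effective bits-per-weight estimate for a GGUF quant, assuming
# ~8.03e9 parameters (Llama 3 8B). Illustrative helper only; it ignores
# metadata overhead and mixed-precision tensors.
def bits_per_weight(file_size_gb: float, n_params: float = 8.03e9) -> float:
    return file_size_gb * 1e9 * 8 / n_params

print(round(bits_per_weight(4.58), 2))  # Q4_K_M -> ~4.56 bits/weight
print(round(bits_per_weight(7.95), 2))  # Q8_0   -> ~7.92 bits/weight
```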

Original model description:
---
language:
- en
- it
license: llama3
library_name: transformers
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
- llamantino
base_model: meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- gsarti/clean_mc4_it
- Chat-Error/wizard_alpaca_dolly_orca
- mlabonne/orpo-dpo-mix-40k
metrics:
- accuracy
model_creator: Marco Polignano - SWAP Research Group
pipeline_tag: text-generation
model-index:
- name: LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 74.57
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 92.75
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 66.85
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 75.93
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 82.0
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 58.61
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA
      name: Open LLM Leaderboard
---
<img src="https://cdn-uploads.huggingface.co/production/uploads/5df8bb21da6d0311fd3d540f/cZoZdwQOPdQsnQmDXHcSn.png" alt="llamantino3_anita" border="0" width="800px">

<hr>
<!--<img src="https://i.ibb.co/6mHSRm3/llamantino53.jpg" width="200"/>-->
<h3><i>"Built with <b>Meta Llama 3</b>".</i></h3>
<p style="text-align:justify;"><b>LLaMAntino-3-ANITA-8B-Inst-DPO-ITA</b> is a model of the <a href="https://huggingface.co/swap-uniba"><b>LLaMAntino</b></a> <i>Large Language Models family</i>.
The model is an instruction-tuned version of <a href="https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct"><b>Meta-Llama-3-8B-Instruct</b></a> (a fine-tuned <b>LLaMA 3</b> model).
This version aims to be a <b>Multilingual Model</b> 🏁 (EN 🇺🇸 + ITA 🇮🇹), suitable for further fine-tuning on specific tasks in Italian.</p>

The 🌟**ANITA project**🌟 *(**A**dvanced **N**atural-based interaction for the **ITA**lian language)*
aims to provide Italian NLP researchers with an improved model for Italian-language 🇮🇹 use cases.<br>

<hr>

**Live DEMO:** [https://chat.llamantino.it/](https://chat.llamantino.it/)<br>
*It is reachable only from an Italian internet connection.*

<hr>

## Model Details
*Last Update: 10/05/2024*<br>

<a href="https://github.com/marcopoli/LLaMAntino-3-ANITA"><img src="https://github.githubassets.com/assets/GitHub-Logo-ee398b662d42.png" width="150"> https://github.com/marcopoli/LLaMAntino-3-ANITA</a><br>

| Model | HF | GGUF | EXL2 |
|-------|-------|-------|-------|
| *swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA* | [Link](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA) | [Link](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA_GGUF) | [Link](https://huggingface.co/swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA_EXL2) |

<hr>

## Specifications

- **Model developers**: <br><a href="https://marcopoli.github.io/">Ph.D. Marco Polignano</a> - University of Bari Aldo Moro, Italy <br> <a href="https://huggingface.co/swap-uniba">SWAP Research Group</a> <br>
- **Variations**: The released model was **supervised fine-tuned (SFT)** with **QLoRA** (4-bit) on instruction-based datasets, then aligned with human preferences for helpfulness and safety via **DPO** over the *mlabonne/orpo-dpo-mix-40k* dataset.
- **Input**: The model accepts text only.
- **Language**: Multilingual 🏁 + Italian 🇮🇹
- **Output**: The model generates text and code only.
- **Model Architecture**: *Llama 3 architecture*.
- **Context length**: 8K (8,192 tokens).
- **Library Used**: [Unsloth](https://unsloth.ai/)

<hr>

## Playground

There are several ways to use the model directly; pick whichever of the following suits you.

### Prompt Template
```
<|start_header_id|>system<|end_header_id|>

{ SYS Prompt }<|eot_id|><|start_header_id|>user<|end_header_id|>

{ USER Prompt }<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{ ASSIST Prompt }<|eot_id|>
```

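For reference, the template above can also be assembled by hand, e.g. when a tokenizer's `apply_chat_template` is unavailable. The `build_prompt` helper below is our own hypothetical sketch, not part of the model's API:

```python
# Build a Llama-3-style chat prompt by hand (sketch; in practice
# tokenizer.apply_chat_template does this for you and also handles
# the <|begin_of_text|> BOS token).
def build_prompt(system: str, user: str) -> str:
    return (
        f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>"
        f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>"
        f"<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(build_prompt("Sei un assistente AI.", "Chi è Carlo Magno?"))
```

The prompt ends with the open assistant header, so generation continues as the assistant's reply.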
### Transformers

To use the model directly with `transformers`, follow these steps.

- First, install the required libraries with `pip`:

```bash
pip install -U transformers trl peft accelerate bitsandbytes
```

- You can then start using the model directly:

```python
import torch
import transformers
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
)

base_model = "swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

sys = "Sei un assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA " \
      "(Advanced Natural-based interaction for the ITAlian language)." \
      " Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."

messages = [
    {"role": "system", "content": sys},
    {"role": "user", "content": "Chi è Carlo Magno?"}
]

# Method 1: manual tokenization and generation
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
inputs = {k: v.to(model.device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
results = tokenizer.batch_decode(outputs)[0]
print(results)

# Method 2: text-generation pipeline
pipe = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,  # return only the newly generated text
    task='text-generation',
    max_new_tokens=512,      # max number of tokens to generate in the output
    temperature=0.6,         # lower temperature -> less creative answers
    do_sample=True,
    top_p=0.9,
)

sequences = pipe(messages)
for seq in sequences:
    print(f"{seq['generated_text']}")
```

- Additionally, you can load the model with **4-bit quantization** to reduce the required resources. Start with the code below:

```python
import torch
import transformers
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

base_model = "swap-uniba/LLaMAntino-3-ANITA-8B-Inst-DPO-ITA"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

sys = "Sei un assistente AI per la lingua Italiana di nome LLaMAntino-3 ANITA " \
      "(Advanced Natural-based interaction for the ITAlian language)." \
      " Rispondi nella lingua usata per la domanda in modo chiaro, semplice ed esaustivo."

messages = [
    {"role": "system", "content": sys},
    {"role": "user", "content": "Chi è Carlo Magno?"}
]

# Method 1: manual tokenization and generation
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
inputs = {k: v.to(model.device) for k, v in inputs.items()}
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, top_p=0.9, temperature=0.6)
results = tokenizer.batch_decode(outputs)[0]
print(results)

# Method 2: text-generation pipeline
pipe = transformers.pipeline(
    model=model,
    tokenizer=tokenizer,
    return_full_text=False,  # return only the newly generated text
    task='text-generation',
    max_new_tokens=512,      # max number of tokens to generate in the output
    temperature=0.6,         # lower temperature -> less creative answers
    do_sample=True,
    top_p=0.9,
)

sequences = pipe(messages)
for seq in sequences:
    print(f"{seq['generated_text']}")
```

<hr>

## Evaluation

**Open LLM Leaderboard:**

Evaluated with the lm-evaluation-harness for the [**Open Italian LLMs Leaderboard**](https://huggingface.co/spaces/FinancialSupport/open_ita_llm_leaderboard):
```bash
lm_eval --model hf --model_args pretrained=HUGGINGFACE_MODEL_ID --tasks hellaswag_it,arc_it --device cuda:0 --batch_size auto:2
lm_eval --model hf --model_args pretrained=HUGGINGFACE_MODEL_ID --tasks m_mmlu_it --num_fewshot 5 --device cuda:0 --batch_size auto:2
```

| Metric | Value |
|-----------------------|---------------------------|
| Avg. | **0.6160** |
| Arc_IT | 0.5714 |
| Hellaswag_IT | 0.7093 |
| MMLU_IT | 0.5672 |
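The Avg. row appears to be the unweighted mean of the three Italian task scores; a quick sanity check (ours):

```python
# Sanity check: the Avg. row matches the unweighted mean of the three tasks.
scores = {"arc_it": 0.5714, "hellaswag_it": 0.7093, "m_mmlu_it": 0.5672}
avg = sum(scores.values()) / len(scores)
print(round(avg, 4))  # 0.616
```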

<hr>

## Unsloth

<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/made with unsloth.png" width="200px" align="center" />

[Unsloth](https://unsloth.ai) is a great tool that helped us develop this model easily and at a lower cost than expected.

## Citation instructions
```bibtex
@misc{polignano2024advanced,
      title={Advanced Natural-based interaction for the ITAlian language: LLaMAntino-3-ANITA},
      author={Marco Polignano and Pierpaolo Basile and Giovanni Semeraro},
      year={2024},
      eprint={2405.07101},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

```bibtex
@misc{basile2023llamantino,
      title={LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language},
      author={Pierpaolo Basile and Elio Musacchio and Marco Polignano and Lucia Siciliani and Giuseppe Fiameni and Giovanni Semeraro},
      year={2023},
      eprint={2312.09993},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

```bibtex
@article{llama3modelcard,
      title={Llama 3 Model Card},
      author={AI@Meta},
      year={2024},
      url={https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
```

# Acknowledgments
We acknowledge the support of the PNRR project [FAIR - Future AI Research (PE00000013)](https://fondazione-fair.it/en/foundation/), Spoke 6 - Symbiotic AI (CUP H97G22000210007) under the NRRP MUR program funded by NextGenerationEU.
Models are built on the Leonardo supercomputer with the support of the CINECA Italian Super Computing Resource Allocation, class C project IscrC\_Pro\_MRS (HP10CQO70G).
<img src="https://wiki.u-gov.it/confluence/download/attachments/49842317/image2022-6-21_11-11-44.png?version=1&modificationDate=1655802705000&api=v2" width="600px">

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_swap-uniba__LLaMAntino-3-ANITA-8B-Inst-DPO-ITA).

| Metric |Value|
|---------------------------------|----:|
|Avg. |75.12|
|AI2 Reasoning Challenge (25-Shot)|74.57|
|HellaSwag (10-Shot) |92.75|
|MMLU (5-Shot) |66.85|
|TruthfulQA (0-shot) |75.93|
|Winogrande (5-shot) |82.00|
|GSM8k (5-shot) |58.61|