---
language:
- en
tags:
- llama-2
pipeline_tag: text-generation
model-index:
- name: llama-2-13b-guanaco-fp16
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 60.92
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=Mikael110/llama-2-13b-guanaco-fp16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 83.18
      name: normalized accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=Mikael110/llama-2-13b-guanaco-fp16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 54.58
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=Mikael110/llama-2-13b-guanaco-fp16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 44.0
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=Mikael110/llama-2-13b-guanaco-fp16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 74.9
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=Mikael110/llama-2-13b-guanaco-fp16
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 11.6
      name: accuracy
    source:
      url: https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard?query=Mikael110/llama-2-13b-guanaco-fp16
      name: Open LLM Leaderboard
---
This is a Llama-2 version of [Guanaco](https://huggingface.co./timdettmers/guanaco-13b). It was finetuned from the base [Llama-2-13b](https://huggingface.co./meta-llama/Llama-2-13b-hf) model using the official training scripts found in the [QLoRA repo](https://github.com/artidoro/qlora). I wanted it to be as faithful as possible and therefore changed nothing in the training script beyond the model it was pointing to. The prompt format is therefore also the same as that of the original Guanaco model.

This repo contains the merged fp16 model. The QLoRA adapter can be found [here](https://huggingface.co./Mikael110/llama-2-13b-guanaco-qlora). A 7b version of the model can be found [here](https://huggingface.co./Mikael110/llama-2-7b-guanaco-fp16).

**Legal Disclaimer: This model is bound by the usage restrictions of the original Llama-2 model, and comes with no warranty or guarantees of any kind.**
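As a quick reference, the sketch below shows one way to load the merged fp16 weights with `transformers` and prompt the model in the Guanaco format (`### Human: ... ### Assistant:`). This is a minimal example, not part of the official training scripts; the question text is placeholder, and you should adjust dtype and device settings for your hardware (`device_map="auto"` requires the `accelerate` package).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Mikael110/llama-2-13b-guanaco-fp16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # load the weights in their stored (fp16) precision
    device_map="auto",   # spread layers across available GPUs via accelerate
)

# Guanaco-style prompt, matching the format of the original model.
prompt = "### Human: What is the capital of France?\n### Assistant:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```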
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/details_Mikael110__llama-2-13b-guanaco-fp16).

| Metric                           | Value |
|----------------------------------|------:|
| Avg.                             | 54.86 |
| AI2 Reasoning Challenge (25-Shot)| 60.92 |
| HellaSwag (10-Shot)              | 83.18 |
| MMLU (5-Shot)                    | 54.58 |
| TruthfulQA (0-shot)              | 44.00 |
| Winogrande (5-shot)              | 74.90 |
| GSM8k (5-shot)                   | 11.60 |