unsloth/Llama-3.2-11B-Vision-Instruct (Fine-Tuned)

Model Overview

This model, fine-tuned from the unsloth/Llama-3.2-11B-Vision-Instruct base, is optimized for vision-language tasks with enhanced instruction-following capabilities. Fine-tuning used the Unsloth framework together with Hugging Face's TRL library, training roughly 2x faster than a standard setup while maintaining performance.

Key Information

  • Developed by: Daemontatox
  • Base Model: unsloth/Llama-3.2-11B-Vision-Instruct
  • License: Apache-2.0
  • Language: English (en)
  • Frameworks Used: Hugging Face Transformers, Unsloth, and TRL

Performance and Use Cases

This model is ideal for applications involving:

  • Vision-based text generation and description tasks
  • Instruction-following in multimodal contexts
  • General-purpose text generation with enhanced reasoning

Features

  • 2x Faster Training: Leveraging the Unsloth framework for accelerated fine-tuning.
  • Multimodal Capabilities: Enhanced to handle vision-language interactions.
  • Instruction Optimization: Tailored for improved comprehension and execution of instructions.
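
As a sketch of what "multimodal instruction-following" looks like in practice, the structure below is the message format that Transformers chat templating expects for Llama 3.2 Vision models (the exact prompt string is produced by the model's processor; the wording of the instruction is illustrative):

```python
# Sketch of the multimodal chat message structure for Llama 3.2 Vision:
# each user turn lists its content parts, with an image placeholder
# followed by the text instruction. The actual image is supplied
# separately to the processor.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},  # placeholder for the image input
            {"type": "text", "text": "Describe the image."},
        ],
    }
]

# processor.apply_chat_template(messages, add_generation_prompt=True)
# would turn this structure into the model's prompt string.
print(messages[0]["role"])        # user
print(messages[0]["content"][0])  # {'type': 'image'}
```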

How to Use

Inference Example (Hugging Face Transformers)

Llama 3.2 Vision checkpoints load with `MllamaForConditionalGeneration` and an `AutoProcessor` (rather than `AutoModelForCausalLM` with a plain tokenizer), so an image can be passed alongside the text prompt:

```python
from PIL import Image
from transformers import MllamaForConditionalGeneration, AutoProcessor

model_id = "Daemontatox/DocumentCogito"
model = MllamaForConditionalGeneration.from_pretrained(model_id, device_map="auto")
processor = AutoProcessor.from_pretrained(model_id)

# Load the image to describe (replace with your own file).
image = Image.open("sunset_over_mountains.jpg")
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe the image."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=100)
print(processor.decode(outputs[0], skip_special_tokens=True))
```

[Open LLM Leaderboard Evaluation Results](https://huggingface.co./spaces/open-llm-leaderboard/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/Daemontatox__DocumentCogito-details)!
Summarized results can be found [here](https://huggingface.co./datasets/open-llm-leaderboard/contents/viewer/default/train?q=Daemontatox%2FDocumentCogito&sort[column]=Average%20%E2%AC%86%EF%B8%8F&sort[direction]=desc)!

|      Metric       |Value (%)|
|-------------------|--------:|
|**Average**        |    24.21|
|IFEval (0-shot)    |    50.64|
|BBH (3-shot)       |    29.79|
|MATH Lvl 5 (4-shot)|    16.24|
|GPQA (0-shot)      |     8.84|
|MuSR (0-shot)      |     8.60|
|MMLU-PRO (5-shot)  |    31.14|
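
The reported Average is the arithmetic mean of the six task scores and can be checked directly (values copied from the table above):

```python
# Verify that the leaderboard Average is the mean of the six task scores.
scores = {
    "IFEval (0-shot)": 50.64,
    "BBH (3-shot)": 29.79,
    "MATH Lvl 5 (4-shot)": 16.24,
    "GPQA (0-shot)": 8.84,
    "MuSR (0-shot)": 8.60,
    "MMLU-PRO (5-shot)": 31.14,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 24.21
```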
Model Details

  • Model size: 10.7B parameters
  • Tensor type: BF16 (Safetensors)