VideoChat2-TPO

This model is based on the paper "Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment".

πŸƒ Installation

pip install -r requirements.txt
python app.py
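After installing, it can help to confirm that the packages the loading code depends on are importable before downloading any weights. The following is a minimal sketch using only the Python standard library; the helper name `missing_packages` and the package list are illustrative assumptions, not part of this repository.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names from `names` that cannot be found."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed list: the usage code imports `transformers`; `torch` is assumed
    # to be its backend here. Adjust to match requirements.txt.
    missing = missing_packages(["transformers", "torch"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
```

If anything is reported missing, rerunning `pip install -r requirements.txt` in the active environment is the first thing to try.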

🔧 Usage

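from transformers import AutoModel, AutoTokenizer
from tokenizer import MultimodalLlamaTokenizer  # not part of transformers; expected from the repository code

model_path = "OpenGVLab/VideoChat-TPO"

# The model requires custom code execution, so trust_remote_code=True is needed.
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True, use_fast=False)

# Pass the tokenizer loaded above; .eval() puts the model in inference mode.
model = AutoModel.from_pretrained(model_path, trust_remote_code=True, _tokenizer=tokenizer).eval()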