Model Card for AgoraX/Qwen-VL-FNCall-qlora

Model Details

Model Description

  • Developed by: Agora Research
  • Model type: Vision Language Model
  • Language(s): English, Chinese
  • Finetuned from model: Qwen-VL

Uses

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer
from transformers.generation import GenerationConfig

Note: The default behavior now has injection attack prevention off.

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-VL", trust_remote_code=True)

model = AutoPeftModelForCausalLM.from_pretrained(
    "Qwen-VL-FNCall-qlora/", # path to the output directory
    device_map="cuda",
    fp16=True,
    trust_remote_code=True
).eval()

Specify hyperparameters for generation (only needed if transformers < 4.32.0):

#model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-VL-Chat", trust_remote_code=True)


# 1st dialogue turn
query = tokenizer.from_list_format([
    {'image': 'https://images.rawpixel.com/image_800/cHJpdmF0ZS9sci9pbWFnZXMvd2Vic2l0ZS8yMDIzLTA4L3Jhd3BpeGVsX29mZmljZV8xNV9waG90b19vZl9hX2RvZ19ydW5uaW5nX3dpdGhfb3duZXJfYXRfcGFya19lcF9mM2I3MDQyZC0zNWJlLTRlMTQtOGZhNy1kY2Q2OWQ1YzQzZjlfMi5qcGc.jpg'}, # either a local path or a URL
    {'text': "[FUNCTION CALL]"},
])
print("sending model to chat")
response, history = model.chat(tokenizer, query=query, history=None)
print(response)

Example output

[FUNCTION CALL]
{
  'type': 'object',
  'properties': {
    'puppy_colors': {
      'type': 'array',
      'description': 'The colors of the puppies in the image.',
      'items': {
        'type': 'string',
        'enum': ['golden']
      }
    },
    'puppy_posture': {
      'type': 'string',
      'description': 'The posture of the puppies in the image.',
      'enum': ['sitting']
    },
    'puppy_expression': {
      'type': 'string',
      'description': 'The expression of the puppies in the image.',
      'enum': ['smiling']
    },
    'puppy_location': {
      'type': 'string',
      'description': 'The location of the puppies in the image.',
      'enum': ['on a green field with orange flowers']
    },
    'puppy_background': {
      'type': 'string',
      'description': 'The background of the puppies in the image.',
      'enum': ['green field with orange flowers']
    }
  }
}

[EXPECTED OUTPUT]
{
  'puppy_colors': ['golden'],
  'puppy_posture': 'sitting',
  'puppy_expression': 'smiling',
  'puppy_location': 'on a green field with orange flowers',
  'puppy_background': 'green field with orange flowers'
}
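
The response is a Python-style dict string (single quotes), so one way to consume it is with ast.literal_eval. A minimal sketch, assuming the response follows the [FUNCTION CALL] / [EXPECTED OUTPUT] layout shown above:

import ast

# Split the response into the schema and the filled-in values; assumes
# the [FUNCTION CALL] / [EXPECTED OUTPUT] layout shown above.
schema_part, output_part = response.split("[EXPECTED OUTPUT]")
schema = ast.literal_eval(schema_part.replace("[FUNCTION CALL]", "").strip())
values = ast.literal_eval(output_part.strip())
print(values['puppy_posture'])  # 'sitting'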

Direct Use

Send an image and include [FUNCTION CALL] in the text prompt to trigger structured output. The adapter can also be used for normal Qwen-VL inference, as sketched below.
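
For normal inference, build the same kind of query without the [FUNCTION CALL] trigger. A minimal sketch; the image path and question are placeholders:

# Plain Qwen-VL inference: no [FUNCTION CALL] trigger in the text.
query = tokenizer.from_list_format([
    {'image': 'path/to/your/image.jpg'},  # placeholder path
    {'text': 'What is happening in this image?'},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)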

Recommendations

  • transformers >= 4.32.0 (recommended)
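
For example, assuming a pip-based environment (the PEFT pin matches the framework version listed below):

pip install "transformers>=4.32.0" peft==0.7.1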

How to Get Started with the Model

query = tokenizer.from_list_format([
    {'image': 'https://images.rawpixel.com/image_800/cHJpdmF0ZS9sci9pbWFnZXMvd2Vic2l0ZS8yMDIzLTA4L3Jhd3BpeGVsX29mZmljZV8xNV9waG90b19vZl9hX2RvZ19ydW5uaW5nX3dpdGhfb3duZXJfYXRfcGFya19lcF9mM2I3MDQyZC0zNWJlLTRlMTQtOGZhNy1kY2Q2OWQ1YzQzZjlfMi5qcGc.jpg'}, # either a local path or a URL
    {'text': "[FUNCTION CALL]"},
])
response, history = model.chat(tokenizer, query=query, history=None)
print(response)

Training Details

Training Data

https://huggingface.co./datasets/AgoraX/OpenImage-FNCall-50k
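
Assuming the dataset follows the standard Hub layout, it can be loaded with the datasets library:

from datasets import load_dataset

# Loads the default configuration of the training set.
ds = load_dataset("AgoraX/OpenImage-FNCall-50k")
print(ds)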

Training Procedure

QLoRA fine-tuning for 1 epoch (1,000 steps).
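
The exact adapter configuration is not published in this card. As a rough sketch, a QLoRA setup with PEFT 0.7.1 combines 4-bit quantization with a LoRA adapter; all hyperparameter values below are illustrative assumptions, not the recorded training config:

import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Illustrative LoRA hyperparameters; target_modules assumes Qwen-VL's
# attention/MLP layer names.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn", "attn.c_proj", "w1", "w2"],
    task_type="CAUSAL_LM",
)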

Framework versions

  • PEFT 0.7.1