Product Captioining Model

Given a product-image, this model can create accurate description about the image, describing the following criterions:

Surface where object is located
Surrounding objects
Background
Lighting
Overall mood

The model was trained with a custom dataset tailored for this usecase.

Examples

Generated prompt: Professional photo of an object on a stone podium which is on a marble table, a wall in the background, a palm leaf in the corner, a harsh shadow from the left side, a concrete wall in the background, minimalist mood

Generated prompt: Professional photo of an object on a wooden table, bokeh background, soft daylight

Generated prompt: Professional photo of an object on a marble podium which is on a jungle clearing, surrounded by palm trees and lush greenery, a misty mountain range in the background, a cloudy sky

Uploaded model

Developed by: Vimax97
License: apache-2.0
Finetuned from model : unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit

This mllama model was trained 2x faster with Unsloth and Huggingface's TRL library.

Vimax97
/

llama_3.2_vision_product_descriptor_v2

You need to agree to share your contact information to access this model

Product Captioining Model

Examples

Uploaded model