Product Captioining Model
Given a product-image, this model can create accurate description about the image, describing the following criterions:
- Surface where object is located
- Surrounding objects
- Background
- Lighting
- Overall mood
The model was trained with a custom dataset tailored for this usecase.
Examples
Generated prompt: Professional photo of an object on a stone podium which is on a marble table, a wall in the background, a palm leaf in the corner, a harsh shadow from the left side, a concrete wall in the background, minimalist mood
Generated prompt: Professional photo of an object on a wooden table, bokeh background, soft daylight
Generated prompt: Professional photo of an object on a marble podium which is on a jungle clearing, surrounded by palm trees and lush greenery, a misty mountain range in the background, a cloudy sky
Uploaded model
- Developed by: Vimax97
- License: apache-2.0
- Finetuned from model : unsloth/llama-3.2-11b-vision-instruct-unsloth-bnb-4bit
This mllama model was trained 2x faster with Unsloth and Huggingface's TRL library.