ColPhi3.5
This model was fine-tuned from microsoft/Phi-3.5-vision-instruct on the data_dir/colpali_train_set dataset.
Model description
ColPhi3.5 is built on a novel model architecture and training strategy that uses Vision Language Models (VLMs) to index documents efficiently from their visual features. It extends Phi3.5-V-4B to generate ColBERT-style multi-vector representations of text and images. The approach was introduced in the paper ColPali: Efficient Document Retrieval with Vision Language Models.
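To illustrate what "ColBERT-style multi-vector representations" implies for retrieval, the sketch below scores a query against document pages with late interaction (MaxSim): each query vector is matched to its most similar page vector, and those maxima are summed. It is a minimal, dependency-free illustration and not this model's actual inference code. The function names (`maxsim_score`, `rank_documents`) and the toy 2-dimensional embeddings are hypothetical; real embeddings would come from the model and be much higher-dimensional.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: for every query vector, take the best
    dot product over all document vectors, then sum those maxima."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_documents(query_vecs: Sequence[Vector],
                   docs: Sequence[Sequence[Vector]]) -> List[int]:
    """Return document indices sorted from highest to lowest MaxSim score."""
    scores = [maxsim_score(query_vecs, d) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)


# Toy embeddings (hypothetical values, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]                 # two query token vectors
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]    # three patch vectors
page_b = [[0.6, 0.8], [0.8, 0.6]]                # two patch vectors

ranking = rank_documents(query, [page_b, page_a])
```

Because every query vector is free to match a different page vector, a page ranks highly when each part of the query finds a close counterpart somewhere on the page, rather than when a single pooled vector happens to be similar.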
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Model tree for yydxlv/colphi3.5
Base model
microsoft/Phi-3.5-vision-instruct