Finetuning script for Phi3.5-Vision
#24 by 2U1
https://github.com/2U1/Phi3-Vision-Finetune
I made a fine-tuning script for Phi3.5-Vision. It supports single-image, multi-image, and video datasets.
You can select which modules to fine-tune (vision encoder, LLM, projector) and set a different learning rate for each.
Feedback and issues are welcome!
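For anyone curious how per-module learning rates work, here is a minimal PyTorch sketch using optimizer parameter groups. The submodule names (`vision_tower`, `projector`, `language_model`) and the helper `build_optimizer` are illustrative assumptions, not the actual attribute names or API of the repo:

```python
import torch
import torch.nn as nn

class TinyVLM(nn.Module):
    """Stand-in for a vision-language model with three trainable parts.
    The real Phi3.5-Vision modules have different names and shapes."""
    def __init__(self):
        super().__init__()
        self.vision_tower = nn.Linear(8, 8)    # placeholder vision encoder
        self.projector = nn.Linear(8, 8)       # placeholder vision->LLM projector
        self.language_model = nn.Linear(8, 8)  # placeholder LLM

def build_optimizer(model, lr_vision=2e-6, lr_projector=1e-5, lr_llm=1e-5):
    # One parameter group per module, each with its own learning rate.
    groups = [
        {"params": model.vision_tower.parameters(), "lr": lr_vision},
        {"params": model.projector.parameters(), "lr": lr_projector},
        {"params": model.language_model.parameters(), "lr": lr_llm},
    ]
    return torch.optim.AdamW(groups)

model = TinyVLM()
# To skip training a module entirely, freeze its parameters:
for p in model.vision_tower.parameters():
    p.requires_grad = False

opt = build_optimizer(model)
print([g["lr"] for g in opt.param_groups])
```

Freezing a module plus grouping parameters this way lets you, for example, train the projector aggressively while barely touching (or not touching) the pretrained vision encoder.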
Can you share the fine-tuning script? I want to train this model to recognize text and emojis, as well as the corresponding layout.
Can it be used for VQA tasks?