FLMR model card
FLMR is an open-source model for multimodal knowledge retrieval. It is a transformer-based model that uses a combination of text and image inputs to retrieve relevant documents from a large corpus.
Model Details
Model Description
- Model type: FLMRModelForRetrieval
- Language(s) (NLP): English
- License: MIT License
Paper and resources for more detail
- Blog Post for quick overview: https://www.jinghong-chen.net/fined-grained-late-interaction-multimodal-retrieval-flmr/
- Paper: https://openreview.net/forum?id=IWWWulAX7g
- Repository: https://github.com/LinWeizheDragon/FLMR
Uses
Direct Use
This model can be used directly to retrieve documents from a large corpus using queries that combine text and image inputs. Retrieval usage is documented in the official implementation (see Repository above).
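To make the "late-interaction" idea in the paper title concrete, here is a minimal NumPy sketch of MaxSim-style scoring, the scoring rule commonly associated with late-interaction retrievers. It is not the FLMR API. The function names `late_interaction_score` and `rank_documents`, the toy embeddings, and the assumption that a multimodal query is a single matrix of token-level vectors (text and image-derived rows stacked together) are all illustrative. See the repository for the actual implementation.

```python
import numpy as np

def late_interaction_score(query_emb: np.ndarray, doc_emb: np.ndarray) -> float:
    """Sum, over query vectors, of the best dot-product match among document vectors."""
    sim = query_emb @ doc_emb.T          # shape: (num_query_vectors, num_doc_vectors)
    return float(sim.max(axis=1).sum())  # best match per query vector, then sum

def rank_documents(query_emb, doc_embs, k=2):
    """Return the top-k (document index, score) pairs, highest score first."""
    scores = [late_interaction_score(query_emb, d) for d in doc_embs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order[:k]]

# Toy data (hypothetical): two query vectors, e.g. one text token and one image-derived token.
query = np.array([[1.0, 0.0], [0.0, 1.0]])
docs = [
    np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]),  # matches both query vectors
    np.array([[1.0, 0.0], [1.0, 0.0]]),              # matches only the first
    np.array([[0.0, 0.5]]),                          # weak match to the second
]
top = rank_documents(query, docs, k=2)
print(top)
```

Because each query vector is matched independently, a document can score well by covering both the textual and the visual parts of the query, which is the intuition behind fine-grained matching.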
Downstream Use
This model can be combined with a language model to build a retrieval-augmented language model. Its use for knowledge-based VQA is demonstrated in RAVQA.
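As a rough illustration of retrieval augmentation in general, the sketch below places retrieved passages in front of a question before it is passed to a language model. It is a generic pattern under stated assumptions, not the RAVQA method: the function name `build_augmented_prompt`, the prompt layout, and the example passages are all hypothetical.

```python
from typing import List

def build_augmented_prompt(question: str, passages: List[str], max_passages: int = 3) -> str:
    """Prepend up to `max_passages` retrieved passages to the question."""
    selected = passages[:max_passages]  # passages are assumed to be sorted by retrieval score
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(selected))
    return f"Context:\n{context}\nQuestion: {question}\nAnswer:"

# Hypothetical retrieved passages for an image-grounded question.
prompt = build_augmented_prompt(
    "What sport is played with this equipment?",
    ["A racket is used in tennis.", "Tennis balls are yellow.", "Unrelated passage."],
    max_passages=2,
)
print(prompt)
```

The language model would then generate an answer conditioned on this prompt. How the image itself is passed to the generator is left out here and depends on the downstream system.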
How to Get Started with the Model
For details on training, indexing, and performing retrieval, please refer to the official repository (see Repository above).
Training datasets
The model is pre-trained on:
- Image to Text retrieval: WIT
- Image & Question to Text retrieval: OKVQA
For details on the dataset split and conversion process, please refer to the paper Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering.
The processed datasets are:
- https://huggingface.co/datasets/BByrneLab/OKVQA_FLMR_preprocessed_data
- https://huggingface.co/datasets/BByrneLab/OKVQA_FLMR_preprocessed_GoogleSearch_passages
Evaluation datasets
The model is evaluated on OKVQA, Infoseek, and FVQA.
Please find the evaluation results in the paper.
Citation
BibTeX:
@inproceedings{
lin2023finegrained,
title={Fine-grained Late-interaction Multi-modal Retrieval for Retrieval Augmented Visual Question Answering},
author={Weizhe Lin and Jinghong Chen and Jingbiao Mei and Alexandru Coca and Bill Byrne},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=IWWWulAX7g}
}