HaoyeZhang committed · Commit 90e8e73 (verified) · Parent: 8105cb9

Update README.md

Files changed (1): README.md (+5 -4)
README.md CHANGED
@@ -20,11 +20,12 @@ RLAIF-V maximally exploits the open-source feedback from two key perspectives, i
 ### Key Features
 
 * 📈 **Most trustworthy LLaVA 1.5**: By learning from open-source AI feedback, specifically, the feedback from LLaVA-NeXT-34B, RLAIF-V-7B achieves the best trustworthiness improvement on LLaVA-v1.5 compared to other hallucination reduction methods.
-* 💪 **Maintaining Well Performance on General Abilities**: On benchmarks evaluating general capabilities (e.g. LLaVA Bench, MMStar), RLAIF-V-7B also exhibits good performance.
+* 💪 **Maintaining Strong Performance on General Abilities**: On benchmarks evaluating general capabilities (e.g. MMStar), RLAIF-V-7B also exhibits good performance.
+* 🚀 **Inference-time Scaling by Self-guidance**: Using RLAIF-V 7B as a reward model can further improve model performance on multiple benchmarks with best-of-N selection.
 
 
 <p align="center">
-  <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/ypXZxb4HE-jDPJU9115bi.png" alt="fig1" width="90%"/>
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/dhsi5_okbtlBp2pfYOkFK.png" alt="fig1" width="90%"/>
 </p>
 
 ### Examples
@@ -54,8 +55,8 @@ If you find our model/code/paper helpful, please consider cite our papers 📝:
 }
 
 @article{yu2024rlaifv,
-  title={RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness},
-  author={Yu, Tianyu and Zhang, Haoye and Yao, Yuan and Dang, Yunkai and Chen, Da and Lu, Xiaoman and Cui, Ganqu and He, Taiwen and Liu, Zhiyuan and Chua, Tat-Seng and Sun, Maosong},
+  title={RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness},
+  author={Tianyu Yu and Haoye Zhang and Qiming Li and Qixin Xu and Yuan Yao and Da Chen and Xiaoman Lu and Ganqu Cui and Yunkai Dang and Taiwen He and Xiaocheng Feng and Jun Song and Bo Zheng and Zhiyuan Liu and Tat-Seng Chua and Maosong Sun},
  journal={arXiv preprint arXiv:2405.17220},
  year={2024},
 }
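
The new 🚀 bullet above describes best-of-N selection: sample several candidate answers, then use RLAIF-V 7B as a reward model to pick the best one. A minimal sketch of that idea, where `policy.generate` and `reward.score` are hypothetical placeholders rather than the actual RLAIF-V API (see the RLAIF-V codebase for the real interface):

```python
# Best-of-N selection sketch: sample n candidate answers from the policy
# model, then keep the one the reward model (e.g. RLAIF-V 7B) scores highest.
# NOTE: `policy.generate` and `reward.score` are hypothetical placeholders,
# not the actual RLAIF-V API.
def best_of_n(image, question, policy, reward, n=8):
    """Return the highest-reward answer among n sampled candidates."""
    candidates = [policy.generate(image, question) for _ in range(n)]
    return max(candidates, key=lambda ans: reward.score(image, question, ans))
```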