HaoyeZhang committed · Commit 90e8e73 (verified) · Parent: 8105cb9

Update README.md

Files changed (1): README.md (+5 -4)
README.md CHANGED
@@ -20,11 +20,12 @@ RLAIF-V maximally exploits the open-source feedback from two key perspectives, i
 ### Key Features
 
 * 📈 **Most trustworthy LLaVA 1.5**: By learning from open-source AI feedback, specifically, the feedback from LLaVA-NeXT-34B, RLAIF-V-7B achieves the best trustworthiness improvement on LLaVA-v1.5 compared to other hallucination reduction methods.
-* 💪 **Maintaining Well Performance on General Abilities**: On benchmarks evaluating general capabilities (e.g. LLaVA Bench, MMStar), RLAIF-V-7B also exhibits good performance.
+* 💪 **Maintaining Strong Performance on General Abilities**: On benchmarks evaluating general capabilities (e.g. MMStar), RLAIF-V-7B also exhibits good performance.
+* 🚀 **Inference-time Scaling by Self-guidance**: Using RLAIF-V 7B as a reward model can further improve model performance on multiple benchmarks with best-of-N selection.
 
 
 <p align="center">
-  <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/ypXZxb4HE-jDPJU9115bi.png" alt="fig1" width="90%"/>
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/6566e0c493e30c8a60048eb3/dhsi5_okbtlBp2pfYOkFK.png" alt="fig1" width="90%"/>
 </p>
 
 ### Examples
@@ -54,8 +55,8 @@ If you find our model/code/paper helpful, please consider cite our papers 📝:
 }
 
 @article{yu2024rlaifv,
-  title={RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness},
-  author={Yu, Tianyu and Zhang, Haoye and Yao, Yuan and Dang, Yunkai and Chen, Da and Lu, Xiaoman and Cui, Ganqu and He, Taiwen and Liu, Zhiyuan and Chua, Tat-Seng and Sun, Maosong},
+  title={RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness},
+  author={Tianyu Yu and Haoye Zhang and Qiming Li and Qixin Xu and Yuan Yao and Da Chen and Xiaoman Lu and Ganqu Cui and Yunkai Dang and Taiwen He and Xiaocheng Feng and Jun Song and Bo Zheng and Zhiyuan Liu and Tat-Seng Chua and Maosong Sun},
  journal={arXiv preprint arXiv:2405.17220},
  year={2024},
 }
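
The new 🚀 bullet above describes best-of-N selection: sample several candidate answers, then use RLAIF-V 7B as a reward model to pick the best one. A minimal sketch of that idea, where `policy.generate` and `reward.score` are hypothetical placeholders rather than the actual RLAIF-V API (see the RLAIF-V codebase for the real interface):

```python
# Best-of-N selection sketch: sample n candidate answers from the policy
# model, then keep the one the reward model (e.g. RLAIF-V 7B) scores highest.
# NOTE: `policy.generate` and `reward.score` are hypothetical placeholders,
# not the actual RLAIF-V API.
def best_of_n(image, question, policy, reward, n=8):
    """Return the highest-reward answer among n sampled candidates."""
    candidates = [policy.generate(image, question) for _ in range(n)]
    return max(candidates, key=lambda ans: reward.score(image, question, ans))
```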