Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Abstract
Moderate-sized large language models (LLMs) -- those with 7B or 13B parameters -- exhibit promising machine translation (MT) performance. However, even the top-performing 13B LLM-based translation models, such as ALMA, do not match the performance of state-of-the-art conventional encoder-decoder translation models or larger-scale LLMs such as GPT-4. In this study, we bridge this performance gap. We first assess the shortcomings of supervised fine-tuning (SFT) for LLMs in the MT task, emphasizing the quality issues present in the reference data even though it is human-generated. Then, in contrast to SFT, which mimics reference translations, we introduce Contrastive Preference Optimization (CPO), a novel approach that trains models to avoid generating translations that are adequate but not perfect. Applying CPO to ALMA models, using only 22K parallel sentences and updating only 12M parameters, yields significant improvements. The resulting model, called ALMA-R, matches or exceeds the performance of the WMT competition winners and GPT-4 on the WMT'21, WMT'22, and WMT'23 test datasets.
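At a high level, CPO combines a reference-free preference term (pushing the model's preferred translation above a dis-preferred one) with a negative log-likelihood term on the preferred translation. Below is a minimal PyTorch sketch of such an objective; the function name, the `beta` default, and the omission of length normalization are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def cpo_loss(logp_chosen: torch.Tensor,
             logp_rejected: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Sketch of a CPO-style objective.

    logp_chosen / logp_rejected: summed token log-probabilities of the
    preferred and dis-preferred translations under the policy model,
    each of shape (batch,).
    """
    # Preference term: a reference-free, DPO-style margin between the
    # preferred and dis-preferred translations.
    prefer = -F.logsigmoid(beta * (logp_chosen - logp_rejected))
    # NLL term: keeps the policy anchored to the preferred translations
    # (length normalization omitted here; implementations may differ).
    nll = -logp_chosen
    return (prefer + nll).mean()
```

Unlike DPO, no frozen reference model is needed to score the two translations, which is what keeps the memory footprint small enough to fine-tune only a few million parameters on a modest preference set.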
Community
This looks fascinating. I love that you have the models on the hub. Amazing work @haoranxu and everyone else!
https://huggingface.co./haoranxu/ALMA-13B-R
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models (2024)
- Adapting Large Language Models for Document-Level Machine Translation (2024)
- Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages (2024)
- Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation (2024)
- Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models (2024)
Please give a thumbs up to this comment if you found it helpful!
If you want recommendations for any paper on Hugging Face, check out this Space
New Method Beats GPT-4 in Machine Translation: Introducing Contrastive Preference Optimization
Links:
- Subscribe: https://www.youtube.com/@Arxflix
- Twitter: https://x.com/arxflix
- LMNT (Partner): https://lmnt.com/