Model Description

RoBERTa ReRanker for Retrieved Results, or R* (pronounced R-star), is a model designed to improve the relevance and accuracy of search results through reranking. Combining R* reranking with generative models gives a hybrid approach intended to improve the relevance and contextual depth of the results passed to the generator. Based on the RoBERTa tiny architecture, R* specializes in distinguishing relevant from irrelevant query-passage pairs, thereby refining the output of LLMs in retrieval and generative tasks. This model is an experiment presented at PACLIC 38 (2024), to be published in the ACL Anthology.

Training Data

R* was trained on a dataset derived from the MS MARCO passage ranking dataset, consisting of 2.5 million query-positive passage pairs and an equal number of query-negative passage pairs, totaling 5 million query-passage pairs. This ensures a balanced training approach, exposing R* to both relevant and irrelevant examples equally.
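
As a rough illustration of the balanced setup described above, the following sketch labels and shuffles equal numbers of positive and negative pairs. The function name `build_balanced_pairs` and its inputs are hypothetical; this is not the preprocessing code used for R*, only one way such a dataset could be assembled.

```python
import random


def build_balanced_pairs(positives, negatives, seed=0):
    """Label and shuffle equal numbers of positive and negative pairs.

    positives / negatives: lists of (query, passage) tuples (hypothetical inputs).
    Returns a list of (query, passage, label), where 1 = relevant, 0 = irrelevant.
    """
    n = min(len(positives), len(negatives))  # keep both classes the same size
    rng = random.Random(seed)
    pos = rng.sample(positives, n)
    neg = rng.sample(negatives, n)
    examples = [(q, p, 1) for q, p in pos] + [(q, p, 0) for q, p in neg]
    rng.shuffle(examples)  # avoid long runs of a single label
    return examples
```

With 2.5 million pairs on each side, this would produce the 5 million labeled examples mentioned above.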

Training Procedure

Training framed reranking as binary classification, with the model assigning a continuous relevance score from 0 (irrelevant) to 1 (relevant) to each query-passage pair. The model was trained for 7 epochs with a batch size of 2048 on a Colab Pro instance equipped with a V100 GPU (16 GB VRAM) and 51 GB RAM, which took approximately 16 hours.
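
The exact loss function is not stated here. A common way to obtain a score between 0 and 1 from a binary classifier is to pass a single output logit through a sigmoid and train with binary cross-entropy. The sketch below assumes that setup; `sigmoid` and `bce_loss` are illustrative helpers, not the training code for R*.

```python
import math


def sigmoid(logit):
    """Map a raw logit to a score in (0, 1), written to avoid overflow."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


def bce_loss(logit, label):
    """Binary cross-entropy on a raw logit (label is 0 or 1).

    Uses the numerically stable form max(x, 0) - x*y + log(1 + exp(-|x|)).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))
```

Under this assumption, a confident correct prediction (large positive logit with label 1) gives a loss near zero, while the same logit with label 0 is penalized heavily.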

Evaluation and Performance

Coming soon.

Use Cases

R* is particularly suitable for applications that demand high precision in information retrieval, such as RAG reranking, search engine results, document searching in legal or academic databases, recommendation systems, and beyond.

How to Use

With Transformers

For usage with the Transformers library, you can follow this generic example:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained('jaspercatapang/R-star')
tokenizer = AutoTokenizer.from_pretrained('jaspercatapang/R-star')

# Queries go in the first list and passages in the second, so pair i is (query i, passage i)
features = tokenizer(['Your query here', 'Your query here'], ['First relevant passage for first query', 'Second relevant passage for second query'], padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    scores = model(**features).logits
    print(scores)

With SentenceTransformers

from sentence_transformers import CrossEncoder

model = CrossEncoder('jaspercatapang/R-star', max_length=512)
scores = model.predict([('Your query here', 'First relevant passage for first query'), ('Your query here', 'Second relevant passage for second query')])
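
Once each query-passage pair has a score, reranking amounts to sorting the passages by that score. The helper below is a minimal sketch of this step; `rerank` and `top_k` are hypothetical names and are not part of either library.

```python
def rerank(passages, scores, top_k=None):
    """Sort passages by descending relevance score.

    passages: list of passage strings for a single query.
    scores: one score per passage, in the same order (list or array).
    top_k: optional cutoff on the number of results returned.
    """
    if len(passages) != len(scores):
        raise ValueError("passages and scores must have the same length")
    pairs = [(p, float(s)) for p, s in zip(passages, scores)]
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]
```

For example, `rerank(passages, scores, top_k=3)` would keep the three highest-scoring passages for use as context in a RAG pipeline.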

Training and Evaluation

For training, the Colab notebook can be found here.
For evaluation, the Colab notebook can be found here.

Limitations

Based on our evaluation, R* tends to favor longer passages when scoring, which could introduce a bias. This tendency is common among cross-encoder models. It is advisable to preprocess text so that passage lengths are comparable before scoring. Note that R* is optimized for passage-level comparisons and may not perform well on word- or phrase-level similarity tasks.
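
One simple way to reduce length differences before scoring is to cap every passage at the same number of words. The sketch below assumes whitespace tokenization and an arbitrary default of 128 words; `truncate_passage` is a hypothetical helper, and other strategies (such as splitting long documents into fixed-size chunks) may suit some corpora better.

```python
def truncate_passage(passage, max_words=128):
    """Keep at most max_words whitespace-separated words of a passage."""
    words = passage.split()
    return " ".join(words[:max_words])


def normalize_lengths(passages, max_words=128):
    """Apply the same word cap to every candidate passage."""
    return [truncate_passage(p, max_words) for p in passages]
```

Passages shorter than the cap pass through unchanged apart from whitespace normalization, so this only limits the advantage of very long passages.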

Ethical Considerations

The use of R* introduces several ethical considerations, including potential biases in the training data, privacy concerns, and the implications of automating decision-making processes. Users are encouraged to critically evaluate the model's fairness and transparency, ensuring its equitable use across diverse demographics.

Contact Details

For additional information or inquiries about R*, please contact the developer via [email protected]

Disclaimer

R* is an AI language model developed by Jasper Kyle Catapang. It is provided "as is" without warranty of any kind, express or implied. The model developer shall not be liable for any direct or indirect damages arising from the use of this model.

Acknowledgments

Thank you to Microsoft for the MS MARCO dataset. We would also like to extend our gratitude to Haisong Zhang for the base model.