Model Card: sigridjineth/ko-reranker-v1.1-preview

Note: This is a preview release. The model, which is fine-tuned from Alibaba-NLP/gte-multilingual-reranker-base, is still under development and may change as we refine and improve its performance. Training took 12 hours on 8 A100 GPUs.

Training Data

This model is trained on sigridjineth/korean_nli_dataset_reranker_v0, which aggregates the train splits of several publicly available datasets:

kor_nli (train): https://huggingface.co/datasets/kor_nli
mnli_ko (train): https://huggingface.co/datasets/kozistr/mnli_ko
ko-wiki-reranking (train): https://huggingface.co/datasets/upskyy/ko-wiki-reranking
mr_tydi_korean (train): https://huggingface.co/datasets/castorini/mr-tydi
klue_nli (train): https://huggingface.co/datasets/klue/klue

Together, these resources cover a wide range of topics, styles, and complexities in Korean-language data, which helps the model capture nuanced semantic differences.

Key Features

Hard Negative Mining:
Integrated BAAI/bge-m3 to mine challenging negatives. This approach sharpens the model’s ability to distinguish subtle contrasts, boosting robustness and improving ranking quality.
Teacher-Student Distillation:
Leveraged BAAI/bge-reranker-v2.5-gemma2-lightweight as a teacher model. The student reranker learned from teacher-provided positive/negative scores, accelerating convergence and achieving better final performance.
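
The two techniques above can be illustrated with a minimal, dependency-free sketch. It is not the training code used for this model: the function names, the toy embeddings, and the choice of a KL-divergence loss over softmax-normalized scores are all assumptions made for illustration. In a real pipeline the embeddings would come from a retriever such as BAAI/bge-m3 and the teacher scores from the teacher reranker.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mine_hard_negatives(query_vec, doc_vecs, positive_ids, num_negatives=2):
    """Return ids of the non-positive documents most similar to the query.

    Documents that the retriever ranks highly but that are not labeled
    positive are treated as "hard" negatives (hypothetical helper).
    """
    candidates = [i for i in range(len(doc_vecs)) if i not in positive_ids]
    ranked = sorted(candidates, key=lambda i: cosine(query_vec, doc_vecs[i]), reverse=True)
    return ranked[:num_negatives]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_scores, teacher_scores):
    """KL(teacher || student) over the candidate list of one query (assumed loss)."""
    p = softmax(teacher_scores)
    q = softmax(student_scores)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy 2-d embeddings; document 0 is the labeled positive.
query = [1.0, 0.0]
docs = [[1.0, 0.1], [0.9, 0.3], [0.0, 1.0], [0.7, 0.7]]
hard_negatives = mine_hard_negatives(query, docs, positive_ids={0}, num_negatives=2)
print(hard_negatives)

# Toy teacher scores for one positive and two mined negatives.
teacher = [4.0, 1.0, -2.0]
loss_match = distillation_loss(teacher, teacher)             # student agrees with teacher
loss_uniform = distillation_loss([0.0, 0.0, 0.0], teacher)   # untrained, uniform student
print(loss_match, loss_uniform)
```

The loss is zero when the student reproduces the teacher's score distribution and grows as the two diverge, which is the signal a student reranker would be trained to minimize under this assumed setup.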

Intended Use

Search & Information Retrieval: Improve document ranking for Korean-language search queries.
Question Answering (QA): Enhance QA pipelines by reordering candidate answers for improved relevance.
Content Recommendation: Refine recommendation engines that rely on textual signals to deliver more accurate suggestions.

Limitations & Future Work

Preview Release:
The model is still in the refinement phase. Expect future updates to improve stability, generalization, and performance.
Need for Evaluation:
Developing and standardizing benchmarks for generalized Korean retrieval tasks (especially for rerankers) will be an ongoing effort.

Evaluation

The AutoRAG Benchmark serves as both the evaluation dataset and the toolkit for computing the metrics reported below.

Model: `sigridjineth/ko-reranker-v1.1-preview`

top_k	Execution Time	F1	Recall	Precision	MAP	MRR	NDCG	Is Best
1	0.0438	0.6754	0.6754	0.6754	0.6754	0.6754	0.6754	True
3	0.0486	0.3684	0.7368	0.2456	0.7032	0.7032	0.7119	False
5	0.0446	0.3684	0.7368	0.2456	0.7032	0.7032	0.7119	False

Model: `Alibaba-NLP/gte-multilingual-reranker-base`

top_k	Execution Time	F1	Recall	Precision	MAP	MRR	NDCG	Is Best
1	0.0481	0.6316	0.6316	0.6316	0.6316	0.6316	0.6316	True
3	0.0427	0.3596	0.7193	0.2398	0.6725	0.6725	0.6846	False
5	0.0442	0.3596	0.7193	0.2398	0.6725	0.6725	0.6846	False

Model: `dragonkue/bge-reranker-v2-m3-ko`

top_k	Execution Time	F1	Recall	Precision	MAP	MRR	NDCG	Is Best
1	0.0814	0.6930	0.6930	0.6930	0.6930	0.6930	0.6930	True
3	0.0813	0.3596	0.7193	0.2398	0.7061	0.7061	0.7096	False
5	0.0824	0.3596	0.7193	0.2398	0.7061	0.7061	0.7096	False
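
For readers unfamiliar with the columns, the sketch below shows common textbook definitions of precision, recall, F1, reciprocal rank, and NDCG for a single query with binary relevance; MRR is the mean of the reciprocal rank over all queries. It is an illustration only and is not AutoRAG's implementation, and the function names and document ids are hypothetical. As a sanity check, the harmonic mean of the precision and recall reported for this model at top_k=3 (0.2456 and 0.7368) comes out at approximately the reported F1 of 0.3684.

```python
import math

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def precision_recall_f1(retrieved, relevant):
    """Precision, recall, and F1 for one query's ranked list."""
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall, f1_score(precision, recall)

def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant document, or 0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def ndcg(retrieved, relevant):
    """NDCG with binary gains over the retrieved list."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved, start=1)
        if doc_id in relevant
    )
    ideal_hits = min(len(relevant), len(retrieved))
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0

# Toy query: the single relevant document is ranked second out of three.
retrieved = ["d7", "d2", "d9"]
relevant = {"d2"}
p, r, f1 = precision_recall_f1(retrieved, relevant)
rr = reciprocal_rank(retrieved, relevant)
gain = ndcg(retrieved, relevant)
print(p, r, f1, rr, gain)

# Harmonic mean of the reported top_k=3 precision and recall.
table_f1 = f1_score(0.2456, 0.7368)
print(round(table_f1, 4))
```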

Usage (transformers>=4.36.0)

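import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name_or_path = "sigridjineth/ko-reranker-v1.1-preview"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name_or_path,
    trust_remote_code=True,  # loads the custom modeling code shipped with the checkpoint
    torch_dtype=torch.float16,
)
model.eval()

# Each pair is [query, candidate passage].
pairs = [
    ["중국의 수도는", "베이징"],  # "The capital of China is" / "Beijing"
    ["2024년 대한민국 대통령은?", "대한민국 대통령은 윤석열이다"],  # "Who is the president of South Korea in 2024?" / "The president of South Korea is Yoon Suk Yeol"
    ["파이썬에서 퀵 소트를 구현하기", "quick sort로 코테 1등 먹어보자"],  # "Implementing quicksort in Python" / "Let's take first place in the coding test with quick sort"
]

with torch.no_grad():
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors="pt", max_length=512)
    # One relevance logit per pair; a higher score means a more relevant passage.
    scores = model(**inputs, return_dict=True).logits.view(-1).float()
    print(scores)
# Example output:
# tensor([1.2315, 0.5923, 0.3041])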
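
To use such scores for reranking, the candidates retrieved for a single query are typically sorted by score. The helper below is a minimal sketch with hypothetical names (`rank_candidates`, `sigmoid`) and made-up logits. Applying a sigmoid to map raw logits into (0, 1) is a common convention for cross-encoder rerankers and is an assumption here, not something this model card specifies.

```python
import math

def sigmoid(x):
    """Map a raw logit to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def rank_candidates(candidates, scores, top_k=None):
    """Sort candidates by descending score.

    Returns (candidate, sigmoid(score)) tuples, optionally cut to top_k.
    """
    if len(candidates) != len(scores):
        raise ValueError("candidates and scores must have the same length")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    return [(candidates[i], sigmoid(scores[i])) for i in order]

# Hypothetical logits for three passages retrieved for one query.
passages = ["passage A", "passage B", "passage C"]
logits = [0.31, 1.23, -0.75]
ranked = rank_candidates(passages, logits, top_k=2)
for passage, score in ranked:
    print(f"{score:.3f}  {passage}")
```

Because the sigmoid is monotonic, it changes only the scale of the scores and never the ordering, so it can be skipped when only the ranking matters.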

Usage with Infinity

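Infinity is an MIT-licensed inference REST API server for hosting and serving models. The command below serves the base model, Alibaba-NLP/gte-multilingual-reranker-base; to serve a different reranker, change the value passed to --model-id: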

docker run --gpus all -v $PWD/data:/app/.cache -p "7997":"7997" \
michaelf34/infinity:0.0.68 \
v2 --model-id Alibaba-NLP/gte-multilingual-reranker-base --revision "main" \
--dtype bfloat16 --batch-size 32 --device cuda --engine torch --port 7997
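
Once the container is running, the server can be queried over HTTP. The client below is a sketch using only the Python standard library. The `/rerank` path and the `model`, `query`, and `documents` JSON fields are assumptions about Infinity's REST API and should be checked against the API documentation of the version you deploy. The helper names are hypothetical, and the port matches the docker command above.

```python
import json
import urllib.request

def build_rerank_request(query, documents, model, base_url="http://localhost:7997"):
    """Build a POST request for an assumed /rerank endpoint (no network access)."""
    payload = {"model": model, "query": query, "documents": documents}
    return urllib.request.Request(
        url=f"{base_url}/rerank",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def rerank(query, documents, model, base_url="http://localhost:7997", timeout=30):
    """Send the request and return the decoded JSON response (requires a running server)."""
    request = build_rerank_request(query, documents, model, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# Building the request does not contact the server, so this runs offline.
request = build_rerank_request(
    query="중국의 수도는",
    documents=["베이징", "서울"],
    model="Alibaba-NLP/gte-multilingual-reranker-base",
)
print(request.full_url)
print(request.get_method())
```

Calling `rerank(...)` with the same arguments sends the request. The shape of the response depends on the server version, so inspect it before relying on specific keys.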

References

@misc{zhang2024mgtegeneralizedlongcontexttext,
  title={mGTE: Generalized Long-Context Text Representation and Reranking Models for Multilingual Text Retrieval}, 
  author={Xin Zhang and Yanzhao Zhang and Dingkun Long and Wen Xie and Ziqi Dai and Jialong Tang and Huan Lin and Baosong Yang and Pengjun Xie and Fei Huang and Meishan Zhang and Wenjie Li and Min Zhang},
  year={2024},
  eprint={2407.19669},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2407.19669}, 
}

@misc{li2023making,
  title={Making Large Language Models A Better Foundation For Dense Retrieval}, 
  author={Chaofan Li and Zheng Liu and Shitao Xiao and Yingxia Shao},
  year={2023},
  eprint={2312.15503},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@misc{chen2024bge,
  title={BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation}, 
  author={Jianlv Chen and Shitao Xiao and Peitian Zhang and Kun Luo and Defu Lian and Zheng Liu},
  year={2024},
  eprint={2402.03216},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
