Conan-embedding-v1

Performance

Model	Average	CLS	Clustering	Reranking	Retrieval	STS	Pair_CLS
gte-Qwen2-7B-instruct	72.05	75.09	66.06	68.92	76.03	65.33	87.48
xiaobu-embedding-v2	72.43	74.67	65.17	72.58	76.5	64.53	91.87
Conan-embedding-v1	72.62	75.03	66.33	72.76	76.67	64.18	91.66

Methods and Training Details

Please refer to our technical report (https://arxiv.org/abs/2408.15710).

Citation

If you find our model or paper useful in your research, please consider giving it a ❤️ and citing it. Thanks!

@misc{li2024conanembeddinggeneraltextembedding,
  title={Conan-embedding: General Text Embedding with More and Better Negative Samples}, 
  author={Shiyu Li and Yang Tang and Shizhe Chen and Xi Chen},
  year={2024},
  eprint={2408.15710},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2408.15710}, 
}

About

License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Evaluation results

Metric	Dataset	Split	Score (self-reported)
cos_sim_pearson	MTEB AFQMC	validation	56.614
cos_sim_spearman	MTEB AFQMC	validation	60.664
euclidean_pearson	MTEB AFQMC	validation	58.421
euclidean_spearman	MTEB AFQMC	validation	59.828
manhattan_pearson	MTEB AFQMC	validation	58.399
manhattan_spearman	MTEB AFQMC	validation	59.818
cos_sim_pearson	MTEB ATEC	test	56.605
cos_sim_spearman	MTEB ATEC	test	58.638
euclidean_pearson	MTEB ATEC	test	62.186
euclidean_spearman	MTEB ATEC	test	58.232
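
The metric names in these results (cos_sim_*, euclidean_*, manhattan_*, each with a Pearson and a Spearman variant) describe how a similarity score between two sentence embeddings is correlated with human labels. The sketch below is one plausible way to compute such scores with NumPy alone; it is not the MTEB evaluation code. The function names `average_ranks`, `pearson`, `spearman`, and `sts_metrics` are hypothetical, and two conventions are assumptions: distances are negated so that larger values mean "more similar", and correlations are multiplied by 100 to match the scale of the numbers reported above.

```python
import numpy as np


def average_ranks(x):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, kind="mergesort")
    ranks = np.empty(len(x), dtype=float)
    ranks[order] = np.arange(1, len(x) + 1)
    # Replace each rank by the average rank of its tie group.
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    sums = np.bincount(inverse, weights=ranks)
    return (sums / counts)[inverse]


def pearson(a, b):
    """Pearson correlation coefficient between two 1-D sequences."""
    return float(np.corrcoef(a, b)[0, 1])


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(a), average_ranks(b))


def sts_metrics(emb1, emb2, gold):
    """Correlate pairwise similarity scores with gold labels.

    emb1, emb2: arrays of shape (n_pairs, dim), one row per sentence.
    gold: human similarity labels of shape (n_pairs,).
    Returns a dict of correlations scaled by 100 (assumed convention).
    """
    emb1 = np.asarray(emb1, dtype=float)
    emb2 = np.asarray(emb2, dtype=float)
    norms = np.linalg.norm(emb1, axis=1) * np.linalg.norm(emb2, axis=1)
    scores = {
        "cos_sim": np.sum(emb1 * emb2, axis=1) / norms,
        # Distances are negated so that a higher score means more similar.
        "euclidean": -np.linalg.norm(emb1 - emb2, axis=1),
        "manhattan": -np.sum(np.abs(emb1 - emb2), axis=1),
    }
    results = {}
    for name, sims in scores.items():
        results[f"{name}_pearson"] = 100.0 * pearson(sims, gold)
        results[f"{name}_spearman"] = 100.0 * spearman(sims, gold)
    return results


if __name__ == "__main__":
    # Toy embeddings standing in for model outputs; not real data.
    first = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    second = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    labels = [1.0, 0.5, 0.0]
    for key, value in sts_metrics(first, second, labels).items():
        print(f"{key}: {value:.3f}")
```

In practice the two embedding arrays would come from encoding the two sentences of each pair with the model. The Spearman variants depend only on the ordering of the scores, which is why they can differ noticeably from the Pearson variants on datasets with binary labels.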
