Arabic Matryoshka Embedding Models

updated Dec 4, 2024

A collection of advanced Arabic Matryoshka Embedding Models designed for efficient, high-performance Arabic NLP, publicly available on Hugging Face.

Matroyshka Eval Retrieval Ar
Enhancing Semantic Similarity Understanding in Arabic NLP with Nested Embedding Learning

Paper • 2407.21139 • Published Jul 30, 2024 • 3
Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2

Sentence Similarity • Updated 7 days ago • 2.2k • 10

Note This is a Nested Embedding Model trained with Matryoshka Learning, ranking first on the MTEB leaderboard with an STS17 score of 85.3. It is efficient across a range of NLP tasks, including semantic similarity and textual entailment in Arabic.
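Matryoshka (nested) embeddings are trained so that the leading components of a vector remain useful on their own, which lets a shorter prefix stand in for the full vector when storage or speed matters. The sketch below illustrates only that truncation step. The 8-dimensional vectors are made-up placeholders rather than output from any model in this collection, and the helper names `truncate_and_normalize` and `cosine_similarity` are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` components and rescale them to unit length."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real sentence embeddings (assumed values).
emb_a = [0.9, 0.1, 0.3, -0.2, 0.05, 0.02, -0.01, 0.03]
emb_b = [0.8, 0.2, 0.25, -0.1, -0.04, 0.01, 0.02, -0.02]

# Compare the same pair at the full size and at two nested prefix sizes.
for dim in (8, 4, 2):
    a = truncate_and_normalize(emb_a, dim)
    b = truncate_and_normalize(emb_b, dim)
    print(dim, round(cosine_similarity(a, b), 4))
```

In practice the vectors would come from a model's encoding step rather than from hand-written lists, and the prefix sizes worth using depend on the dimensions the model was trained with.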
Omartificial-Intelligence-Space/Arabic-mpnet-base-all-nli-triplet

Sentence Similarity • Updated 7 days ago • 1.11k • 10

Note This model is a fine-tuned version of the English model "tomaarsen/mpnet-base-all-nli-triplet", which is itself based on "microsoft/mpnet-base". Despite being trained primarily on English data and having seen only a few Arabic tokens, it performs well on Arabic NLP tasks. After fine-tuning, it scored 79.9 on the STS17 MTEB leaderboard.
Omartificial-Intelligence-Space/Arabic-all-nli-triplet-Matryoshka

Sentence Similarity • Updated 7 days ago • 546 • 2

Note This model is fine-tuned from "sentence-transformers/paraphrase-multilingual-mpnet-base-v2" and adapted specifically for Arabic NLP tasks. It scored 82.4 on the MTEB STS17 leaderboard, making it a strong choice for sentence similarity.
Omartificial-Intelligence-Space/Arabert-all-nli-triplet-Matryoshka

Sentence Similarity • Updated 7 days ago • 1.84k • 10

Note This is a Nested Embedding Model trained with Matryoshka Learning, scoring 83.16 on the STS17 leaderboard. It is efficient across a range of NLP tasks, including semantic similarity and textual entailment in Arabic.
Omartificial-Intelligence-Space/Arabic-labse-Matryoshka

Sentence Similarity • Updated 20 days ago • 508 • 2

Note This sentence-transformers model, fine-tuned from sentence-transformers/LaBSE, has secured the second position on the STS17 MTEB leaderboard with a score of 82.47. It combines the strengths of LaBSE with the specific needs of Arabic language processing, making it a robust choice for tasks that require accurate semantic similarity and textual entailment in Arabic. This model is ideal for applications needing high performance and precision in understanding Arabic text.
Omartificial-Intelligence-Space/Marbert-all-nli-triplet-Matryoshka

Sentence Similarity • Updated 20 days ago • 499 • 1

Note This model, fine-tuned from the MarBERT base, has achieved the fourth position on the STS17 MTEB leaderboard with a score of 82.18. It leverages the MarBERT architecture, which is specifically designed for Arabic language processing, enhancing its performance through Matryoshka fine-tuning.
Omartificial-Intelligence-Space/Arabic-MiniLM-L12-v2-all-nli-triplet

Sentence Similarity • Updated 20 days ago • 522 • 4

Note This model, fine-tuned from sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2, scored 81.11 on the STS17 MTEB leaderboard.
Omartificial-Intelligence-Space/E5-all-nli-triplet-Matryoshka

Sentence Similarity • Updated Dec 28, 2024 • 9 • 1
