BEIR-NL: Zero-shot Information Retrieval Benchmark for the Dutch Language Paper • 2412.08329 • Published Dec 11, 2024 • 1
BEIR-NL Collection Zero-shot Information Retrieval Benchmark for the Dutch Language • 16 items • Updated Feb 10 • 1
Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP Paper • 2408.04303 • Published Aug 8, 2024 • 20
facebook/dinov2-with-registers-small-imagenet1k-1-layer Image Classification • Updated Dec 23, 2024 • 283 • 1
ciCic/paraphrase-multilingual-MiniLM-L12-v2-sts-2d-matryoshka Sentence Similarity • Updated Oct 12, 2024 • 28 • 1
Article How to generate text: using different decoding methods for language generation with Transformers Mar 1, 2020 • 170
Addition is All You Need for Energy-efficient Language Models Paper • 2410.00907 • Published Oct 1, 2024 • 146
Parallel Sentences Datasets Collection These datasets all have "english" and "non_english" columns. They can be used to make embedding models multilingual. • 14 items • Updated 17 days ago • 15