Amir Hossein Kargaran

kargaranamir

https://kargaranamir.github.io

AI & ML interests

#NLP, check out https://huggingface.co/cis-lmu

Recent Activity

liked a dataset 3 days ago

PaDaS-Lab/webfaq

liked a dataset 3 days ago

pszemraj/local-emoji-search-gte

liked a dataset 3 days ago

davanstrien/fineweb-c-all

kargaranamir's activity

upvoted a paper 12 days ago

On Relation-Specific Neurons in Large Language Models

Paper • 2502.17355 • Published 13 days ago • 6

upvoted a collection 17 days ago

MMTEB

Our contribution to the Massive Multilingual Text Embedding Benchmark (MMTEB). Retrieval and reranking benchmarks in 16 languages. • 4 items • Updated Jun 6, 2024 • 2

upvoted a paper 17 days ago

MMTEB: Massive Multilingual Text Embedding Benchmark

Paper • 2502.13595 • Published 19 days ago • 31

upvoted a collection 17 days ago

CommonCrawl

Large web-mined general corpus based on CommonCrawl. • 7 items • Updated Dec 8, 2024 • 2

upvoted a paper 22 days ago

NoLiMa: Long-Context Evaluation Beyond Literal Matching

Paper • 2502.05167 • Published about 1 month ago • 15

upvoted an article 29 days ago

Open-R1: Update #1

Article • Feb 2 • 293

upvoted an article about 2 months ago

Uncensor any LLM with abliteration

Article • Jun 13, 2024 • 456

upvoted 2 articles 3 months ago

Finding Moroccan Arabic (Darija) in Fineweb 2

Article • Dec 8, 2024 • 22

They Said It Couldn’t Be Done

Article • Dec 5, 2024 • 82

upvoted 3 collections 3 months ago

LLM Training

46 items • Updated 11 days ago • 4

reading list

1 item • Updated Nov 4, 2024 • 1

Text Datasets

13 items • Updated 2 days ago • 1

upvoted a collection 4 months ago

OpenCoder

OpenCoder is an open and reproducible code LLM family which matches the performance of top-tier code LLMs. • 8 items • Updated Nov 23, 2024 • 80

upvoted a paper 4 months ago

How Transliterations Improve Crosslingual Alignment

Paper • 2409.17326 • Published Sep 25, 2024 • 1

upvoted 2 collections 4 months ago

LLMs

414 items • Updated 3 days ago • 30

cool datasets

154 items • Updated about 17 hours ago • 15

upvoted a paper 4 months ago

GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages

Paper • 2410.23825 • Published Oct 31, 2024 • 4

upvoted a collection 5 months ago

LLM Reasoning Papers

Papers to improve reasoning capabilities of LLMs • 20 items • Updated Jan 15 • 120

upvoted a paper 5 months ago

MEXA: Multilingual Evaluation of English-Centric LLMs via Cross-Lingual Alignment

Paper • 2410.05873 • Published Oct 8, 2024 • 3

upvoted a collection 9 months ago

LLM Spaces

189 items • Updated 5 days ago • 14