Sigrid Jin PRO

sigridjineth

https://sigridjin.medium.com

AI & ML interests

Newbie

Recent Activity

liked a model 2 days ago

sentence-transformers/static-similarity-mrl-multilingual-v1

liked a model 2 days ago

sentence-transformers/static-retrieval-mrl-en-v1

liked a dataset 2 days ago

HAERAE-HUB/HRM8K

sigridjineth's activity

upvoted 2 papers about 1 month ago

Benchmarking and Building Long-Context Retrieval Models with LoCo and M2-BERT

Paper • 2402.07440 • Published Feb 12, 2024 • 1

DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection

Paper • 2406.00856 • Published Jun 2, 2024 • 12

upvoted a collection about 1 month ago

NeMo Curator - Classifier Models

Classifier models that can be used in NeMo Curator for labelling/filtering datasets. • 9 items • Updated 11 days ago • 13

upvoted a paper about 1 month ago

Jina CLIP: Your CLIP Model Is Also Your Text Retriever

Paper • 2405.20204 • Published May 30, 2024 • 35

upvoted a collection about 1 month ago

jina-clip

Multimodal text-image embeddings • 4 items • Updated Dec 14, 2024 • 10

upvoted 2 collections about 2 months ago

MMTEB

Our contribution to the Massive Multilingual Text Embedding Benchmark (MMTEB). Retrieval and reranking benchmarks in 16 languages. • 4 items • Updated Jun 6, 2024 • 1

Arctic-embed

A collection of text embedding models optimized for retrieval accuracy and efficiency • 8 items • Updated Dec 5, 2024 • 17

upvoted a collection 2 months ago

ColPali Models

Pre-trained checkpoints for the ColPali model. • 8 items • Updated 5 days ago • 3

upvoted a collection 4 months ago

Small LMs Text Embedding

Contrastive fine-tuned version of Language Models up to 2B parameters using LoRA • 3 items • Updated May 8, 2024 • 4

upvoted a collection 5 months ago

Papers I want to read

Papers in my to-read list • 259 items • Updated 18 days ago • 30

upvoted 4 collections 6 months ago

Korean Pretraining Dataset

15 items • Updated Nov 19, 2024 • 11

Matryoshka Embedding Models

https://huggingface.co/blog/matryoshka • 14 items • Updated Jun 4, 2024 • 16

🍃 MINT-1T

Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24, 2024 • 58

GTE models

General Text Embedding Models Released by Tongyi Lab of Alibaba Group • 21 items • Updated 7 days ago • 20

upvoted 2 articles 6 months ago

Article

Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth

Jul 29, 2024

• 266

Article

Mixture of Experts Explained

Dec 11, 2023

• 266

upvoted 2 papers 7 months ago

SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12, 2024 • 132

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21, 2024 • 64

upvoted 2 collections 9 months ago

Yi-1.5 (2024/05)

10 items • Updated May 20, 2024 • 92

Base + Language + Instruct (Korean)

8 items • Updated May 24, 2024 • 3