Papers I want to read Collection Papers in my to-read list • 218 items • Updated about 22 hours ago • 19
Matryoshka Embedding Models Collection https://huggingface.co/blog/matryoshka • 14 items • Updated Jun 4 • 13
🍃 MINT-1T Collection Data for "MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens" • 13 items • Updated Jul 24 • 49
GTE models Collection General Text Embedding Models Released by Alibaba Group • 19 items • Updated Aug 6 • 9
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models Paper • 2407.09025 • Published Jul 12 • 122
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs Paper • 2406.15319 • Published Jun 21 • 60