
Kuldeep Singh Sidhu

singhsidhukuldeep

AI & ML interests

😃 TOP 3 on HuggingFace for posts 🤗 Seeking contributors for a completely open-source 🚀 Data Science platform! singhsidhukuldeep.github.io

Recent Activity

posted an update about 19 hours ago

Organizations

MLX Community, Social Post Explorers, C4AI Community

Posts 119

Exciting Research Alert: Revolutionizing Long-Context Language Models!

A groundbreaking paper from researchers at the University of Edinburgh and Apple introduces ICR² (In-context Retrieval and Reasoning), addressing a critical challenge in long-context language models (LCLMs).

Key Innovations:
- A novel benchmark that realistically evaluates LCLMs' ability to process and reason with extended contexts
- Three innovative approaches that significantly improve LCLM performance:
  - Retrieve-then-generate fine-tuning
  - Retrieval-attention probing
  - Joint retrieval head training

The most impressive result? Their best approach, implemented on Mistral-7B with just a 32K token limit, achieves performance comparable to GPT-4 while using significantly fewer parameters.

Technical Deep Dive:
The team's approach leverages attention head mechanisms to filter and denoise long contexts during decoding. Their retrieve-then-generate method implements a two-step process where the model first identifies relevant passages before generating responses. The architecture includes dedicated retrieval heads working alongside generation heads, enabling joint optimization during training.
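To make the two-step flow concrete, here is a minimal inference-time sketch of retrieve-then-generate using an off-the-shelf instruction-tuned model; the model name, prompt wording, and passage-ID parsing are illustrative assumptions, not the paper's fine-tuning recipe or retrieval-head architecture.

```python
# Minimal retrieve-then-generate sketch (illustrative; not the paper's code).
# The model name, prompt wording, and passage-ID convention are assumptions.
import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "mistralai/Mistral-7B-Instruct-v0.2"  # stand-in for the fine-tuned LCLM
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.float16, device_map="auto")

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Decode only the newly generated tokens.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

def retrieve_then_generate(question: str, passages: list[str]) -> str:
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages))

    # Step 1: ask the model which passages are relevant (retrieval).
    ids_text = generate(
        f"{context}\n\nQuestion: {question}\n"
        "List the IDs of the passages needed to answer, e.g. [0, 3]:"
    )
    selected = [passages[i] for i in map(int, re.findall(r"\d+", ids_text)) if i < len(passages)]

    # Step 2: answer conditioned only on the retrieved passages (generation).
    return generate("Context:\n" + "\n".join(selected) + f"\n\nQuestion: {question}\nAnswer:")
```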

What sets this apart is their innovative use of the Gumbel-TopK trick for differentiable retrieval and their sophisticated attention probing mechanism that identifies and utilizes retrieval-focused attention heads.
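The post doesn't reproduce the exact relaxation, so the sketch below follows a commonly used formulation of the Gumbel top-k idea: perturb relevance scores with Gumbel noise and build a soft k-hot mask via successive softmaxes. The temperature and masking details are assumptions, not necessarily what the paper implements.

```python
# Sketch of a Gumbel top-k relaxation for differentiable retrieval (an assumed,
# commonly used formulation; not necessarily the paper's exact one).
import torch

def gumbel_topk_mask(scores: torch.Tensor, k: int, tau: float = 1.0) -> torch.Tensor:
    """Soft k-hot selection mask over passages, differentiable w.r.t. `scores`."""
    # Perturb scores with Gumbel(0, 1) noise so taking the top-k corresponds to
    # sampling k items without replacement from softmax(scores).
    gumbel = -torch.log(-torch.log(torch.rand_like(scores) + 1e-20) + 1e-20)
    logits = (scores + gumbel) / tau

    # Successive-softmax relaxation: pick one "soft" item per round, then
    # down-weight what was already selected before the next round.
    mask = torch.zeros_like(scores)
    for _ in range(k):
        probs = torch.softmax(logits, dim=-1)
        mask = mask + probs
        logits = logits + torch.log1p(-probs.clamp(max=1 - 1e-6))
    return mask.clamp(max=1.0)

# During training, passage representations can be weighted by this soft mask;
# at inference, a hard top-k over the same scores replaces it.
scores = torch.randn(16, requires_grad=True)  # one relevance score per candidate passage
soft_mask = gumbel_topk_mask(scores, k=4)
soft_mask.sum().backward()                    # gradients flow back to the scores
```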

Impact:
This research fundamentally changes how we approach long-context processing in LLMs, offering a more efficient alternative to traditional RAG pipelines while maintaining high performance.
Exciting breakthrough in Text Embeddings: Introducing LENS (Lexicon-based EmbeddiNgS)!

A team of researchers from the University of Amsterdam, the University of Technology Sydney, and Tencent has developed a groundbreaking approach that outperforms dense embeddings on the Massive Text Embedding Benchmark (MTEB).

>> Key Technical Innovations:
- LENS consolidates vocabulary space through token embedding clustering, addressing the inherent redundancy in LLM tokenizers
- Implements bidirectional attention and innovative pooling strategies to unlock the full potential of LLMs (see the pooling sketch after this list)
- Each dimension corresponds to token clusters instead of individual tokens, creating more coherent and compact embeddings
- Achieves competitive performance with just 4,000-8,000 dimensional embeddings, matching the size of dense counterparts
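As a rough illustration of the pooling side only, here is a generic mean-pooling sketch over a decoder LM's hidden states; the backbone name is a placeholder, and LENS's actual pooling strategy and bidirectional-attention modification are not reproduced here.

```python
# Generic mean pooling over a decoder LM's hidden states to produce a text
# embedding. Illustration only; LENS's own pooling and its bidirectional
# attention change are not shown here. The backbone name is a placeholder.
import torch
from transformers import AutoModel, AutoTokenizer

NAME = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(NAME)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModel.from_pretrained(NAME, torch_dtype=torch.float16, device_map="auto")

@torch.no_grad()
def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to(model.device)
    hidden = model(**batch).last_hidden_state       # (batch, seq_len, hidden_dim)
    mask = batch["attention_mask"].unsqueeze(-1)    # (batch, seq_len, 1)
    # Average token states while ignoring padding positions.
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
```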

>> Under the Hood:
The framework applies KMeans clustering to token embeddings from the language modeling head, replacing original embeddings with cluster centroids. This reduces dimensionality while preserving semantic relationships.
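A hedged sketch of that step, assuming the clustering operates on the lm_head weight matrix of a Hugging Face causal LM (the model name and cluster count are placeholders):

```python
# Illustrative sketch of the clustering step (not the authors' implementation).
# Model name, cluster count, and the use of lm_head.weight are assumptions.
import torch
from sklearn.cluster import KMeans
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

# Token embeddings from the language-modeling head: (vocab_size, hidden_dim).
token_emb = model.lm_head.weight.detach().float().cpu().numpy()

# Cluster the vocabulary into a few thousand groups (the post cites 4,000-8,000 dims).
kmeans = KMeans(n_clusters=4000, random_state=0).fit(token_emb)

# Replace each token's output embedding with its cluster centroid, so a lexicon
# dimension now corresponds to a token cluster rather than a single token.
centroids = torch.tensor(kmeans.cluster_centers_, dtype=model.lm_head.weight.dtype)
labels = torch.from_numpy(kmeans.labels_.astype("int64"))
with torch.no_grad():
    model.lm_head.weight.copy_(centroids[labels])
```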

>> Results:
- Outperforms dense embeddings on MTEB benchmark
- Achieves state-of-the-art performance when combined with dense embeddings on BEIR retrieval tasks
- Demonstrates superior performance across clustering, classification, and retrieval tasks

This work opens new possibilities for more efficient and interpretable text embeddings. The code will be available soon.

models

None public yet

datasets

None public yet