Gabriele Sarti

gsarti · https://gsarti.com

AI & ML interests

Interpretability for generative language models

Recent Activity

liked a model 3 days ago

answerdotai/ModernBERT-base

upvoted a paper 6 days ago

Qwen2.5 Technical Report

updated a collection 6 days ago

🔍 Daily Picks in Interpretability & Analysis of LMs

gsarti's activity

upvoted 3 papers 6 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 6 days ago • 328

Incremental Sentence Processing Mechanisms in Autoregressive Transformer Language Models

Paper • 2412.05353 • Published 19 days ago • 1

The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units

Paper • 2411.02280 • Published Nov 4 • 1

upvoted a paper 9 days ago

Inferring Functionality of Attention Heads from their Parameters

Paper • 2412.11965 • Published 10 days ago • 1

upvoted a paper 13 days ago

LatentQA: Teaching LLMs to Decode Activations Into Natural Language

Paper • 2412.08686 • Published 14 days ago • 1

upvoted a paper 15 days ago

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 16 days ago • 62

upvoted an article 24 days ago

EuroLLM-9B

Article • 24 days ago • 103

upvoted 2 collections 28 days ago

NLI Eval Datasets

A curated collection of NLI evaluation datasets. Each dataset is exactly as originally proposed • 19 items • Updated Nov 12 • 3

🇮🇹👓 LLaVA-NDiNO

HF Collection for the models of the paper "LLaVA-NDiNO: Empowering LLMs with Multimodality for the Italian Language" • 7 items • Updated Oct 20 • 3

upvoted a paper 29 days ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published 30 days ago • 76

upvoted a collection 30 days ago

SmolVLM

State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated 3 days ago • 30

upvoted an article about 1 month ago

Halo: Open Source Health Tracking with Wearables

Article • Nov 19 • 96

upvoted 5 papers about 1 month ago

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Paper • 2411.14257 • Published Nov 21 • 9

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Paper • 2411.12580 • Published Nov 19 • 2

Controllable Context Sensitivity and the Knob Behind It

Paper • 2411.07404 • Published Nov 11 • 1

Features that Make a Difference: Leveraging Gradients for Improved Dictionary Learning

Paper • 2411.10397 • Published Nov 15 • 1

Counterfactual Generation from Language Models

Paper • 2411.07180 • Published Nov 11 • 5

upvoted a collection about 2 months ago

SmolLM2

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 3 days ago • 195

upvoted 2 papers about 2 months ago

The Geometry of Concepts: Sparse Autoencoder Feature Structure

Paper • 2410.19750 • Published Oct 10 • 2

Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders

Paper • 2410.20526 • Published Oct 27 • 1