Incremental Sentence Processing Mechanisms in Autoregressive Transformer Language Models Paper • 2412.05353 • Published 19 days ago • 1
The LLM Language Network: A Neuroscientific Approach for Identifying Causally Task-Relevant Units Paper • 2411.02280 • Published Nov 4 • 1
Inferring Functionality of Attention Heads from their Parameters Paper • 2412.11965 • Published 10 days ago • 1
LatentQA: Teaching LLMs to Decode Activations Into Natural Language Paper • 2412.08686 • Published 14 days ago • 1
Training Large Language Models to Reason in a Continuous Latent Space Paper • 2412.06769 • Published 16 days ago • 62
NLI Eval Datasets Collection A curated collection of NLI evaluation datasets, each kept exactly as originally proposed • 19 items • Updated Nov 12 • 3
🇮🇹👓 LLaVA-NDiNO Collection HF Collection of the models from the paper "LLaVA-NDiNO: Empowering LLMs with Multimodality for the Italian Language" • 7 items • Updated Oct 20 • 3
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published 30 days ago • 76
SmolVLM Collection State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated 3 days ago • 30
Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models Paper • 2411.14257 • Published Nov 21 • 9
Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models Paper • 2411.12580 • Published Nov 19 • 2
Features that Make a Difference: Leveraging Gradients for Improved Dictionary Learning Paper • 2411.10397 • Published Nov 15 • 1
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 3 days ago • 195
The Geometry of Concepts: Sparse Autoencoder Feature Structure Paper • 2410.19750 • Published Oct 10 • 2
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders Paper • 2410.20526 • Published Oct 27 • 1