Collections

Discover the best community collections!

Collections including paper arxiv:2408.00118

Papers I want to read

Papers in my to-read list

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13 • 67
Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published May 16 • 125
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24 • 52
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27 • 84

SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Paper • 2408.15545 • Published 26 days ago • 32
Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22 • 61
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20 • 40
Automated Design of Agentic Systems

Paper • 2408.08435 • Published Aug 15 • 37

foundation_models

Apple Intelligence Foundation Language Models

Paper • 2407.21075 • Published Jul 29 • 2
The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31 • 102
Nemotron-4 340B Technical Report

Paper • 2406.11704 • Published Jun 17
Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31 • 73

Most interesting Papers

Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31 • 73
SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1 • 103
The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31 • 102

Papers to Read & Models to See

Papers for me to read and models to take a look at later

We Care: Multimodal Depression Detection and Knowledge Infused Mental Health Therapeutic Response Generation

Paper • 2406.10561 • Published Jun 15 • 1
AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design

Paper • 2405.03680 • Published May 6 • 1
ChemNLP: A Natural Language Processing based Library for Materials Chemistry Text Data

Paper • 2209.08203 • Published Sep 17, 2022 • 1
SeaLLMs -- Large Language Models for Southeast Asia

Paper • 2312.00738 • Published Dec 1, 2023 • 23

Fast Matrix Multiplications for Lookup Table-Quantized LLMs

Paper • 2407.10960 • Published Jul 15 • 10
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities

Paper • 2407.14482 • Published Jul 19 • 24
EVLM: An Efficient Vision-Language Model for Visual Understanding

Paper • 2407.14177 • Published Jul 19 • 42
Knowledge Mechanisms in Large Language Models: A Survey and Perspective

Paper • 2407.15017 • Published Jul 22 • 33

Qwen2-Audio Technical Report

Paper • 2407.10759 • Published Jul 15 • 52
Qwen2 Technical Report

Paper • 2407.10671 • Published Jul 15 • 153
Gemma 2: Improving Open Language Models at a Practical Size

Paper • 2408.00118 • Published Jul 31 • 73
EXAONE 3.0 7.8B Instruction Tuned Language Model

Paper • 2408.03541 • Published Aug 7 • 32

PAS: Data-Efficient Plug-and-Play Prompt Augmentation System

Paper • 2407.06027 • Published Jul 8 • 8
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models

Paper • 2407.09025 • Published Jul 12 • 122
Toto: Time Series Optimized Transformer for Observability

Paper • 2407.07874 • Published Jul 10 • 29
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers

Paper • 2407.09413 • Published Jul 12 • 9

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

Paper • 2406.11813 • Published Jun 17 • 29
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries

Paper • 2406.12824 • Published Jun 18 • 20
Tokenization Falling Short: The Curse of Tokenization

Paper • 2406.11687 • Published Jun 17 • 14
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level

Paper • 2406.11817 • Published Jun 17 • 13
