-
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 67 -
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Paper • 2405.09818 • Published • 125 -
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models
Paper • 2405.15574 • Published • 52 -
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published • 84
Collections
Discover the best community collections!
Collections including paper arxiv:2408.00118
-
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Paper • 2408.15545 • Published • 32 -
Controllable Text Generation for Large Language Models: A Survey
Paper • 2408.12599 • Published • 61 -
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Paper • 2408.10914 • Published • 40 -
Automated Design of Agentic Systems
Paper • 2408.08435 • Published • 37
-
Apple Intelligence Foundation Language Models
Paper • 2407.21075 • Published • 2 -
The Llama 3 Herd of Models
Paper • 2407.21783 • Published • 102 -
Nemotron-4 340B Technical Report
Paper • 2406.11704 • Published -
Gemma 2: Improving Open Language Models at a Practical Size
Paper • 2408.00118 • Published • 73
-
We Care: Multimodal Depression Detection and Knowledge Infused Mental Health Therapeutic Response Generation
Paper • 2406.10561 • Published • 1 -
AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design
Paper • 2405.03680 • Published • 1 -
ChemNLP: A Natural Language Processing based Library for Materials Chemistry Text Data
Paper • 2209.08203 • Published • 1 -
SeaLLMs -- Large Language Models for Southeast Asia
Paper • 2312.00738 • Published • 23
-
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
Paper • 2407.10960 • Published • 10 -
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Paper • 2407.14482 • Published • 24 -
EVLM: An Efficient Vision-Language Model for Visual Understanding
Paper • 2407.14177 • Published • 42 -
Knowledge Mechanisms in Large Language Models: A Survey and Perspective
Paper • 2407.15017 • Published • 33
-
Qwen2-Audio Technical Report
Paper • 2407.10759 • Published • 52 -
Qwen2 Technical Report
Paper • 2407.10671 • Published • 153 -
Gemma 2: Improving Open Language Models at a Practical Size
Paper • 2408.00118 • Published • 73 -
EXAONE 3.0 7.8B Instruction Tuned Language Model
Paper • 2408.03541 • Published • 32
-
PAS: Data-Efficient Plug-and-Play Prompt Augmentation System
Paper • 2407.06027 • Published • 8 -
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Paper • 2407.09025 • Published • 122 -
Toto: Time Series Optimized Transformer for Observability
Paper • 2407.07874 • Published • 29 -
SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers
Paper • 2407.09413 • Published • 9
-
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Paper • 2406.11813 • Published • 29 -
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries
Paper • 2406.12824 • Published • 20 -
Tokenization Falling Short: The Curse of Tokenization
Paper • 2406.11687 • Published • 14 -
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level
Paper • 2406.11817 • Published • 13