DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15 • 51
Your Context Is Not an Array: Unveiling Random Access Limitations in Transformers Paper • 2408.05506 • Published Aug 10 • 8
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12 • 114
Case2Code: Learning Inductive Reasoning with Synthetic Data Paper • 2407.12504 • Published Jul 17 • 7
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published Jul 17 • 33
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression Paper • 2407.12077 • Published Jul 16 • 52
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases Paper • 2407.12784 • Published Jul 17 • 48
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17 • 75
Make-An-Agent: A Generalizable Policy Network Generator with Behavior-Prompted Diffusion Paper • 2407.10973 • Published Jul 15 • 9
MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation? Paper • 2407.04842 • Published Jul 5 • 52
TroL: Traversal of Layers for Large Language and Vision Models Paper • 2406.12246 • Published Jun 18 • 34
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models Paper • 2404.12387 • Published Apr 18 • 38
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence Paper • 2401.14196 • Published Jan 25 • 46
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch Paper • 2311.03099 • Published Nov 6, 2023 • 28
A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications Paper • 2310.17750 • Published Oct 26, 2023 • 9
DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics Paper • 2310.13268 • Published Oct 20, 2023 • 17
20B Collection Collection Contains all my Frankenstein 20B Llama2 models; I received a lot of good feedback on them. • 8 items • Updated Nov 2, 2023 • 19
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 490
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation Paper • 2309.16653 • Published Sep 28, 2023 • 45
Exploiting Diffusion Prior for Real-World Image Super-Resolution Paper • 2305.07015 • Published May 11, 2023 • 4
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset Paper • 2309.11998 • Published Sep 21, 2023 • 24
Agents: An Open-source Framework for Autonomous Language Agents Paper • 2309.07870 • Published Sep 14, 2023 • 39
DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models Paper • 2309.06933 • Published Sep 13, 2023 • 12
Doppelgangers: Learning to Disambiguate Images of Similar Structures Paper • 2309.02420 • Published Sep 5, 2023 • 9
AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections Paper • 2309.02186 • Published Sep 5, 2023 • 21
Hierarchical Masked 3D Diffusion Model for Video Outpainting Paper • 2309.02119 • Published Sep 5, 2023 • 10
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models Paper • 2309.00986 • Published Sep 2, 2023 • 17
GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest Paper • 2307.03601 • Published Jul 7, 2023 • 11
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 80
DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models Paper • 2307.02421 • Published Jul 5, 2023 • 34
Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning Paper • 2307.02053 • Published Jul 5, 2023 • 23
LongNet: Scaling Transformers to 1,000,000,000 Tokens Paper • 2307.02486 • Published Jul 5, 2023 • 80
Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors Paper • 2306.17843 • Published Jun 30, 2023 • 43
SVNR: Spatially-variant Noise Removal with Denoising Diffusion Paper • 2306.16052 • Published Jun 28, 2023 • 6
DomainStudio: Fine-Tuning Diffusion Models for Domain-Driven Image Generation using Limited Data Paper • 2306.14153 • Published Jun 25, 2023 • 6
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications Paper • 2306.14289 • Published Jun 25, 2023 • 15
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes Paper • 2306.13649 • Published Jun 23, 2023 • 14
Bring Your Own Data! Self-Supervised Evaluation for Large Language Models Paper • 2306.13651 • Published Jun 23, 2023 • 15
Blended-NeRF: Zero-Shot Object Generation and Blending in Existing Neural Radiance Fields Paper • 2306.12760 • Published Jun 22, 2023 • 8
Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration Paper • 2307.05300 • Published Jul 11, 2023 • 18
VampNet: Music Generation via Masked Acoustic Token Modeling Paper • 2307.04686 • Published Jul 10, 2023 • 20
Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation Paper • 2307.03869 • Published Jul 8, 2023 • 22
Semantic-SAM: Segment and Recognize Anything at Any Granularity Paper • 2307.04767 • Published Jul 10, 2023 • 20
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning Paper • 2307.04725 • Published Jul 10, 2023 • 64