Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 224
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens Paper • 2402.13753 • Published Feb 21 • 112
ChatAnything: Facetime Chat with LLM-Enhanced Personas Paper • 2311.06772 • Published Nov 12, 2023 • 35
Music ControlNet: Multiple Time-varying Controls for Music Generation Paper • 2311.07069 • Published Nov 13, 2023 • 43
Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models Paper • 2311.06783 • Published Nov 12, 2023 • 26
I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models Paper • 2311.04145 • Published Nov 7, 2023 • 32
Learning From Mistakes Makes LLM Better Reasoner Paper • 2310.20689 • Published Oct 31, 2023 • 28
CapsFusion: Rethinking Image-Text Data at Scale Paper • 2310.20550 • Published Oct 31, 2023 • 25
Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks Paper • 2310.19909 • Published Oct 30, 2023 • 20
VideoCrafter1: Open Diffusion Models for High-Quality Video Generation Paper • 2310.19512 • Published Oct 30, 2023 • 15
MM-VID: Advancing Video Understanding with GPT-4V(ision) Paper • 2310.19773 • Published Oct 30, 2023 • 19
CodeFusion: A Pre-trained Diffusion Model for Code Generation Paper • 2310.17680 • Published Oct 26, 2023 • 70
Wonder3D: Single Image to 3D using Cross-Domain Diffusion Paper • 2310.15008 • Published Oct 23, 2023 • 21
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing Paper • 2311.00571 • Published Nov 1, 2023 • 41
Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling Paper • 2311.00430 • Published Nov 1, 2023 • 57
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation Paper • 2310.16656 • Published Oct 25, 2023 • 40
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior Paper • 2310.16818 • Published Oct 25, 2023 • 30