Interestings - a ississssi Collection

ississssi 's Collections

Interestings

updated 18 days ago

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

Paper • 2411.06558 • Published Nov 10, 2024 • 34
SlimLM: An Efficient Small Language Model for On-Device Document Assistance

Paper • 2411.09944 • Published Nov 15, 2024 • 12
Look Every Frame All at Once: Video-Ma^2mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing

Paper • 2411.19460 • Published Nov 29, 2024 • 10
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale

Paper • 2412.05237 • Published Dec 6, 2024 • 47
LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment

Paper • 2412.04814 • Published Dec 6, 2024 • 45
Moto: Latent Motion Token as the Bridging Language for Robot Manipulation

Paper • 2412.04445 • Published Dec 5, 2024 • 21
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies

Paper • 2412.10345 • Published Dec 13, 2024 • 2
Learning Universal Policies via Text-Guided Video Generation

Paper • 2302.00111 • Published Jan 31, 2023
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 89
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes

Paper • 2412.11100 • Published Dec 15, 2024 • 6
RLDG: Robotic Generalist Policy Distillation via Reinforcement Learning

Paper • 2412.09858 • Published Dec 13, 2024 • 1
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling

Paper • 2412.15084 • Published Dec 19, 2024 • 13
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 54
Graph-Aware Isomorphic Attention for Adaptive Dynamics in Transformers

Paper • 2501.02393 • Published 22 days ago • 8