Remasking Discrete Diffusion Models with Inference-Time Scaling
Abstract
Part of the success of diffusion models stems from their ability to perform iterative refinement, i.e., repeatedly correcting outputs during generation. However, modern masked discrete diffusion lacks this capability: when a token is generated, it cannot be updated again, even when it introduces an error. Here, we address this limitation by introducing the remasking diffusion model (ReMDM) sampler, a method that can be applied to pretrained masked diffusion models in a principled way and that is derived from a discrete diffusion model with a custom remasking backward process. Most interestingly, ReMDM endows discrete diffusion with a form of inference-time compute scaling. By increasing the number of sampling steps, ReMDM generates natural language outputs that approach the quality of autoregressive models, whereas when the computation budget is limited, ReMDM better maintains quality. ReMDM also improves sample quality of masked diffusion models for discretized images, and in scientific domains such as molecule design, ReMDM facilitates diffusion guidance and pushes the Pareto frontier of controllability relative to classical masking and uniform noise diffusion. We provide the code along with a blog post on the project page: https://remdm.github.io.
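To make the remasking backward process concrete, below is a minimal sketch of a single ReMDM-style reverse step applied to a generic pretrained masked diffusion denoiser. The names (`remdm_step`, `denoiser`, `mask_id`) and the schedule inputs `alpha_t`, `alpha_s`, `sigma_t` are illustrative assumptions, not the released API: already-decoded tokens are remasked with probability σ_t, and masked tokens are unmasked with a compensating probability chosen so that the marginal masking rate of standard masked diffusion is preserved.

```python
# Minimal sketch of one ReMDM-style reverse step (illustrative names,
# not the released API). A pretrained masked diffusion model `denoiser`
# predicts clean tokens; sigma_t is the probability of remasking an
# already-decoded token, which enables iterative refinement.
import torch

def remdm_step(z_t, denoiser, alpha_t, alpha_s, sigma_t, mask_id):
    # Note: sigma_t must lie in [0, min(1, (1 - alpha_s) / alpha_t)]
    # so that the transition probabilities below stay in [0, 1].
    probs = torch.softmax(denoiser(z_t), dim=-1)   # (batch, length, vocab)

    # Decoded positions: remask with probability sigma_t, otherwise keep.
    decoded = z_t != mask_id
    u = torch.rand(z_t.shape, device=z_t.device)
    z_s = torch.where(decoded & (u < sigma_t),
                      torch.full_like(z_t, mask_id), z_t)

    # Masked positions: stay masked with probability
    # (1 - alpha_s - alpha_t * sigma_t) / (1 - alpha_t), chosen so the
    # fraction of masked tokens still matches the noise schedule;
    # otherwise sample a token from the denoiser's prediction.
    stay = (1.0 - alpha_s - alpha_t * sigma_t) / (1.0 - alpha_t)
    proposal = torch.distributions.Categorical(probs=probs).sample()
    masked = z_t == mask_id
    v = torch.rand(z_t.shape, device=z_t.device)
    z_s = torch.where(masked & (v >= stay), proposal, z_s)
    return z_s
```

Setting σ_t = 0 recovers the standard masked diffusion reverse step, where decoded tokens are frozen; any σ_t > 0 gives the sampler a chance to revisit and correct earlier errors.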
Community
TL;DR: A simple and general framework to design remasking samplers for masked discrete diffusion models
Project Page: https://remdm.github.io
Our contributions:
- We introduce the remasking diffusion model (ReMDM) sampler and several add-on components that bring performant iterative refinement via remasking to masked diffusion models.
- We show that our method is a form of ancestral sampling in a probabilistic model whose ELBO is similar to that of classical masked diffusion. This analysis motivates applying our sampler on top of pretrained models, which we find works well.
- Across natural language, discretized images, and molecule string representations, we demonstrate empirically that ReMDM endows masked diffusion with inference-time scaling that improves sample quality with more computation and also enhances controlled generation (see the sampling-loop sketch after this list).
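As a sketch of where inference-time scaling enters, the hypothetical loop below repeatedly applies `remdm_step` from the snippet above; increasing `num_steps` spends more compute on remask-and-correct refinement. The linear noise schedule and the max-capped remasking schedule σ_t = min(η, (1 − α_s)/α_t) are assumptions for illustration (the cap keeps every transition probability in [0, 1]), not necessarily the paper's defaults.

```python
# Hypothetical end-to-end sampler (assumes remdm_step from the sketch
# above). More steps means more chances to remask and correct errors,
# which is the inference-time scaling knob described in this bullet.
import torch

def remdm_sample(denoiser, length, num_steps, mask_id, eta=0.05):
    alpha = lambda t: 1.0 - t                    # assumed linear schedule
    z = torch.full((1, length), mask_id, dtype=torch.long)
    for i in range(num_steps, 0, -1):
        t, s = i / num_steps, (i - 1) / num_steps
        alpha_t, alpha_s = alpha(t), alpha(s)
        # Cap sigma_t so every transition probability stays valid.
        sigma_max = (1.0 - alpha_s) / alpha_t if alpha_t > 0 else 1.0
        sigma_t = min(eta, sigma_max)
        z = remdm_step(z, denoiser, alpha_t, alpha_s, sigma_t, mask_id)
    return z
```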
Related papers recommended by the Semantic Scholar API:
- Continuous Diffusion Model for Language Modeling (2025)
- Theoretical Benefit and Limitation of Diffusion Language Model (2025)
- Distributional Diffusion Models with Scoring Rules (2025)
- A General Framework for Inference-time Scaling and Steering of Diffusion Models (2025)
- Large Language Diffusion Models (2025)
- Dynamic Search for Inference-Time Alignment in Diffusion Models (2025)
- Split Gibbs Discrete Diffusion Posterior Sampling (2025)