metadata

license: other
license_name: license
license_link: LICENSE
pipeline_tag: image-to-image
tags:
  - Image Super-resolution
  - Diffusion Inversion

InvSR Model Card

This model card focuses on the models associated with the InvSR project, which is available here.

Model Details

Developed by: Zongsheng Yue
Model type: Arbitrary-steps Image Super-resolution via Diffusion Inversion
Model Description: This is the model used in Paper.
Resources for more information: GitHub Repository.

Cite as:

@article{yue2024invSR,
  author    = {Zongsheng Yue and Kang Liao and Chen Change Loy},
  title     = {Arbitrary-steps Image Super-resolution via Diffusion Inversion},
  journal   = {arXiv preprint arXiv:2412.09013},
  year      = {2024},
}

Limitations and Bias

Limitations

InvSR processes high-resolution images tile by tile, which substantially increases inference time.
Because InvSR is generative, its output is not always fully faithful to the input image.
InvSR may fail to produce accurate details in complex real-world scenes.
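
To illustrate why tiling raises inference time, the sketch below computes overlapping tile boxes for an image of a given size. It is not InvSR's actual tiling code: the function names `tile_starts` and `tile_boxes`, the 512-pixel tile size, and the 64-pixel overlap are all assumptions chosen for illustration. Each box would need its own model call, so the cost grows with the number of tiles.

```python
from typing import List, Tuple


def tile_starts(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles that cover [0, length) along one axis.

    Neighbouring tiles share at least `overlap` pixels; the last tile is
    shifted back so that it ends exactly at the image border.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        return [0]  # a single (possibly smaller) tile covers the axis
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile flush with the border
    return starts


def tile_boxes(
    height: int, width: int, tile: int = 512, overlap: int = 64
) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) boxes covering the whole image.

    The tile size and overlap defaults are illustrative, not InvSR settings.
    """
    return [
        (top, left, min(top + tile, height), min(left + tile, width))
        for top in tile_starts(height, tile, overlap)
        for left in tile_starts(width, tile, overlap)
    ]


if __name__ == "__main__":
    for side in (512, 1024, 2048):
        print(f"{side}x{side}: {len(tile_boxes(side, side))} tile(s)")
```

Under these assumed settings, the number of tiles grows roughly quadratically with the side length of the image, which is consistent with the slowdown described above.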

Bias

Although our model is built on a pre-trained SD-Turbo model, we have not so far observed obvious bias in the generated results.

Training

Training Data: The model developers used the following datasets for training the model:

Our model is fine-tuned on the LSDIR dataset plus 20K samples from the FFHQ dataset.

Training Procedure: InvSR performs image super-resolution by applying a diffusion inversion technique to SD-Turbo. The detailed training pipeline can be found in our GitHub repo.

We currently provide the following checkpoints:

noise_predictor_sd_turbo_v5.pth: Noise estimation network trained for SD-Turbo.

Evaluation Results

See Paper for details.