📢 [v0.27.0] DDUF tooling, torch model loading helpers & multiple quality of life improvements and bug fixes

#10
by celinah HF staff - opened

Note: pre-release 0.27.0.rc0 is available on PyPI. Official release will occur in the coming days.
EDIT: it's shipped!

📦 Introducing DDUF tooling

DDUF (DDUF's Diffusion Unified Format) is a single-file format for diffusion models. It aims to unify the different model distribution methods and weight-saving formats by packaging all model components into a single file. Detailed documentation for the format will follow soon.

The huggingface_hub library now provides tooling to handle DDUF files in Python. It includes helpers to read and export DDUF files, and built-in rules to validate file integrity.

How to write a DDUF file?

>>> from huggingface_hub import export_folder_as_dduf

# Export "path/to/FLUX.1-dev" folder as a DDUF file
>>> export_folder_as_dduf("FLUX.1-dev.dduf", folder_path="path/to/FLUX.1-dev")

How to read a DDUF file?

>>> import json
>>> import safetensors.torch
>>> from huggingface_hub import read_dduf_file

# Read DDUF metadata (only metadata is loaded, lightweight operation)
>>> dduf_entries = read_dduf_file("FLUX.1-dev.dduf")

# Returns a mapping filename <> DDUFEntry
>>> dduf_entries["model_index.json"]
DDUFEntry(filename='model_index.json', offset=66, length=587)

# Load the `model_index.json` content
>>> json.loads(dduf_entries["model_index.json"].read_text())
{'_class_name': 'FluxPipeline', '_diffusers_version': '0.32.0.dev0', '_name_or_path': 'black-forest-labs/FLUX.1-dev', 'scheduler': ['diffusers', 'FlowMatchEulerDiscreteScheduler'], 'text_encoder': ['transformers', 'CLIPTextModel'], 'text_encoder_2': ['transformers', 'T5EncoderModel'], 'tokenizer': ['transformers', 'CLIPTokenizer'], 'tokenizer_2': ['transformers', 'T5TokenizerFast'], 'transformer': ['diffusers', 'FluxTransformer2DModel'], 'vae': ['diffusers', 'AutoencoderKL']}

# Load VAE weights using safetensors
>>> with dduf_entries["vae/diffusion_pytorch_model.safetensors"].as_mmap() as mm:
...     state_dict = safetensors.torch.load(mm)

⚠️ Note that this is a very early version of the parser. The API and implementation may evolve in the near future.
👉 More details about the API in the documentation here.
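
The `offset` and `length` fields on `DDUFEntry` suggest how a lightweight read can work: if each entry is stored uncompressed, its bytes can be located once and then memory-mapped directly. The sketch below illustrates that idea with the standard library only. It assumes a DDUF file behaves like an uncompressed ZIP archive (which this post does not state), and `entry_span` and `read_entry` are hypothetical helper names, not part of huggingface_hub.

```python
import mmap
import os
import struct
import tempfile
import zipfile

# Fixed-size part of a ZIP local file header (30 bytes, little-endian).
_LOCAL_HEADER = struct.Struct("<4s5H3I2H")


def entry_span(archive_path, name):
    """Return (offset, length) of the raw bytes of a stored (uncompressed) entry."""
    with zipfile.ZipFile(archive_path) as zf:
        info = zf.getinfo(name)
    if info.compress_type != zipfile.ZIP_STORED:
        raise ValueError(f"{name} is compressed; its bytes cannot be mapped directly")
    with open(archive_path, "rb") as f:
        f.seek(info.header_offset)
        fields = _LOCAL_HEADER.unpack(f.read(_LOCAL_HEADER.size))
    signature, name_len, extra_len = fields[0], fields[9], fields[10]
    if signature != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    # The payload starts right after the fixed header, the file name and the extra field.
    offset = info.header_offset + _LOCAL_HEADER.size + name_len + extra_len
    return offset, info.file_size


def read_entry(archive_path, name):
    """Read one entry through a memory map, touching only that entry's bytes."""
    offset, length = entry_span(archive_path, name)
    with open(archive_path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset : offset + length]


# Build a toy archive (hypothetical content) and read one entry back.
payload = b'{"_class_name": "FluxPipeline"}'
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.dduf")
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr("model_index.json", payload)
        zf.writestr("vae/config.json", b"{}")
    span = entry_span(path, "model_index.json")
    recovered = read_entry(path, "model_index.json")
```

In real code, prefer `read_dduf_file` and `DDUFEntry.as_mmap()` as shown above; they also apply the library's own validation rules, which this sketch does not attempt to reproduce.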

DDUF parser v0.1 by @Wauplin in #2692

💾 Serialization

Following the introduction of the torch serialization module in 0.22.* and the support of saving torch state dict to disk in 0.24.*, we now provide helpers to load torch state dicts from disk.
By centralizing these functionalities in huggingface_hub, we ensure a consistent implementation across the HF ecosystem while allowing external libraries to benefit from standardized weight handling.

>>> from huggingface_hub import load_torch_model, load_state_dict_from_file

# load state dict from a single file
>>> state_dict = load_state_dict_from_file("path/to/weights.safetensors")

# Directly load weights into a PyTorch model
>>> model = ... # A PyTorch model
>>> load_torch_model(model, "path/to/checkpoint")

More details in the serialization package reference.

[Serialization] support loading torch state dict from disk by @celinah in #2687

We added a flag to save_torch_state_dict() helper to properly handle model saving in distributed environments, aligning with existing implementations across the Hugging Face ecosystem:

[Serialization] Add is_main_process argument to save_torch_state_dict() by @celinah in #2648

A bug with shared tensor handling reported in transformers#35080 has been fixed:

add argument to pass shared tensors keys to discard by @celinah in #2696

✨ HfApi

The following changes align the client with server-side updates in how security metadata is handled and exposed in API responses. In particular, the repository security status returned by HfApi().model_info() is now available in the security_repo_status field:

from huggingface_hub import HfApi

api = HfApi()

model = api.model_info("your_model_id", securityStatus=True)

# get security status info of your model
- security_info = model.securityStatus
+ security_info = model.security_repo_status
  • Update how file's security metadata is retrieved following changes in the API response by @celinah in #2621
  • Expose repo security status field in ModelInfo by @celinah in #2639

🌐 📚 Documentation

Thanks to @miaowumiaomiaowu, more documentation is now available in Chinese! And thanks to @13579606 for reviewing these PRs. Check out the result here.

📝 Translating docs to Simplified Chinese by @miaowumiaomiaowu in #2689, #2704 and #2705.

💔 Breaking changes

A few breaking changes have been introduced:

  • RepoCardData serialization now preserves None values in nested structures.
  • InferenceClient.image_to_image() now takes a target_size argument instead of height and width arguments. This has been reflected in the async equivalent of InferenceClient as well.
  • InferenceClient.table_question_answering() no longer accepts a parameter argument. This has been reflected in the async equivalent of InferenceClient as well.
  • Due to low usage, list_metrics() has been removed from HfApi.
  • Do not remove None values in RepoCardData serialization by @Wauplin in #2626
  • manually update chat completion params by @celinah in #2682
  • [Bot] Update inference types #2688
  • rm list_metrics by @julien-c in #2702
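
To make the RepoCardData change above concrete, here is a minimal, library-free sketch of the difference between stripping `None` at every nesting level and stripping it only at the top level. Both helper functions and the sample data are hypothetical. Mapping the recursive variant to the previous behavior and the top-level variant to the new one is an assumption based on the wording of the change, so check the linked PR for the exact semantics.

```python
def drop_none_recursive(value):
    """Assumed previous behavior: strip None at every nesting level."""
    if isinstance(value, dict):
        return {k: drop_none_recursive(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_none_recursive(v) for v in value if v is not None]
    return value


def drop_none_top_level(data):
    """Assumed new behavior: only top-level None entries are dropped."""
    return {k: v for k, v in data.items() if v is not None}


# Hypothetical card data with a None at the top level and one nested deeper.
card_data = {
    "license": "mit",
    "language": None,
    "eval": {"metric": "accuracy", "value": None},
}

before = drop_none_recursive(card_data)
after = drop_none_top_level(card_data)
```

With this input, the nested `"value": None` survives only in `after`. Downstream code that relied on nested keys being absent may need to handle explicit `None` values.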

⏳ Deprecations

Some deprecations have been introduced as well:

  • Legacy token permission checks are deprecated, as they are no longer relevant with fine-grained tokens. This includes is_write_action in build_hf_headers() and write_permission=True in login methods. get_token_permission has been deprecated as well.
  • The labels argument is deprecated in InferenceClient.zero_shot_classification() and InferenceClient.image_zero_shot_classification(). This has been reflected in the async equivalent of InferenceClient as well.
  • Deprecate is_write_action and write_permission=True when login by @Wauplin in #2632
  • Fix and deprecate get_token_permission by @Wauplin in #2631
  • [Inference Client] fix param docstring and deprecate labels param in zero-shot classification tasks by @celinah in #2668

🛠️ Small fixes and maintenance

😌 QoL improvements

  • Add utf8 encoding to read_text to avoid Windows charmap crash by @tomaarsen in #2627
  • Add user CLI unit tests by @celinah in #2628
  • Update consistent error message (we can't do much about it) by @Wauplin in #2641
  • Warn about upload_large_folder if really large folder by @Wauplin in #2656
  • Support context manager in commit scheduler by @Wauplin in #2670
  • Fix autocompletion not working with ModelHubMixin by @Wauplin in #2695
  • Enable tqdm progress in cloud environments by @cbensimon in #2698

🐛 Bug and typo fixes

  • bugfix huggingface-cli command execution in python3.8 by @PineApple777 in #2620
  • Fix documentation link formatting in README_cn by @BrickYo in #2615
  • Update hf_file_system.md by @SwayStar123 in #2616
  • Fix download local dir edge case (remove lru_cache) by @Wauplin in #2629
  • Fix typos by @omahs in #2634
  • Fix ModelCardData's datasets typing by @celinah in #2644
  • Fix HfFileSystem.exists() for deleted repos and update documentation by @celinah in #2643
  • Fix max tokens default value in text generation and chat completion by @celinah in #2653
  • Fix sorting properties by @celinah in #2655
  • Don't write the ref file unless necessary by @d8ahazard in #2657
  • update attribute used in delete_collection_item docstring by @davanstrien in #2659
  • 🐛: Fix bug by ignoring specific files in cache manager by @johnmai-dev in #2660
  • Bug in model_card_consistency_reminder.yml by @deanwampler in #2661
  • [Inference Client] fix zero_shot_image_classification's parameters by @celinah in #2665
  • Use asyncio.sleep in AsyncInferenceClient (not time.sleep) by @Wauplin in #2674
  • Make sure create_repo respect organization privacy settings by @Wauplin in #2679
  • Fix timestamp parsing to always include milliseconds by @celinah in #2683
  • will be used by @julien-c in #2701
  • remove context manager when loading shards and handle mlx weights by @celinah in #2709
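
One fix in the list above replaces time.sleep with asyncio.sleep in AsyncInferenceClient. The sketch below does not use huggingface_hub; it shows in general terms why this matters inside a coroutine. The function names are hypothetical, and where exactly the client waits is not covered here.

```python
import asyncio
import time


async def wait_blocking(delay):
    # time.sleep blocks the whole event loop: no other task can run meanwhile.
    time.sleep(delay)


async def wait_cooperative(delay):
    # asyncio.sleep yields control, so other tasks make progress during the wait.
    await asyncio.sleep(delay)


async def run_pair(worker, delay):
    """Run two workers concurrently and return the elapsed wall-clock time."""
    start = time.perf_counter()
    await asyncio.gather(worker(delay), worker(delay))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(run_pair(wait_blocking, 0.2))
cooperative_elapsed = asyncio.run(run_pair(wait_cooperative, 0.2))
print(f"blocking: {blocking_elapsed:.2f}s, cooperative: {cooperative_elapsed:.2f}s")
```

With the blocking variant the two waits run back to back, while the cooperative variant lets them overlap, so concurrent requests are not stalled by one task's wait.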

🏗️ internal

  • prepare for release v0.27 by @celinah in #2622
  • Support python 3.13 by @celinah in #2636
  • Add CI to auto-generate inference types by @Wauplin in #2600
  • [InferenceClient] Automatically handle outdated task parameters by @celinah in #2633
  • Fix logo in README when dark mode is on by @celinah in #2669
  • Fix lint after ruff update by @Wauplin in #2680
  • Fix test_list_spaces_linked by @Wauplin in #2707

I've been waiting for load_torch_model! it will be useful for DDUF when we can use the Inference API (including via gr.load) just by placing DDUF in our repo, but probably still a while away. it will be easy to place like LoRA.
