Helpful Utilities
Below are a variety of utility functions that 🤗 Accelerate provides, broken down by use case.
Data Classes
These are basic dataclasses used throughout 🤗 Accelerate, and they can be passed in as parameters.
class accelerate.DistributedType
( value, names = None, module = None, qualname = None, type = None, start = 1 )
Represents a type of distributed environment.
Values:
- NO: Not a distributed environment, just a single process.
- MULTI_CPU: Distributed on multiple CPU nodes.
- MULTI_GPU: Distributed on multiple GPUs.
- DEEPSPEED: Using DeepSpeed.
- TPU: Distributed on TPUs.
class accelerate.utils.LoggerType
( value, names = None, module = None, qualname = None, type = None, start = 1 )
Represents a type of supported experiment tracker.
Values:
- ALL: all available trackers in the environment that are supported
- TENSORBOARD: TensorBoard as an experiment tracker
- WANDB: wandb as an experiment tracker
- COMETML: comet_ml as an experiment tracker
class accelerate.utils.PrecisionType
( value, names = None, module = None, qualname = None, type = None, start = 1 )
Represents a type of precision used on floating point values.
Values:
- NO: using full precision (FP32)
- FP16: using half precision
- BF16: using brain floating point precision
Data Manipulation and Operations
These are data operations that mimic the corresponding torch ops but can be used across distributed processes.
accelerate.utils.broadcast
( tensor, from_process: int = 0 )
Recursively broadcast tensor in a nested list/tuple/dictionary of tensors to all devices.
accelerate.utils.concatenate
( data, dim = 0 )
Recursively concatenate the tensors in a nested list/tuple/dictionary of lists of tensors with the same shape.
accelerate.utils.gather
( tensor )
Recursively gather tensor in a nested list/tuple/dictionary of tensors from all devices.
accelerate.utils.pad_across_processes
( tensor, dim = 0, pad_index = 0, pad_first = False )
Parameters
- tensor (nested list/tuple/dictionary of torch.Tensor): The data to gather.
- dim (int, optional, defaults to 0): The dimension on which to pad.
- pad_index (int, optional, defaults to 0): The value with which to pad.
- pad_first (bool, optional, defaults to False): Whether to pad at the beginning or the end.
Recursively pad the tensors in a nested list/tuple/dictionary of tensors from all devices to the same size so they can safely be gathered.
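To illustrate the padding step, here is a minimal pure-Python sketch that uses plain lists in place of tensors and a list of per-process values in place of real processes. The helper name pad_to_common_length is hypothetical and not part of the library; only the pad_index and pad_first parameters mirror the signature above.

```python
from typing import List


def pad_to_common_length(
    per_process: List[List[int]], pad_index: int = 0, pad_first: bool = False
) -> List[List[int]]:
    """Pad every list to the length of the longest one (hypothetical helper)."""
    target = max(len(seq) for seq in per_process)
    padded = []
    for seq in per_process:
        filler = [pad_index] * (target - len(seq))
        # pad_first=True puts the padding before the data, otherwise after it.
        padded.append(filler + seq if pad_first else seq + filler)
    return padded


# Three simulated processes holding sequences of different lengths.
batches = [[1, 2, 3], [4], [5, 6]]
padded_end = pad_to_common_length(batches)
padded_start = pad_to_common_length(batches, pad_index=-1, pad_first=True)
```

Once every entry has the same length, the values can be stacked or gathered without shape mismatches, which is the point of padding first.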
accelerate.utils.reduce
( tensor, reduction = 'mean' )
Recursively reduce the tensors in a nested list/tuple/dictionary of lists of tensors across all processes using the given reduction operation (the mean by default).
accelerate.utils.send_to_device
( tensor, device )
Recursively sends the elements in a nested list/tuple/dictionary of tensors to a given device.
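The operations in this section all work "recursively" on nested containers. A rough sketch of what that traversal could look like is shown below, assuming plain Python values as leaves instead of tensors. The helper apply_nested is hypothetical and is not the library's implementation; the string suffix only simulates moving a leaf to a device.

```python
from typing import Any, Callable


def apply_nested(func: Callable[[Any], Any], data: Any) -> Any:
    """Apply func to every leaf of a nested list/tuple/dict (hypothetical helper)."""
    if isinstance(data, (list, tuple)):
        # Rebuild the same container type around the transformed items.
        return type(data)(apply_nested(func, item) for item in data)
    if isinstance(data, dict):
        return {key: apply_nested(func, value) for key, value in data.items()}
    # Anything else is treated as a leaf (a tensor in the real utilities).
    return func(data)


batch = {"input_ids": [1, 2], "labels": (3, {"aux": 4})}
moved = apply_nested(lambda leaf: f"{leaf}@cuda:0", batch)
```

The container structure is preserved, so the caller gets back the same nesting with only the leaves changed.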
Environment Checks
These functions check the state of the current working environment, including information about the operating system itself, what it can support, and whether particular dependencies are installed.
Checks if bf16 is supported, optionally ignoring the TPU.
accelerate.utils.is_torch_version
( operation: str, version: str )
Compares the current PyTorch version to a given reference with an operation.
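As a standalone illustration of comparing a version against a reference with an operation given as a string, consider the sketch below. It does not call the library: compare_versions and parse_version are hypothetical names, the set of accepted operators is an assumption, and the current version is passed explicitly rather than read from the installed PyTorch.

```python
import operator
from typing import Tuple

# Map the operation string to the matching comparison function.
OPERATIONS = {
    "<": operator.lt, "<=": operator.le, "==": operator.eq,
    "!=": operator.ne, ">=": operator.ge, ">": operator.gt,
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.12.1' into (1, 12, 1); suffixes like '+cu113' are not handled."""
    return tuple(int(part) for part in version.split("."))


def compare_versions(current: str, operation: str, reference: str) -> bool:
    if operation not in OPERATIONS:
        raise ValueError(f"operation must be one of {sorted(OPERATIONS)}")
    return OPERATIONS[operation](parse_version(current), parse_version(reference))


newer = compare_versions("1.12.1", ">=", "1.10.0")
older = compare_versions("1.9.0", ">=", "1.10.0")
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.9.0" would sort after "1.10.0".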
Checks if torch_xla is installed and potentially if a TPU is in the environment.
Environment Configuration
accelerate.utils.write_basic_config
( mixed_precision = 'no', save_location: str = '/github/home/.cache/huggingface/accelerate/default_config.yaml' )
Parameters
- mixed_precision (str, optional, defaults to "no"): Mixed precision to use. Should be one of "no", "fp16", or "bf16".
- save_location (str, optional, defaults to default_json_config_file): Optional custom save location. Should be passed to --config_file when using accelerate launch. The default location is inside the Hugging Face cache folder (~/.cache/huggingface), which can be overridden by setting the HF_HOME environment variable, followed by accelerate/default_config.yaml.
Creates and saves a basic cluster config to be used on a local machine with potentially multiple GPUs. Will also set CPU if it is a CPU-only machine.
When setting up 🤗 Accelerate for the first time, accelerate.utils.write_basic_config can be used as a quick alternative to running accelerate config.
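The default save location described above can be sketched as a small path-resolution helper. This is only an illustration of the documented rule (HF_HOME, if set, replaces ~/.cache/huggingface); default_config_path is a hypothetical name, and the library's own lookup may consult other variables as well.

```python
import os
from typing import Mapping, Optional


def default_config_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Resolve the default config file location (hypothetical helper)."""
    env = os.environ if env is None else env
    default_cache = os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    # HF_HOME, when set, replaces the default Hugging Face cache folder.
    cache_home = env.get("HF_HOME") or default_cache
    return os.path.join(cache_home, "accelerate", "default_config.yaml")


overridden = default_config_path({"HF_HOME": "/data/hf"})
fallback = default_config_path({})
```

Passing the environment as an argument keeps the sketch easy to try without changing real environment variables.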
Memory
accelerate.utils.get_max_memory
( max_memory: typing.Union[typing.Dict[typing.Union[int, str], typing.Union[int, str]], NoneType] = None )
Gets the maximum memory available if nothing is passed; otherwise converts string values to int.
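The string-to-int conversion could look roughly like the sketch below. The helper names and the unit table are assumptions made for illustration; the suffixes the library actually accepts may differ.

```python
from typing import Dict, Union

# Assumed unit table for this sketch; the library may accept other suffixes.
UNITS = {"GiB": 2**30, "MiB": 2**20, "KiB": 2**10}


def size_to_int(size: Union[int, str]) -> int:
    """Convert '10GiB' to a number of bytes; integers are returned unchanged."""
    if isinstance(size, int):
        return size
    for suffix, factor in UNITS.items():
        if size.endswith(suffix):
            return int(size[: -len(suffix)]) * factor
    raise ValueError(f"unrecognized size: {size!r}")


def normalize_max_memory(
    max_memory: Dict[Union[int, str], Union[int, str]]
) -> Dict[Union[int, str], int]:
    """Return the same device-to-limit mapping with every limit as an int."""
    return {device: size_to_int(limit) for device, limit in max_memory.items()}


limits = normalize_max_memory({0: "10GiB", "cpu": "512MiB", 1: 1024})
```

Keys follow the signature above: integers for device indices and strings such as "cpu" for other locations.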
accelerate.find_executable_batch_size
( function: callable = None, starting_batch_size: int = 128 )
A basic decorator that will try to execute function. If it fails from exceptions related to out-of-memory or CUDNN, the batch size is cut in half and passed to function. function must take in a batch_size parameter as its first argument.
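The retry-and-halve behavior can be sketched in plain Python as follows. This is not the library's code: halve_batch_size_on_oom is a hypothetical name, and MemoryError stands in for the out-of-memory and CUDNN errors mentioned above so that the example runs without a GPU.

```python
import functools
from typing import Callable


def halve_batch_size_on_oom(function: Callable = None, starting_batch_size: int = 128):
    """Retry `function` with a halved batch size after MemoryError (hypothetical)."""
    if function is None:
        # Support use as @halve_batch_size_on_oom(starting_batch_size=...).
        return functools.partial(halve_batch_size_on_oom, starting_batch_size=starting_batch_size)

    @functools.wraps(function)
    def wrapper(*args, **kwargs):
        batch_size = starting_batch_size
        while batch_size > 0:
            try:
                # batch_size is always passed as the first argument.
                return function(batch_size, *args, **kwargs)
            except MemoryError:
                batch_size //= 2
        raise RuntimeError("No executable batch size found, reached zero.")

    return wrapper


attempts = []


@halve_batch_size_on_oom(starting_batch_size=64)
def train(batch_size: int) -> int:
    attempts.append(batch_size)
    if batch_size > 16:
        raise MemoryError("simulated out-of-memory")
    return batch_size


final_batch_size = train()
```

Note that the decorated function is called without a batch size; the wrapper supplies it, which is why batch_size must be the first parameter.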
Modeling
These utilities relate to interacting with PyTorch models.
accelerate.utils.extract_model_from_parallel
( model ) → torch.nn.Module
Extract a model from its distributed containers.
accelerate.utils.get_max_layer_size
( modules: typing.List[typing.Tuple[str, torch.nn.modules.module.Module]], module_sizes: typing.Dict[str, int], no_split_module_classes: typing.List[str] ) → Tuple[int, List[str]]
Parameters
- modules (List[Tuple[str, torch.nn.Module]]): The list of named modules where we want to determine the maximum layer size.
- module_sizes (Dict[str, int]): A dictionary mapping each layer name to its size (as generated by compute_module_sizes).
- no_split_module_classes (List[str]): A list of class names for layers we don't want to be split.
Returns
Tuple[int, List[str]]: The maximum size of a layer, with the list of layer names realizing that maximum size.
Utility function that will scan a list of named modules and return the maximum size used by one full layer. A layer is defined as either:
- a module with no direct children (just parameters and buffers)
- a module whose class name is in the list no_split_module_classes
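The scan described above can be sketched with a toy module tree. Everything here is an assumption made for illustration: Node is a hypothetical stand-in for torch.nn.Module, its kind attribute plays the role of the class name, and max_layer_size is not the library's implementation.

```python
from typing import Dict, List, Optional, Tuple


class Node:
    """Toy stand-in for torch.nn.Module (hypothetical, for illustration only)."""

    def __init__(self, kind: str, children: Optional[Dict[str, "Node"]] = None):
        self.kind = kind  # plays the role of the module's class name
        self.children = children or {}


def max_layer_size(
    modules: List[Tuple[str, Node]],
    module_sizes: Dict[str, int],
    no_split_module_classes: List[str],
) -> Tuple[int, List[str]]:
    max_size, layer_names = 0, []
    to_visit = list(modules)
    while to_visit:
        name, module = to_visit.pop(0)
        if not module.children or module.kind in no_split_module_classes:
            # This module counts as one full layer.
            size = module_sizes[name]
            if size > max_size:
                max_size, layer_names = size, [name]
            elif size == max_size:
                layer_names.append(name)
        else:
            # Not a layer: look inside its direct children instead.
            to_visit = [(f"{name}.{n}", c) for n, c in module.children.items()] + to_visit
    return max_size, layer_names


block = Node("Block", {"attn": Node("Linear"), "mlp": Node("Linear")})
model = [("embed", Node("Embedding")), ("block", block)]
sizes = {"embed": 50, "block": 70, "block.attn": 30, "block.mlp": 40}

split_result = max_layer_size(model, sizes, [])
no_split_result = max_layer_size(model, sizes, ["Block"])
```

With no protected classes, the block is broken into its children and the embedding is the largest layer; marking "Block" as unsplittable makes the whole block count as one layer instead.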
accelerate.utils.offload_state_dict
( save_dir: typing.Union[str, os.PathLike], state_dict: typing.Dict[str, torch.Tensor] )
Offload a state dict in a given folder.
Parallel
These include general utilities that should be used when working in parallel.
accelerate.utils.extract_model_from_parallel
( model ) → torch.nn.Module
Extract a model from its distributed containers.
Save the data to disk. Use in place of torch.save().
Introduces a blocking point in the script, making sure all processes have reached this point before continuing.
Make sure all processes will reach this instruction otherwise one of your processes will hang forever.
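The blocking behavior described above is that of a barrier. As a minimal analogy, the sketch below uses threads from the standard library in place of distributed processes; it does not call 🤗 Accelerate, and the names worker and NUM_WORKERS are illustrative only.

```python
import threading

NUM_WORKERS = 4
barrier = threading.Barrier(NUM_WORKERS)
events = []


def worker(rank: int) -> None:
    events.append(("before", rank))
    # Blocks until all NUM_WORKERS threads have called wait().
    barrier.wait()
    events.append(("after", rank))


threads = [threading.Thread(target=worker, args=(rank,)) for rank in range(NUM_WORKERS)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Every "before" entry is recorded ahead of any "after" entry.
phases = [phase for phase, _ in events]
```

If one worker skipped the barrier.wait() call, the remaining workers would block indefinitely, which mirrors the warning about a process hanging forever.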
Random
These utilities relate to setting and synchronizing all the random states.
accelerate.utils.set_seed
( seed: int, device_specific: bool = False )
Helper function for reproducible behavior that sets the seed in random, numpy, and torch.
accelerate.utils.synchronize_rng_state
( rng_type: typing.Optional[accelerate.utils.dataclasses.RNGType] = None, generator: typing.Optional[torch._C.Generator] = None )
accelerate.synchronize_rng_states
( rng_types: typing.List[typing.Union[str, accelerate.utils.dataclasses.RNGType]], generator: typing.Optional[torch._C.Generator] = None )