CheckpointError in `triton_flash_blocksparse_attn.py` while finetuning
#18 opened by FremyCompany
While trying to finetune this model, I encountered an error with the backward pass:
File "/root/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-small-128k-instruct/f80aaa30bfc64c2b8ab214b541d9050e97163bc4/triton_flash_blocksparse_attn.py", line 904, in backward
return _backward(ctx, do, *backward_layout)[:4]
File "/root/.cache/huggingface/modules/transformers_modules/microsoft/Phi-3-small-128k-instruct/f80aaa30bfc64c2b8ab214b541d9050e97163bc4/triton_flash_blocksparse_attn.py", line 655, in _backward
q, k, v, o, l, m, layout_crow_indices, layout_col_indices = ctx.saved_tensors
File "/usr/local/lib/python3.10/dist-packages/torch/utils/checkpoint.py", line 1118, in unpack_hook
raise CheckpointError(
torch.utils.checkpoint.CheckpointError: torch.utils.checkpoint: Unpack is being triggered for a tensor that was already unpacked once. If you are calling ctx.saved_tensors in backward, make sure to do so only once. Otherwise please open an issue with details on your use case.
Any idea how I could fix this issue?
Nevermind, after looking deeper into the other issues in the Phi-3 repositories, I was able to trace the error to the use of `use_reentrant=False` in the Trainer configuration; `use_reentrant=True` is apparently required for Phi-3 small.
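For anyone who hits the same error, here is a minimal sketch of the workaround, assuming a standard `Trainer` setup; the output directory, `model`, and `train_dataset` below are placeholders for your own objects:

```python
from transformers import Trainer, TrainingArguments

# Placeholder arguments; the gradient-checkpointing settings are the
# point of this sketch, everything else is illustrative.
training_args = TrainingArguments(
    output_dir="./phi3-small-finetune",  # hypothetical output path
    gradient_checkpointing=True,
    # Phi-3 small's blocksparse attention backward conflicts with the
    # non-reentrant checkpointing implementation, so force the
    # reentrant variant instead of the default.
    gradient_checkpointing_kwargs={"use_reentrant": True},
)

trainer = Trainer(
    model=model,                  # your Phi-3-small model
    args=training_args,
    train_dataset=train_dataset,  # your dataset
)
trainer.train()
```

If you enable checkpointing on the model directly rather than through `TrainingArguments`, the same kwarg can be passed there as well, e.g. `model.gradient_checkpointing_enable(gradient_checkpointing_kwargs={"use_reentrant": True})`.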