ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video (ECCV 2024)

This repo hosts the official model checkpoints for "ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video" (ECCV 2024).

Models

We provide the checkpoints before reparameterization. To reparameterize the weights, refer to tools\weight_reparam.py in our code.
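
For readers unfamiliar with the idea, here is a minimal sketch of what reparameterization means in general. It is not the contents of tools\weight_reparam.py. It assumes a purely linear residual adapter (no nonlinearity) placed directly before a frozen linear layer, and every name in it (`matmul`, `merge_linear_adapter`, the toy matrices) is hypothetical. Under those assumptions the adapter can be folded into the frozen weight, so the merged model has no extra parameters at inference. Plain Python lists are used to keep the sketch dependency-free; real checkpoints would be handled as tensors.

```python
# Illustrative sketch only: NOT the repo's weight_reparam.py.
# Assumption: adapter is x -> x + x @ down @ up (linear, residual),
# followed by a frozen linear layer x -> x @ w. All names are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    """n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def merge_linear_adapter(w, down, up):
    """Fold the adapter into the following layer: (I + down @ up) @ w."""
    d = len(w)  # input feature dimension of the frozen layer
    return matmul(add(identity(d), matmul(down, up)), w)

# Toy shapes: feature dim 3, bottleneck dim 2, output dim 2.
x = [[1.0, 2.0, 3.0]]
w = [[0.5, -1.0], [0.25, 0.0], [1.0, 2.0]]
down = [[0.1, 0.0], [0.0, 0.2], [0.3, 0.0]]
up = [[1.0, 0.0, -1.0], [0.0, 0.5, 0.0]]

# Before merging: apply the adapter, then the frozen layer.
adapted = add(x, matmul(matmul(x, down), up))
y_two_step = matmul(adapted, w)

# After merging: a single layer with the folded weight.
w_merged = merge_linear_adapter(w, down, up)
y_merged = matmul(x, w_merged)

# The two paths agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(y_two_step[0], y_merged[0]))
```

The merged weight has the same shape as the original `w`, which is why a model reparameterized this way keeps the parameter count and FLOPs of the plain backbone.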

Kinetics 400

Backbone	Pretrain	GFLOPs	Param (M)	acc@1	Views
ViT-B/16	CLIP	422	86	83.0	8x1x3
ViT-L/14	CLIP	1946	304	86.3	8x1x3
ViT-L/14	CLIP	7783	304	87.2	32x1x3

Something-Something V2

Backbone	Pretrain	GFLOPs	Param (M)	New Param (M)	acc@1	Views
ViT-L/14	CLIP	7783	304	0	72.2	32x3x1

If you find our work useful in your research, please cite:

@article{li2023zeroi2v,
  title={ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video},
  author={Li, Xinhao and Zhu, Yuhan and Wang, Limin},
  journal={arXiv preprint arXiv:2310.01324},
  year={2023}
}