SEED Multimodal

Project Homepage

Online demo for SEED-LLaMA

Powered by CV Center, Tencent AI Lab, and ARC Lab, Tencent PCG.

Usage

Dependencies

Installation

Clone the repo and install the required packages:

git clone https://github.com/AILab-CVC/SEED.git
cd SEED
pip install -r requirements.txt

Model Weights

We release the pretrained SEED tokenizer and de-tokenizer, as well as the pretrained and instruction-tuned SEED-LLaMA-8B and SEED-LLaMA-14B, in the SEED Hugging Face repository. Please download the checkpoints and save them under the folder ./pretrained.

You can also download them separately from the command line, as follows:

cd pretrained   # SEED/pretrained
git lfs install
git clone https://huggingface.co./AILab-CVC/SEED
mv SEED/* ./
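If you prefer a programmatic download, the snippet below is a minimal sketch using the huggingface_hub library (an extra dependency, not listed in requirements.txt); it fetches the same checkpoints into SEED/pretrained without git-lfs:

# Minimal sketch: download the SEED checkpoints with huggingface_hub
# (assumes `pip install huggingface_hub`; run from the SEED/ directory).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="AILab-CVC/SEED",
    local_dir="pretrained",   # SEED/pretrained
)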

To reconstruct images from SEED visual codes using the unCLIP SD-UNet, please download the pretrained unCLIP SD checkpoint. Rename the checkpoint directory to "diffusion_model" and place it (or a soft link to it) under the "pretrained/seed_tokenizer" directory.

# SEED/pretrained
git lfs install
git clone https://huggingface.co./stabilityai/stable-diffusion-2-1-unclip
mv stable-diffusion-2-1-unclip seed_tokenizer/diffusion_model
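If you would rather keep the unCLIP SD checkout where it is, the soft-link alternative works too. A minimal Python sketch (paths assumed from the layout above, run from SEED/pretrained):

# Sketch of the soft-link alternative: link the downloaded unCLIP SD
# checkpoint into pretrained/seed_tokenizer instead of moving it.
import os

src = os.path.abspath("stable-diffusion-2-1-unclip")     # downloaded checkpoint
dst = os.path.join("seed_tokenizer", "diffusion_model")
if not os.path.exists(dst):
    os.symlink(src, dst, target_is_directory=True)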

Inference for visual tokenization and de-tokenization

To discretize an image into 1D visual codes with causal dependency, and to reconstruct the image from those codes using the off-the-shelf unCLIP SD-UNet:

cd ..   # SEED/ 
python scripts/seed_tokenizer_inference.py
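In code, the round trip looks roughly like the sketch below. The module path and the encode/decode method names are assumptions for illustration only; see scripts/seed_tokenizer_inference.py for the actual API:

# Illustrative sketch only: SeedLlamaTokenizer and its encode/decode methods
# are hypothetical stand-ins for the classes used in
# scripts/seed_tokenizer_inference.py.
from PIL import Image
from models.seed_llama_tokenizer import SeedLlamaTokenizer  # assumed module path

tokenizer = SeedLlamaTokenizer.from_pretrained("pretrained/seed_tokenizer")

image = Image.open("example.jpg").convert("RGB")
codes = tokenizer.encode_image(image)   # 1D discrete codes with causal dependency
recon = tokenizer.decode_image(codes)   # reconstruction via the unCLIP SD-UNet
recon.save("example_recon.jpg")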

Launching Gradio Demo of SEED-LLaMA-14B Locally

Building the local demo of SEED-LLaMA-14B currently requires two GPUs with 32GB of memory each.

# SEED/
# in first terminal
sh scripts/start_backend.sh
# in second terminal
sh scripts/start_frontend.sh

The demo can then be accessed at http://127.0.0.1:80.
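The two scripts reflect a simple split: the backend hosts the model on the GPUs, while a lightweight Gradio frontend forwards user requests to it over HTTP. The sketch below illustrates that pattern; the endpoint URL and JSON fields are hypothetical, not the repo's actual API:

# Illustrative frontend sketch: a Gradio app that forwards prompts to a
# separately running model backend. BACKEND_URL and the JSON schema are
# hypothetical assumptions, not the repo's actual interface.
import gradio as gr
import requests

BACKEND_URL = "http://127.0.0.1:8000/generate"  # assumed backend address

def chat(prompt: str) -> str:
    resp = requests.post(BACKEND_URL, json={"prompt": prompt}, timeout=300)
    resp.raise_for_status()
    return resp.json().get("response", "")

gr.Interface(fn=chat, inputs="text", outputs="text").launch(
    server_name="127.0.0.1", server_port=80  # binding port 80 may require elevated privileges
)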

Citation

If you find the work helpful, please consider citing:

@article{ge2023making,
  title={Making LLaMA SEE and Draw with SEED Tokenizer},
  author={Ge, Yuying and Zhao, Sijie and Zeng, Ziyun and Ge, Yixiao and Li, Chen and Wang, Xintao and Shan, Ying},
  journal={arXiv preprint arXiv:2310.01218},
  year={2023}
}

@article{ge2023planting,
  title={Planting a seed of vision in large language model},
  author={Ge, Yuying and Ge, Yixiao and Zeng, Ziyun and Wang, Xintao and Shan, Ying},
  journal={arXiv preprint arXiv:2307.08041},
  year={2023}
}

The project is still in progress. Stay tuned for more updates!

License

SEED is released under Apache License Version 2.0.

SEED-LLaMA is released under the original license of LLaMA2.

Acknowledgement

We thank the authors of unCLIP SD and BLIP2 for their great work.
