Model card for kat_base_patch16_224.vitft

KAT model trained on ImageNet-1k (1 million images, 1,000 classes) at resolution 224x224. It was first introduced in the paper Kolmogorov–Arnold Transformer.

Model description

KAT is a model that replaces channel mixer in transfomrers with Group Rational Kolmogorov–Arnold Network (GR-KAN).

Usage

The model definition is at https://github.com/Adamdad/kat, katransformer.py.

from urllib.request import urlopen
from PIL import Image
import timm
import torch
import katransformer

img = Image.open(urlopen(
    'https://huggingface.co./datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))

# Move model to CUDA
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = timm.create_model("hf_hub:adamdad/kat_base_patch16_224.vitft", pretrained=True)
model = model.to(device)
model = model.eval()



# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)

output = model(transforms(img).unsqueeze(0).to(device))  # unsqueeze single image into batch of 1

top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
print(top5_probabilities)
print(top5_class_indices)

Bibtex

@misc{yang2024compositional,
    title={Kolmogorov–Arnold Transformer},
    author={Xingyi Yang and Xinchao Wang},
    year={2024},
    eprint={XXXX},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
Downloads last month
10
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.