Sketch-126-DomainNet

Sketch-126-DomainNet is an image classification model fine-tuned from the vision-language encoder google/siglip2-base-patch16-224 for single-label classification. It classifies sketches into 126 categories using the SiglipForImageClassification architecture.

Moment Matching for Multi-Source Domain Adaptation: https://arxiv.org/pdf/1812.01754

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features: https://arxiv.org/pdf/2502.14786

Classification Report:
                         precision    recall  f1-score   support

       aircraft_carrier     1.0000    0.2200    0.3607        50
            alarm_clock     0.9873    0.9568    0.9718       162
                    ant     0.9432    0.9326    0.9379        89
                  anvil     0.2727    0.0423    0.0732        71
              asparagus     0.9673    0.8916    0.9279       166
                    axe     0.8034    0.8773    0.8387       163
                 banana     0.9744    0.9383    0.9560       162
                 basket     0.7160    0.7682    0.7412       151
                bathtub     0.8073    0.9281    0.8635       167
                   bear     0.8636    0.6690    0.7540       142
                    bee     0.9196    0.8957    0.9075       115
                   bird     0.9094    0.9429    0.9259       245
             blackberry     1.0000    0.1250    0.2222        48
              blueberry     0.6744    0.8529    0.7532       102
              bottlecap     0.7468    0.5315    0.6211       111
               broccoli     0.7727    0.9444    0.8500       144
                    bus     0.9302    0.8989    0.9143       178
              butterfly     0.9594    0.9497    0.9545       199
                 cactus     1.0000    0.6735    0.8049        49
                   cake     0.0000    0.0000    0.0000        54
             calculator     0.9298    0.9636    0.9464        55
                  camel     0.9208    0.8942    0.9073       104
                 camera     0.9200    0.7931    0.8519        87
                 candle     0.9556    0.6935    0.8037        62
                 cannon     0.7500    0.2027    0.3191        74
                  canoe     0.8000    0.5825    0.6742       103
                 carrot     0.0000    0.0000    0.0000        27
                 castle     0.9583    0.5111    0.6667        45
                    cat     0.8961    0.6635    0.7624       104
            ceiling_fan     0.0000    0.0000    0.0000        20
             cell_phone     0.0000    0.0000    0.0000        18
                  cello     0.9600    0.4706    0.6316        51
                  chair     0.8043    0.4805    0.6016        77
             chandelier     0.0000    0.0000    0.0000        27
             coffee_cup     0.0000    0.0000    0.0000        26
                compass     0.0000    0.0000    0.0000        10
               computer     0.2500    0.0435    0.0741        23
                    cow     0.0000    0.0000    0.0000        14
                   crab     0.9123    0.8525    0.8814       122
              crocodile     0.9280    0.8992    0.9134       129
            cruise_ship     0.7467    0.9032    0.8175       124
                    dog     0.8533    0.8911    0.8718       248
                dolphin     0.9091    0.8824    0.8955        68
                 dragon     0.7914    0.8269    0.8088       156
                  drums     0.9259    0.8772    0.9009       171
                   duck     0.8409    0.8409    0.8409       220
               dumbbell     0.9507    0.9184    0.9343       147
               elephant     0.9630    0.9765    0.9697       213
             eyeglasses     0.8155    0.7919    0.8035       173
                feather     0.9344    0.9344    0.9344       244
                  fence     0.8796    0.8482    0.8636       112
                   fish     0.9527    0.9495    0.9511       297
               flamingo     0.9818    0.9474    0.9643       114
                 flower     0.8267    0.9219    0.8717       269
                   foot     0.7743    0.8578    0.8140       204
                   fork     0.9366    0.9433    0.9399       141
                   frog     0.9620    0.9383    0.9500       162
                giraffe     0.9655    0.9396    0.9524       149
                 goatee     0.7914    0.8897    0.8377       145
                 grapes     0.9132    0.9609    0.9364       230
                 guitar     0.8462    0.9862    0.9108       145
                 hammer     0.8333    0.4386    0.5747        57
             helicopter     0.9441    0.9620    0.9530       158
                 helmet     0.8509    0.8204    0.8354       167
                  horse     0.9091    0.9877    0.9467        81
               kangaroo     0.9592    0.9691    0.9641        97
                lantern     0.0000    0.0000    0.0000        30
                 laptop     0.8273    0.9200    0.8712       250
                   leaf     0.8449    0.8870    0.8655       301
                   lion     0.9697    0.9734    0.9715       263
               lipstick     0.9634    0.8977    0.9294        88
                lobster     0.9265    0.9130    0.9197       138
             microphone     0.8917    0.8770    0.8843       122
                 monkey     0.9297    0.8947    0.9119       133
               mosquito     0.9052    0.9211    0.9130       114
                  mouse     0.8632    0.8039    0.8325       102
                    mug     0.6928    0.7737    0.7310       137
               mushroom     0.8174    0.8861    0.8504       202
                  onion     0.9538    0.9841    0.9688       126
                  panda     0.9643    0.8710    0.9153        62
                 peanut     0.8302    0.8462    0.8381       104
                   pear     0.7966    0.9658    0.8731       146
                   peas     0.6667    0.8438    0.7448        64
                 pencil     0.0000    0.0000    0.0000        21
                penguin     0.9586    0.9701    0.9643       167
                    pig     0.8983    0.8785    0.8883       181
                 pillow     0.9570    0.9674    0.9622        92
              pineapple     0.9808    0.9714    0.9761       105
                 potato     0.9444    0.5231    0.6733        65
           power_outlet     0.5556    0.0676    0.1205        74
                  purse     0.9220    0.7182    0.8075       181
                 rabbit     0.9697    0.8767    0.9209        73
                raccoon     0.7850    0.9097    0.8428       277
             rhinoceros     0.9863    0.9863    0.9863       146
                  rifle     0.9143    0.9796    0.9458        98
              saxophone     0.9381    0.8618    0.8983       246
            screwdriver     0.7709    0.8706    0.8177       286
             sea_turtle     0.9698    0.9507    0.9602       203
                see_saw     0.3296    0.5738    0.4187       413
                  sheep     0.9254    0.9153    0.9203       366
                   shoe     0.9395    0.9688    0.9539       513
             skateboard     0.7365    0.7831    0.7591       332
                  snake     0.8005    0.8737    0.8355       372
              speedboat     0.8388    0.8833    0.8605       377
                 spider     0.7954    0.8696    0.8309       514
               squirrel     0.8511    0.8484    0.8498       310
             strawberry     0.8313    0.8471    0.8391       157
            streetlight     0.7944    0.8134    0.8038       209
            string_bean     0.7143    0.3000    0.4225        50
              submarine     0.5916    0.6975    0.6402       162
                   swan     0.8966    0.8387    0.8667       186
                  table     0.6705    0.7522    0.7090       230
                 teapot     0.8464    0.8968    0.8709       252
             teddy-bear     0.6818    0.8385    0.7521       161
             television     0.8974    0.7071    0.7910        99
       the_Eiffel_Tower     0.9860    0.9679    0.9769       218
the_Great_Wall_of_China     0.6389    0.8440    0.7273       109
                  tiger     0.9417    0.9604    0.9510       303
                    toe     0.0000    0.0000    0.0000        53
                  train     0.8650    0.9010    0.8827       192
                  truck     0.8136    0.9372    0.8710       191
               umbrella     0.8650    0.8913    0.8779       230
                   vase     0.8082    0.8082    0.8082       146
             watermelon     0.8947    0.8333    0.8629       102
                  whale     0.8910    0.8744    0.8826       215
                  zebra     0.9817    0.9727    0.9772       220

               accuracy                         0.8440     19317
              macro avg     0.7818    0.7419    0.7475     19317
           weighted avg     0.8404    0.8440    0.8352     19317
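
The f1-score column is the harmonic mean of precision and recall, which is why a class such as aircraft_carrier scores only 0.3607 despite perfect precision. The following is a minimal sketch of that arithmetic using rows copied from the report above. The helper name `f1_score` is chosen here for illustration, and treating a zero denominator as 0.0 is an assumption that matches the 0.0000 rows.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs copied from the classification report
rows = {
    "aircraft_carrier": (1.0000, 0.2200),
    "anvil": (0.2727, 0.0423),
    "cake": (0.0000, 0.0000),
}

for name, (p, r) in rows.items():
    print(f"{name:>18}  f1={f1_score(p, r):.4f}")
```

The macro average weights every class equally, so the classes reported at 0.0000 pull the macro F1 (0.7475) below the support-weighted F1 (0.8352).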

The model categorizes images into the following 126 classes:

Class 0: "aircraft_carrier"
Class 1: "alarm_clock"
Class 2: "ant"
Class 3: "anvil"
Class 4: "asparagus"
Class 5: "axe"
Class 6: "banana"
Class 7: "basket"
Class 8: "bathtub"
Class 9: "bear"
Class 10: "bee"
Class 11: "bird"
Class 12: "blackberry"
Class 13: "blueberry"
Class 14: "bottlecap"
Class 15: "broccoli"
Class 16: "bus"
Class 17: "butterfly"
Class 18: "cactus"
Class 19: "cake"
Class 20: "calculator"
Class 21: "camel"
Class 22: "camera"
Class 23: "candle"
Class 24: "cannon"
Class 25: "canoe"
Class 26: "carrot"
Class 27: "castle"
Class 28: "cat"
Class 29: "ceiling_fan"
Class 30: "cell_phone"
Class 31: "cello"
Class 32: "chair"
Class 33: "chandelier"
Class 34: "coffee_cup"
Class 35: "compass"
Class 36: "computer"
Class 37: "cow"
Class 38: "crab"
Class 39: "crocodile"
Class 40: "cruise_ship"
Class 41: "dog"
Class 42: "dolphin"
Class 43: "dragon"
Class 44: "drums"
Class 45: "duck"
Class 46: "dumbbell"
Class 47: "elephant"
Class 48: "eyeglasses"
Class 49: "feather"
Class 50: "fence"
Class 51: "fish"
Class 52: "flamingo"
Class 53: "flower"
Class 54: "foot"
Class 55: "fork"
Class 56: "frog"
Class 57: "giraffe"
Class 58: "goatee"
Class 59: "grapes"
Class 60: "guitar"
Class 61: "hammer"
Class 62: "helicopter"
Class 63: "helmet"
Class 64: "horse"
Class 65: "kangaroo"
Class 66: "lantern"
Class 67: "laptop"
Class 68: "leaf"
Class 69: "lion"
Class 70: "lipstick"
Class 71: "lobster"
Class 72: "microphone"
Class 73: "monkey"
Class 74: "mosquito"
Class 75: "mouse"
Class 76: "mug"
Class 77: "mushroom"
Class 78: "onion"
Class 79: "panda"
Class 80: "peanut"
Class 81: "pear"
Class 82: "peas"
Class 83: "pencil"
Class 84: "penguin"
Class 85: "pig"
Class 86: "pillow"
Class 87: "pineapple"
Class 88: "potato"
Class 89: "power_outlet"
Class 90: "purse"
Class 91: "rabbit"
Class 92: "raccoon"
Class 93: "rhinoceros"
Class 94: "rifle"
Class 95: "saxophone"
Class 96: "screwdriver"
Class 97: "sea_turtle"
Class 98: "see_saw"
Class 99: "sheep"
Class 100: "shoe"
Class 101: "skateboard"
Class 102: "snake"
Class 103: "speedboat"
Class 104: "spider"
Class 105: "squirrel"
Class 106: "strawberry"
Class 107: "streetlight"
Class 108: "string_bean"
Class 109: "submarine"
Class 110: "swan"
Class 111: "table"
Class 112: "teapot"
Class 113: "teddy-bear"
Class 114: "television"
Class 115: "the_Eiffel_Tower"
Class 116: "the_Great_Wall_of_China"
Class 117: "tiger"
Class 118: "toe"
Class 119: "train"
Class 120: "truck"
Class 121: "umbrella"
Class 122: "vase"
Class 123: "watermelon"
Class 124: "whale"
Class 125: "zebra"
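
Because this is a single-label task, the model emits one logit per class, and the predicted class is the index with the highest softmax probability. The following is a minimal pure-Python sketch of that lookup. The three-entry `id2label` excerpt and the logit values are made up for illustration and are not model output.

```python
import math

# Excerpt of the class list above (indices 0-2 only)
id2label = {0: "aircraft_carrier", 1: "alarm_clock", 2: "ant"}

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]

# Hypothetical logits for one image, restricted to the three classes above
label, prob = predict_label([0.5, 2.0, -1.0], id2label)
print(label, round(prob, 3))
```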

Run with Transformers🤗

!pip install -q transformers torch pillow gradio

import gradio as gr
from transformers import AutoImageProcessor, SiglipForImageClassification
from PIL import Image
import torch

# Load model and processor
model_name = "prithivMLmods/Sketch-126-DomainNet"
model = SiglipForImageClassification.from_pretrained(model_name)
processor = AutoImageProcessor.from_pretrained(model_name)

def sketch_classification(image):
    """Predicts the sketch category for an input image."""
    # Convert the input numpy array to a PIL Image and ensure it has 3 channels (RGB)
    image = Image.fromarray(image).convert("RGB")
    
    # Process the image and prepare it for the model
    inputs = processor(images=image, return_tensors="pt")
    
    # Perform inference without gradient calculation
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        # Convert logits to probabilities using softmax
        probs = torch.nn.functional.softmax(logits, dim=1).squeeze().tolist()
    
    # Mapping from indices to corresponding sketch category labels
    labels = {
        "0": "aircraft_carrier", "1": "alarm_clock", "2": "ant", "3": "anvil", "4": "asparagus",
        "5": "axe", "6": "banana", "7": "basket", "8": "bathtub", "9": "bear",
        "10": "bee", "11": "bird", "12": "blackberry", "13": "blueberry", "14": "bottlecap",
        "15": "broccoli", "16": "bus", "17": "butterfly", "18": "cactus", "19": "cake",
        "20": "calculator", "21": "camel", "22": "camera", "23": "candle", "24": "cannon",
        "25": "canoe", "26": "carrot", "27": "castle", "28": "cat", "29": "ceiling_fan",
        "30": "cell_phone", "31": "cello", "32": "chair", "33": "chandelier", "34": "coffee_cup",
        "35": "compass", "36": "computer", "37": "cow", "38": "crab", "39": "crocodile",
        "40": "cruise_ship", "41": "dog", "42": "dolphin", "43": "dragon", "44": "drums",
        "45": "duck", "46": "dumbbell", "47": "elephant", "48": "eyeglasses", "49": "feather",
        "50": "fence", "51": "fish", "52": "flamingo", "53": "flower", "54": "foot",
        "55": "fork", "56": "frog", "57": "giraffe", "58": "goatee", "59": "grapes",
        "60": "guitar", "61": "hammer", "62": "helicopter", "63": "helmet", "64": "horse",
        "65": "kangaroo", "66": "lantern", "67": "laptop", "68": "leaf", "69": "lion",
        "70": "lipstick", "71": "lobster", "72": "microphone", "73": "monkey", "74": "mosquito",
        "75": "mouse", "76": "mug", "77": "mushroom", "78": "onion", "79": "panda",
        "80": "peanut", "81": "pear", "82": "peas", "83": "pencil", "84": "penguin",
        "85": "pig", "86": "pillow", "87": "pineapple", "88": "potato", "89": "power_outlet",
        "90": "purse", "91": "rabbit", "92": "raccoon", "93": "rhinoceros", "94": "rifle",
        "95": "saxophone", "96": "screwdriver", "97": "sea_turtle", "98": "see_saw", "99": "sheep",
        "100": "shoe", "101": "skateboard", "102": "snake", "103": "speedboat", "104": "spider",
        "105": "squirrel", "106": "strawberry", "107": "streetlight", "108": "string_bean",
        "109": "submarine", "110": "swan", "111": "table", "112": "teapot", "113": "teddy-bear",
        "114": "television", "115": "the_Eiffel_Tower", "116": "the_Great_Wall_of_China",
        "117": "tiger", "118": "toe", "119": "train", "120": "truck", "121": "umbrella",
        "122": "vase", "123": "watermelon", "124": "whale", "125": "zebra"
    }
    
    # Create a dictionary mapping each label to its predicted probability (rounded)
    predictions = {labels[str(i)]: round(probs[i], 3) for i in range(len(probs))}
    return predictions

# Create Gradio interface
iface = gr.Interface(
    fn=sketch_classification,
    inputs=gr.Image(type="numpy"),
    outputs=gr.Label(label="Prediction Scores"),
    title="Sketch-126-DomainNet Classification",
    description="Upload a sketch to classify it into one of 126 categories."
)

# Launch the app
if __name__ == "__main__":
    iface.launch()
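
The function above returns a score for every one of the 126 classes, which can be unwieldy outside the Gradio label widget. A small helper along these lines could trim the result to the strongest candidates. `top_k` is a hypothetical name, and the sample dictionary stands in for a real `sketch_classification` result.

```python
def top_k(predictions, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    ranked = sorted(predictions.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]

# Stand-in for the dict returned by sketch_classification (values are illustrative)
sample = {"zebra": 0.912, "horse": 0.051, "tiger": 0.020, "cow": 0.004}

for label, score in top_k(sample, k=3):
    print(f"{label}: {score:.3f}")
```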

Intended Use:

The Sketch-126-DomainNet model is designed for sketch image classification. It categorizes sketches across a wide range of subjects, from objects like an "aircraft_carrier" or "alarm_clock" to animals, plants, and everyday items. Potential use cases include:

Art and Design Applications: Assisting artists and designers in organizing and retrieving sketches based on content.
Creative Search Engines: Enabling sketch-based search for design inspiration.
Educational Tools: Helping students and educators in art and design fields with categorization and retrieval of visual resources.
Computer Vision Research: Providing a baseline model for sketch recognition and domain adaptation experiments.
