Post
2024
Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning
Paper: Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning (2404.03323)
The authors propose a novel architecture and method for explainable classification with Concept Bottleneck Models (CBMs): they introduce a new type of layer, the Concept Bottleneck Layer (CBL), and present three methods for training it: with an $\ell_1$-loss, a contrastive loss, and a loss function based on the Gumbel-Softmax distribution (Sparse-CBM), while the final FC layer is still trained with Cross-Entropy. They show a significant increase in accuracy from using sparse hidden layers in CLIP-based bottleneck models, which means that a sparse representation of the concept activation vector is meaningful in Concept Bottleneck Models.
Key concepts:
– Contrastive Gumbel-Softmax loss: the first contrastive variant of the Gumbel-Softmax objective, which achieves a sparse inner representation of the Concept Bottleneck Layer activations.
– Sparse $\ell_1$ regularization.
– Contrastive loss for inner layers of the model.
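To make the Gumbel-Softmax machinery behind the Sparse-CBM objective concrete, here is a minimal sampling sketch in plain Python. This is an illustration of the standard Gumbel-Softmax trick, not the paper's implementation; the function name and signature are my own.

```python
import math
import random

def gumbel_softmax(logits, tau=1.0, rng=random.random):
    """Sample a relaxed one-hot vector from `logits` via the Gumbel-Softmax trick.

    Lower temperature `tau` pushes the output closer to a discrete one-hot,
    which is what encourages sparse concept activations.
    """
    # Sample Gumbel(0, 1) noise: g = -log(-log(U)), U ~ Uniform(0, 1)
    gumbels = [-math.log(-math.log(rng())) for _ in logits]
    # Perturb the logits with the noise and scale by the temperature
    scaled = [(l + g) / tau for l, g in zip(logits, gumbels)]
    # Numerically stable softmax over the perturbed, scaled logits
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]
```

Because the softmax is differentiable, gradients can flow through the sampled vector during training, unlike with a hard argmax.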
Methodology:
The approach consists of three main steps:
– Create a set of concepts based on the labels of the dataset.
– Supply a multi-modal encoder with CBL.
– Train the CBL with the chosen objective function, and train the classifier head with Cross-Entropy.
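The three steps above can be sketched as a single forward pass: an image embedding is projected into concept space by the CBL, and a final FC head classifies over the concept activations. This is an illustrative sketch only; the variable names and shapes are my assumptions, and the actual model uses a CLIP encoder to produce the embedding.

```python
def cbm_forward(image_embedding, concept_weights, head_weights):
    """Toy forward pass of a Concept Bottleneck Model.

    image_embedding : list[float], output of a (multi-modal) encoder
    concept_weights : list of rows, one per concept (the CBL)
    head_weights    : list of rows, one per class (the FC classifier head)
    """
    # Concept Bottleneck Layer: project the embedding onto each concept
    concept_scores = [
        sum(w * x for w, x in zip(row, image_embedding))
        for row in concept_weights
    ]
    # Final FC head: class logits computed from concept activations only
    logits = [
        sum(w * c for w, c in zip(row, concept_scores))
        for row in head_weights
    ]
    return concept_scores, logits
```

The interpretability comes from the bottleneck: the head sees only `concept_scores`, so each prediction can be traced back to named concepts.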
Results and Analysis:
The methodology can be applied to interpretable image classification, and the experimental results show the superiority of using sparse hidden representations of concepts.