Marcus Gawronsky

marcusinthesky

AI & ML interests

Representation Learning

Recent Activity

liked a model 2 days ago

BUAADreamer/SPN4CIR

liked a model 3 days ago

answerdotai/ModernBERT-base

liked a model 3 days ago

OpenGVLab/InternVL2_5-1B-MPO

marcusinthesky's activity

upvoted a paper 19 days ago

NVILA: Efficient Frontier Visual Language Models

Paper • 2412.04468 • Published 20 days ago • 54

upvoted a paper 29 days ago

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published 30 days ago • 47

upvoted a paper about 1 month ago

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13 • 43

upvoted 2 papers about 2 months ago

Zipfian Whitening

Paper • 2411.00680 • Published Nov 1 • 9

Continuous Risk Factor Models: Analyzing Asset Correlations through Energy Distance

Paper • 2410.23447 • Published Oct 30 • 1

upvoted 2 papers 2 months ago

γ-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

Paper • 2410.13859 • Published Oct 17 • 7

What Matters in Transformers? Not All Attention is Needed

Paper • 2406.15786 • Published Jun 22 • 29

upvoted an article 2 months ago

Model2Vec: Distill a Small Fast Model from any Sentence Transformer

Article • Oct 14 • 61

upvoted a paper 2 months ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7 • 168

upvoted an article 3 months ago

Introducing the Open FinLLM Leaderboard

Article • Oct 4 • 66

upvoted 6 papers 3 months ago

LEOPARD : A Vision Language Model For Text-Rich Multi-Image Tasks

Paper • 2410.01744 • Published Oct 2 • 26

PHI-S: Distribution Balancing for Label-Free Multi-Teacher Distillation

Paper • 2410.01680 • Published Oct 2 • 32

CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling

Paper • 2409.19291 • Published Sep 28 • 19

upvoted 4 papers 4 months ago

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3 • 77

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20 • 52

Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time

Paper • 2408.13233 • Published Aug 23 • 21

Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22 • 25