wongyukim · kimwongyuda

AI & ML interests

None yet

Recent Activity

upvoted a paper 1 day ago

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

upvoted a paper 1 day ago

IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

upvoted a paper 1 day ago

EgoLife: Towards Egocentric Life Assistant

Organizations

None yet

wongyukim's activity

upvoted 5 papers 1 day ago

Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities

Paper • 2503.03983 • Published 4 days ago • 18

IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval

Paper • 2503.04644 • Published 3 days ago • 19

EgoLife: Towards Egocentric Life Assistant

Paper • 2503.03803 • Published 4 days ago • 31

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published 4 days ago • 65

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 3 days ago • 76

upvoted a paper 3 days ago

ABC: Achieving Better Control of Multimodal Embeddings using VLMs

Paper • 2503.00329 • Published 9 days ago • 18

upvoted 2 papers 4 days ago

Unified Video Action Model

Paper • 2503.00200 • Published 9 days ago • 11

Wikipedia in the Era of LLMs: Evolution and Risks

Paper • 2503.02879 • Published 5 days ago • 19

New activity in intfloat/mmE5-MMEB-hardneg 5 days ago

number of hardnegs

#3 opened 6 days ago by

upvoted 4 papers 5 days ago

Large-Scale Data Selection for Instruction Tuning

Paper • 2503.01807 • Published 6 days ago • 10

SoS1: O1 and R1-Like Reasoning LLMs are Sum-of-Square Solvers

Paper • 2502.20545 • Published 10 days ago • 20

Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published 12 days ago • 44

OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment

Paper • 2502.18965 • Published 12 days ago • 21

upvoted 4 papers 6 days ago

Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions

Paper • 2503.00501 • Published 9 days ago • 11

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published 6 days ago • 65

Visual-RFT: Visual Reinforcement Fine-Tuning

Paper • 2503.01785 • Published 6 days ago • 59

ViDoRAG: Visual Document Retrieval-Augmented Generation via Dynamic Iterative Reasoning Agents

Paper • 2502.18017 • Published 13 days ago • 18

upvoted 3 papers 9 days ago

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

Paper • 2502.20172 • Published 10 days ago • 26

NeoBERT: A Next-Generation BERT

Paper • 2502.19587 • Published 11 days ago • 38

R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts

Paper • 2502.20395 • Published 10 days ago • 43