dame rajee

damerajee

AI & ML interests

None yet

Organizations

damerajee's activity

upvoted a paper about 24 hours ago

TRecViT: A Recurrent Video Transformer

Paper • 2412.14294 • Published 7 days ago • 10

upvoted a paper 1 day ago

MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design

Paper • 2412.14590 • Published 7 days ago • 11

liked a model 1 day ago

answerdotai/ModernBERT-base

Fill-Mask • Updated about 3 hours ago • 27.6k • 450

upvoted an article 1 day ago

Deriving DPO's Loss

Article • 2 days ago • 16

New activity in cointegrated/SONAR_200_text_encoder 3 days ago

can you please do the same for decoder

#2 opened 3 days ago by damerajee

liked a Space 4 days ago

Running

🐨

Video Llava

liked a Space 6 days ago

Running

362

🚀

Llama-Vision-11B

liked a model 6 days ago

facebook/SONAR

Updated Feb 14 • 35

liked a dataset 7 days ago

RLHFlow/Deepseek-PRM-Data

Viewer • Updated Nov 9 • 253k • 47 • 4

upvoted a paper 7 days ago

Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents

Paper • 2412.13194 • Published 8 days ago • 12

upvoted 2 papers 8 days ago

Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6 • 51

Smaller Language Models Are Better Instruction Evolvers

Paper • 2412.11231 • Published 11 days ago • 25

liked a dataset 9 days ago

HuggingFaceH4/MATH-500

Viewer • Updated Nov 15 • 500 • 4.51k • 24

liked a Space 9 days ago

Running

376

📝

Scaling test-time compute

liked a model 9 days ago

RLHFlow/Llama3.1-8B-PRM-Deepseek-Data

Text Generation • Updated Nov 9 • 5.95k • 24

upvoted a paper 9 days ago

Solving math word problems with process- and outcome-based feedback

Paper • 2211.14275 • Published Nov 25, 2022 • 7

upvoted 2 papers 10 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 12 days ago • 131

JuStRank: Benchmarking LLM Judges for System Ranking

Paper • 2412.09569 • Published 13 days ago • 19

liked a model 12 days ago

unsloth/Llama-3.3-70B-Instruct

Text Generation • Updated 19 days ago • 299k • 31

liked a Space 14 days ago

Running on Zero

117

💻

Lumina Next T2I