Solomatin Roman

Samoed

AI & ML interests

None yet

Recent Activity

liked a Space 3 days ago

huggingface/open-source-ai-year-in-review-2024

new activity 4 days ago

zeta-alpha-ai/Zeta-Alpha-E5-Mistral:Can't load model

Samoed's activity

upvoted a paper 4 days ago

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Paper • 1908.10084 • Published Aug 27, 2019 • 5

upvoted a paper 24 days ago

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published 26 days ago • 121

upvoted 2 collections about 2 months ago

Hymba

A series of Hybrid Small Language Models. • 2 items • Updated 2 days ago • 25

Tulu 3 Models

All models released with Tulu 3 -- state of the art open post-training recipes. • 7 items • Updated 7 days ago • 33

upvoted a paper about 2 months ago

Cut Your Losses in Large-Vocabulary Language Models

Paper • 2411.09009 • Published Nov 13, 2024 • 43

upvoted 3 papers 3 months ago

On the Power of Decision Trees in Auto-Regressive Language Modeling

Paper • 2409.19150 • Published Sep 27, 2024 • 4

AutoTrain: No-code training for state-of-the-art models

Paper • 2410.15735 • Published Oct 21, 2024 • 59

Instruction Following without Instruction Tuning

Paper • 2409.14254 • Published Sep 21, 2024 • 28

upvoted an article 4 months ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • Sep 18, 2024 • 216

upvoted 2 papers 4 months ago

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published Sep 10, 2024 • 64

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3, 2024 • 78

upvoted 4 papers 5 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 124

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 57

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 41

The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design

Paper • 2408.12503 • Published Aug 22, 2024 • 23

upvoted 5 papers 6 months ago

ShieldGemma: Generative AI Content Moderation Based on Gemma

Paper • 2407.21772 • Published Jul 31, 2024 • 14

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 110

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Paper • 2407.16741 • Published Jul 23, 2024 • 69

LAMBDA: A Large Model Based Data Agent

Paper • 2407.17535 • Published Jul 24, 2024 • 35

Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle

Paper • 2407.13833 • Published Jul 18, 2024 • 12