Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs
Paper: arXiv:2411.08719
We present a set of multilingual red-teamed models, trained on the LUMI HPC in Finland (hence the name Aurora). See our paper: https://arxiv.org/abs/2404.00399