- Collection: Multilingual LLM Evaluation (6 items). Multilingual evaluation benchmarks.
- Collection: SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite (3 items). A community-driven project to centralize and standardize AI resources for Southeast Asian languages, cultures, and regions.
- Paper: TÜLU 3: Pushing Frontiers in Open Language Model Post-Training (arXiv:2411.15124, published Nov 22, 2024).
- Collection: Tulu 3 Datasets (32 items). All datasets released with Tulu 3, a set of state-of-the-art open post-training recipes.
- Paper: Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback (arXiv:2410.19133, published Oct 24, 2024).
- Collection: Multilingual RewardBench (2 items). Multilingual reward model evaluation dataset and results.
- Paper: M-RewardBench: Evaluating Reward Models in Multilingual Settings (arXiv:2410.15522, published Oct 20, 2024).
- Paper: SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages (arXiv:2407.19672, published Jul 29, 2024).
- Paper: Consent in Crisis: The Rapid Decline of the AI Data Commons (arXiv:2407.14933, published Jul 20, 2024).
- Collection: Reward Bench (5 items). Datasets, Spaces, and models for the reward model benchmark.
- Paper: Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing (arXiv:2406.08464, published Jun 12, 2024).
- Paper: SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages (arXiv:2406.10118, published Jun 14, 2024).
- Paper: Universal NER: A Gold-Standard Multilingual Named Entity Recognition Benchmark (arXiv:2311.09122, published Nov 15, 2023).
- Paper: BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (arXiv:2211.05100, published Nov 9, 2022).