Nguyen Van Anh Tuan

tuanio

https://tuanio.github.io/

AI & ML interests

Natural Language Processing and Speech Processing

Recent Activity

liked a dataset about 15 hours ago

amazon-agi/SIFT-50M

liked a model 3 days ago

microsoft/bitnet-b1.58-2B-4T

liked a dataset 4 days ago

google/xtreme_s

tuanio's activity

upvoted a collection 7 days ago

Vietnamese speech dataset

for speech-related tasks: speech-to-text & text-to-speech • 26 items • Updated 1 day ago • 21

upvoted a paper 7 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 365

upvoted 2 collections 8 days ago

VoxPopuli v2

A collection of checkpoints from the second VoxPopuli release. • 35 items • Updated Jan 16, 2024 • 6

Speech-to-Text Translation

5 items • Updated Sep 27, 2024 • 1

upvoted a paper 14 days ago

Whisper-LM: Improving ASR Models with Language Models for Low-Resource Languages

Paper • 2503.23542 • Published 19 days ago • 10

upvoted a collection 22 days ago

Qwen2.5-Omni

End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 3 items • Updated 23 days ago • 89

upvoted a collection about 2 months ago

Model Merging

Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 236

upvoted 2 collections 5 months ago

WhisperLah

A collection of Whisper-variants for Singapore languages, e.g. English, Mandarin, Bahasa Malaysia, Tamil • 3 items • Updated Nov 27, 2024 • 1

Whisper pruned

Pruned / trimmed versions of whisper models with unnecessary languages removed. • 5 items • Updated Jan 30 • 1

upvoted a collection 6 months ago

Speech-to-Text dataset

Malay and Singlish Speech-to-Text dataset, semisupervised from different models and services. • 19 items • Updated Dec 23, 2024 • 1

upvoted 2 collections 7 months ago

distil-large-v3

This collection contains the model repositories for distil-large-v3, which provides support for the most popular Whisper libraries. • 4 items • Updated Mar 21, 2024 • 6

Qwen2

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 361

upvoted a collection 8 months ago

MaLLaM 🌙

Pretrained from scratch with a 4096 context length on 90B tokens of Malaysian text, https://huggingface.co/papers/2401.14680 • 10 items • Updated Dec 23, 2024 • 14

upvoted a collection 9 months ago

MoE-LLaVA Model

9 items • Updated Feb 2, 2024 • 10

upvoted a collection 10 months ago

VinaLLaMA

Second Generation, Most Powerful Open-Source Vietnamese LLMs. • 8 items • Updated Feb 9, 2024 • 12

upvoted an article 12 months ago

The Annotated Diffusion Model

Article • Jun 7, 2022 • 192