Charles Cai

charlescai2016

AI & ML interests

None yet

Recent Activity

liked a model 2 days ago

unsloth/QwQ-32B-GGUF

Organizations

charlescai2016's activity

upvoted a collection 4 days ago

Model Merging

Collection

Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 233

upvoted an article 4 days ago

Article

Merge Large Language Models with mergekit

Jan 9, 2024 • 101

upvoted an article 14 days ago

Article

Multivariate Probabilistic Time Series Forecasting with Informer

Mar 10, 2023 • 17

upvoted a collection 17 days ago

SmolLM2

Collection

State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 17 days ago • 245

upvoted an article about 1 month ago

Article

Introducing the SQL Console on Datasets

Sep 17, 2024 • 22

upvoted a paper about 1 month ago

Reward-Guided Speculative Decoding for Efficient LLM Reasoning

Paper • 2501.19324 • Published Jan 31 • 38

upvoted a collection about 1 month ago

Reasoning Datasets

Collection

Distilled synthetic Reasoning datasets • 7 items • Updated Feb 2 • 55

upvoted an article about 2 months ago

Article

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

Jan 20 • 63

upvoted a paper 3 months ago

VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation

Paper • 2412.10704 • Published Dec 14, 2024 • 15

upvoted an article 3 months ago

Article

Uncensor any LLM with abliteration

Jun 13, 2024 • 456

upvoted 2 papers 4 months ago

LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning

Paper • 2410.02884 • Published Oct 3, 2024 • 54

BOND: Aligning LLMs with Best-of-N Distillation

Paper • 2407.14622 • Published Jul 19, 2024 • 19

upvoted an article 4 months ago

Article

How to directly access 150k+ Hugging Face Datasets with DuckDB and query using GPT-4o

May 31, 2024 • 11

upvoted a paper 5 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 108

upvoted an article 6 months ago

Article

Preference Optimization for Vision Language Models

Jul 10, 2024 • 60

upvoted a paper 6 months ago

Let's Verify Step by Step

Paper • 2305.20050 • Published May 31, 2023 • 10

upvoted a paper 7 months ago

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Paper • 2408.06195 • Published Aug 12, 2024 • 70

upvoted an article 7 months ago

Article

Memory-efficient Diffusion Transformers with Quanto and Diffusers

Jul 30, 2024 • 64

upvoted 2 papers 7 months ago

GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS

Paper • 2408.01584 • Published Aug 2, 2024 • 10

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining

Paper • 2408.02657 • Published Aug 5, 2024 • 34