Junyang Lin's picture

Junyang Lin

JustinLin610

·

https://justinlin610.github.io

AI & ML interests

Pretraining, NLP, CV, etc.

Recent Activity

authored a paper 3 days ago

START: Self-taught Reasoner with Tools

new activity 3 days ago

Qwen/QwQ-32B:复杂推理进入死循环

liked a model 4 days ago

Qwen/QwQ-32B-GGUF

View all activity

Organizations

JustinLin610's activity

upvoted a collection 12 days ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 12 days ago • 105

upvoted a paper about 2 months ago

The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13 • 92

upvoted a paper 2 months ago

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 50

upvoted 3 papers 4 months ago

DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion

Paper • 2411.04928 • Published Nov 7, 2024 • 51

BitNet a4.8: 4-bit Activations for 1-bit LLMs

Paper • 2411.04965 • Published Nov 7, 2024 • 66

OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models

Paper • 2411.04905 • Published Nov 7, 2024 • 115

upvoted 2 collections 6 months ago

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated 12 days ago • 552

upvoted a collection 9 months ago

Qwen2

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Nov 28, 2024 • 359

upvoted 2 papers 12 months ago

Synth^2: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings

Paper • 2403.07750 • Published Mar 12, 2024 • 23

MoAI: Mixture of All Intelligence for Large Language and Vision Models

Paper • 2403.07508 • Published Mar 12, 2024 • 75

upvoted a paper about 1 year ago

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

Paper • 2402.17764 • Published Feb 27, 2024 • 610

upvoted 2 collections about 1 year ago

Qwen-1.5-Exl2

18 items • Updated Dec 4, 2024 • 2

Qwen1.5

Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Nov 28, 2024 • 208

upvoted a paper about 1 year ago

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Paper • 2311.03099 • Published Nov 6, 2023 • 29

upvoted 5 papers over 1 year ago

Qwen Technical Report

Paper • 2309.16609 • Published Sep 28, 2023 • 35

Scaling Relationship on Learning Mathematical Reasoning with Large Language Models

Paper • 2308.01825 • Published Aug 3, 2023 • 21

YaRN: Efficient Context Window Extension of Large Language Models

Paper • 2309.00071 • Published Aug 31, 2023 • 68

Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese

Paper • 2211.01335 • Published Nov 2, 2022 • 1

Self-consistency for open-ended generations

Paper • 2307.06857 • Published Jul 11, 2023 • 10