Chi Zhang

DrChiZhang

https://icoz69.github.io/

dr_chizhang

AI & ML interests

Computer vision, large language models, generative models

DrChiZhang's activity

commented on a paper 5 days ago

AppAgentX: Evolving GUI Agents as Proficient Smartphone Users

Paper • 2503.02268 • Published 6 days ago • 8

commented on a paper 11 days ago

Distill Any Depth: Distillation Creates a Stronger Monocular Depth Estimator

Paper • 2502.19204 • Published 11 days ago • 11

upvoted a paper 3 months ago

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Paper • 2412.08503 • Published Dec 11, 2024 • 8

liked a Space 3 months ago

StyleStudio

👀

Generate styled images using text prompts

commented on a paper 3 months ago

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Paper • 2412.08503 • Published Dec 11, 2024 • 8

authored 9 papers 7 months ago

GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting

Paper • 2311.14521 • Published Nov 24, 2023

LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning

Paper • 2311.18651 • Published Nov 30, 2023

IT3D: Improved Text-to-3D Generation with Explicit View Synthesis

Paper • 2308.11473 • Published Aug 22, 2023

MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies

Paper • 2403.01422 • Published Mar 3, 2024 • 28

Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation

Paper • 2404.15506 • Published Mar 22, 2024

MeshXL: Neural Coordinate Field for Generative 3D Foundation Models

Paper • 2405.20853 • Published May 31, 2024

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Paper • 2406.09162 • Published Jun 13, 2024 • 14

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Paper • 2406.10163 • Published Jun 14, 2024 • 33

MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization

Paper • 2408.02555 • Published Aug 5, 2024 • 30

upvoted a paper 7 months ago

MeshAnything V2: Artist-Created Mesh Generation With Adjacent Mesh Tokenization

Paper • 2408.02555 • Published Aug 5, 2024 • 30

upvoted a paper 9 months ago

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Paper • 2406.10163 • Published Jun 14, 2024 • 33

commented on 2 papers 9 months ago

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Paper • 2406.09162 • Published Jun 13, 2024 • 14

EMMA: Your Text-to-Image Diffusion Model Can Secretly Accept Multi-Modal Prompts

Paper • 2406.09162 • Published Jun 13, 2024 • 14

liked a Space about 1 year ago

507

The Tokenizer Playground

📝

Experiment with and compare different tokenizers

authored a paper about 1 year ago

AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 54