cctuan (George Duan)

liked a Space 17 days ago

Running

506

😻

Open Source Ai Year In Review 2024

What happened in open-source AI this year, and what’s next?

liked a Space 27 days ago

Runtime error

449

🧪

FLUX LoRa Lab

cctuan/gys1217

Text-to-Image • Updated 27 days ago • 66 •

updated a model 29 days ago

cctuan/gys25

Text-to-Image • Updated 29 days ago • 20 •

liked a Space about 1 month ago

Running on Zero

246

😻

MaskGCT TTS Demo

reacted to m-ric's post with ❤️ about 1 month ago

Post

2381

Single most important thing to do today: 𝗴𝗼 𝘁𝗿𝘆 𝗤𝘄𝗤 𝗼𝗻 𝗛𝘂𝗴𝗴𝗶𝗻𝗴 𝗖𝗵𝗮𝘁!

👉 https://huggingface.co/chat/models/Qwen/QwQ-32B-Preview

2 replies


reacted to davanstrien's post with ❤️ about 1 month ago

Post

2485

First dataset for the new Hugging Face Bluesky community organisation: bluesky-community/one-million-bluesky-posts 🦋

📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect to experiment with using ML for Bluesky 🤗

Excited to see people build more open tools for a more open social media platform!

reacted to maxiw's post with 👍 about 2 months ago

Post

2071

You can now try out computer use models from the hub to automate your local machine with https://github.com/askui/vision-agent. 💻

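import time
from askui import VisionAgent

# Each click passes model_name to pick the Hub model that locates the described element.
with VisionAgent() as agent:
    # Open Google in the browser and give the page a moment to load.
    agent.tools.webbrowser.open_new("http://www.google.com")
    time.sleep(0.5)
    # Find the search field from a natural-language description, then search for "cats".
    agent.click("search field in the center of the screen", model_name="Qwen/Qwen2-VL-7B-Instruct")
    agent.type("cats")
    agent.keyboard("enter")
    time.sleep(0.5)
    # Switch to the Images tab by matching its visible text.
    agent.click("text 'Images'", model_name="AskUI/PTA-1")
    time.sleep(0.5)
    # Click the second image result.
    agent.click("second cat image", model_name="OS-Copilot/OS-Atlas-Base-7B")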

Currently these models are integrated with Gradio Spaces API. Also planning to add local inference soon!

Currently supported:
- Qwen/Qwen2-VL-7B-Instruct
- Qwen/Qwen2-VL-2B-Instruct
- AskUI/PTA-1
- OS-Copilot/OS-Atlas-Base-7B

3 replies


liked 2 Spaces 2 months ago

Running on Zero

205

🪄

ACE-Chat

(Tongyi Lab) ACE: All-round Creator and Editor

Runtime error

6

🔥

JoyType

liked a Space 3 months ago

Running on L40S

326

🐠

TANGO

Co-Speech Gesture Video Generation

reacted to singhsidhukuldeep's post with 👀 3 months ago

Post

2163

While Google's Transformer might have introduced "Attention is all you need," Microsoft and Tsinghua University are here with the DIFF Transformer, stating, "Sparse-Attention is all you need."

The DIFF Transformer outperforms traditional Transformers in scaling properties, requiring only about 65% of the model size or training tokens to achieve comparable performance.

The secret sauce? A differential attention mechanism that amplifies focus on relevant context while canceling out noise, leading to sparser and more effective attention patterns.

How?
- It uses two separate softmax attention maps and subtracts them.
- It employs a learnable scalar λ for balancing the attention maps.
- It implements GroupNorm for each attention head independently.
- It is compatible with FlashAttention for efficient computation.
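
The steps above can be sketched for a single head in NumPy. This is a minimal illustration, not the authors' implementation: the function names, shapes, and random weights are made up for the example, λ is passed in as a fixed number rather than learned, and the per-head normalization is a simplified RMS-style stand-in without learnable parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def diff_attention_head(x, w_q, w_k, w_v, lam):
    """One differential-attention head (hypothetical helper).

    x: (n, d_model); w_q, w_k: (d_model, 2*d); w_v: (d_model, d_v).
    lam: scalar weight on the second attention map (learnable in the
    real model, a plain float here).
    """
    # Split the query/key projections into two groups.
    q1, q2 = np.split(x @ w_q, 2, axis=-1)
    k1, k2 = np.split(x @ w_k, 2, axis=-1)
    v = x @ w_v
    d = q1.shape[-1]
    # Two separate softmax attention maps...
    a1 = softmax(q1 @ k1.T / np.sqrt(d))
    a2 = softmax(q2 @ k2.T / np.sqrt(d))
    # ...subtracted, so attention mass common to both maps cancels out.
    diff_map = a1 - lam * a2
    return diff_map @ v, diff_map

def per_head_norm(h, eps=1e-6):
    # Simplified per-head normalization (assumption: RMS-style, no gain).
    return h / np.sqrt((h ** 2).mean(axis=-1, keepdims=True) + eps)

rng = np.random.default_rng(0)
n, d_model, d, d_v = 4, 8, 3, 6
x = rng.normal(size=(n, d_model))
w_q = rng.normal(size=(d_model, 2 * d))
w_k = rng.normal(size=(d_model, 2 * d))
w_v = rng.normal(size=(d_model, d_v))

lam = 0.8
out, diff_map = diff_attention_head(x, w_q, w_k, w_v, lam)
out = per_head_norm(out)
print(out.shape)
```

Because each softmax row sums to 1, every row of the combined map sums to 1 - λ, and individual entries can be negative. That is why a normalization step after the head is useful in this design.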

What do you get?
- Superior long-context modeling (up to 64K tokens).
- Enhanced key information retrieval.
- Reduced hallucination in question-answering and summarization tasks.
- More robust in-context learning, less affected by prompt order.
- Mitigation of activation outliers, opening doors for efficient quantization.

Extensive experiments show DIFF Transformer's advantages across various tasks and model sizes, from 830M to 13.1B parameters.

This innovative architecture could be a game-changer for the next generation of LLMs. What are your thoughts on DIFF Transformer's potential impact?

1 reply


liked a Space 3 months ago

Running

34

👌🔍

MiniSearch

Minimalist web-searching app with browser-based AI assistant

reacted to KingNish's post with ❤️ 4 months ago

Post

3188

A super good and fast image inpainting demo is here.
It's super cool and realistic.

Demo by @OzzyGT (Must try):
OzzyGT/diffusers-fast-inpaint

updated 2 models 4 months ago

cctuan/ygyg-wed

Text-to-Image • Updated Sep 13, 2024 • 1 •

cctuan/wedding-yg

Text-to-Image • Updated Sep 13, 2024 • 26 •

liked a Space 4 months ago

Running on Zero

6.35k

🖥️

FLUX.1 [dev]

liked 2 Spaces 5 months ago

Running on Zero

1.04k

😻

FLUX Prompt Generator

Running on L4

884

🎮

Stable Fast 3D

reacted to MonsterMMORPG's post with 🔥 6 months ago

Post

6448

Kling AI Video is FINALLY Public (All Countries), Free to Use and MIND BLOWING - Full Tutorial > https://youtu.be/zcpqAxYV1_w

You have probably seen those mind-blowing AI-made videos, and the day has arrived: the famous Kling AI is now available worldwide for free. In this tutorial video I will show you how to register for free with just an email address and use its mind-blowing text-to-video, image-to-video, text-to-image, and image-to-image capabilities. The video shows non-cherry-picked results, so you will know the actual quality and capability of the model, unlike those extremely cherry-picked demos. Still, #KlingAI is the only #AI model that competes with OpenAI's #SORA, and it is actually available to use.

🔗 Kling AI Official Website ⤵️
▶️ https://www.klingai.com/

🔗 SECourses Discord Channel to Get Full Support ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Our GitHub Repository ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 Our Reddit ⤵️
▶️ https://www.reddit.com/r/SECourses/

6 replies


George Duan PRO
