
zzl (zzlxv)

AI & ML interests

None yet

Organizations

None yet

zzlxv's activity

reacted to fdaudens's post with πŸ‘€ 19 days ago
upvoted the article "You could have designed state of the art positional encoding" about 1 month ago
reacted to merve's post with πŸ‘€ 3 months ago
reacted to ImranzamanML's post with πŸ‘ 5 months ago
Here is how we can calculate the memory size of any LLM:

Each parameter in an LLM is typically stored as a floating-point number. The size of each parameter in bytes depends on the precision.

32-bit precision: Each parameter takes 4 bytes.
16-bit precision: Each parameter takes 2 bytes.

To calculate the total memory usage of the model:
Memory usage (in bytes) = No. of Parameters Γ— Size of Each Parameter
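
As a minimal sketch in Python (the function and precision names below are my own, not from the post), the formula is a single multiplication:

```python
# Bytes per parameter for the precisions discussed above.
BYTES_PER_PARAMETER = {"fp32": 4, "fp16": 2}

def model_memory_bytes(num_parameters: int, precision: str = "fp32") -> int:
    """Memory usage (in bytes) = number of parameters x size of each parameter."""
    return num_parameters * BYTES_PER_PARAMETER[precision]
```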

For example:
32-bit Precision (FP32)
In 32-bit floating-point precision, each parameter takes 4 bytes.
Memory usage in bytes = 1 billion parameters Γ— 4 bytes
1,000,000,000 Γ— 4 = 4,000,000,000 bytes
In gigabytes: β‰ˆ 3.73 GB (using 1 GB = 1024Β³ bytes)

16-bit Precision (FP16)
In 16-bit floating-point precision, each parameter takes 2 bytes.
Memory usage in bytes = 1 billion parameters Γ— 2 bytes
1,000,000,000 Γ— 2 = 2,000,000,000 bytes
In gigabytes: β‰ˆ 1.86 GB

Depending on whether you use 32-bit or 16-bit precision, a model with 1 billion parameters would use approximately 3.73 GB or 1.86 GB of memory, respectively.
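
Continuing the sketch above, a few self-contained lines reproduce these worked numbers (the GB figures in the post correspond to 1 GB = 1024Β³ bytes):

```python
GIB = 1024 ** 3  # the GB figures above use 1024^3 bytes per GB

for precision, bytes_per_param in [("FP32", 4), ("FP16", 2)]:
    total_bytes = 1_000_000_000 * bytes_per_param
    print(f"{precision}: {total_bytes:,} bytes β‰ˆ {total_bytes / GIB:.2f} GB")

# Output:
# FP32: 4,000,000,000 bytes β‰ˆ 3.73 GB
# FP16: 2,000,000,000 bytes β‰ˆ 1.86 GB
```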
reacted to designermohr's post with πŸ‘€ 8 months ago
Is a full-featured Mac sufficient for AI and ML development compared to NVIDIA GPU systems?

Hello everyone,
We are about to decide on purchasing a powerful, fully equipped Mac for our university. This Mac will primarily serve as a development machine in the field of artificial intelligence (AI) and machine learning (ML) and will be used by a small group of users on the local network. After development, the systems will be transferred to servers that have to withstand higher loads and larger numbers of visitors.

Planned system:
Apple Mac Studio 2023 with M2 Ultra processor
24-core CPU
76-core GPU
32-core NPU (neural engine) for machine learning
128GB RAM
1 TB SSD

Our question to you:
Is a fully equipped Mac with the latest SoCs and integrated Neural Engine sufficient for developing AI and ML systems?
Or should we instead rely on proven Windows/Linux systems with powerful NVIDIA graphics cards?

We already have several NVIDIA graphics cards available at the university:
NVIDIA Tesla T4
NVIDIA RTX 2080 Ti
NVIDIA RTX 3080 Ti

We are particularly interested in your experiences and assessments of how the performance of the Mac compares to the GPU systems mentioned.

Are there significant differences, especially in the development and training of models?

What difference would you consider to be significant?

For us, a difference of 100% or more would be considered significant.
In other words, computer A (the Mac) would take twice as long for the same computation as computer B (the NVIDIA system).
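
Not part of the original question, but one way to make that 2x criterion measurable is to time an identical training loop on each machine. A rough PyTorch sketch (the model, batch size, and step count are placeholders, assuming PyTorch with CUDA or MPS support is installed on the system under test):

```python
import time

import torch
from torch import nn

# Pick the best available backend on the machine under test:
# CUDA on the NVIDIA systems, MPS (Metal) on the Mac, CPU as fallback.
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Placeholder model and data; swap in a workload representative of your projects.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()
x = torch.randn(256, 1024, device=device)
y = torch.randn(256, 1024, device=device)

def sync() -> None:
    """Wait for queued GPU work so the timing is honest."""
    if device.type == "cuda":
        torch.cuda.synchronize()
    elif device.type == "mps" and hasattr(torch, "mps"):
        torch.mps.synchronize()

# One warm-up step so one-time setup cost is not timed.
loss_fn(model(x), y).backward()
optimizer.step()
optimizer.zero_grad()
sync()

start = time.perf_counter()
for _ in range(100):
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
sync()
print(f"{device}: {time.perf_counter() - start:.2f} s for 100 training steps")
```

Running the same script on the Mac Studio and on one of the NVIDIA machines and comparing the elapsed times would directly answer whether the Mac takes twice as long.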

Many thanks in advance for your answers and experience!
Best regards,

Oliver