etemiz's activity
Qwen team released QVQ, a large vision LM with reasoning!
It outperforms proprietary VLMs on several benchmarks, and comes with open weights and a demo!
Check them out ⬇️
Demo Qwen/QVQ-72B-preview
Model Qwen/QVQ-72B-Preview
Read more https://qwenlm.github.io/blog/qvq-72b-preview/
Congratulations @JustinLin610 and team!
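For anyone who wants to try it locally, here is a minimal loading sketch. It assumes QVQ-72B-Preview uses the Qwen2-VL classes in transformers; check the model card for the authoritative usage, and note the image URL is a placeholder.

```python
# Minimal sketch, assuming QVQ-72B-Preview loads via the Qwen2-VL classes
# in transformers; see the model card for the authoritative snippet.
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/QVQ-72B-Preview",
    torch_dtype="auto",   # pick an appropriate dtype automatically
    device_map="auto",    # shard the 72B weights across available devices
)
processor = AutoProcessor.from_pretrained("Qwen/QVQ-72B-Preview")

# Build a multimodal chat prompt; the image URL is a placeholder.
messages = [{"role": "user", "content": [
    {"type": "image", "image": "https://example.com/figure.png"},
    {"type": "text", "text": "Reason step by step: what does this figure show?"},
]}]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```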
Want to know about my experiments?
Who would be interested in joining?
The more I read about it, the more groundbreaking it looks.
This, combined with the "Training Large Language Models to Reason in a Continuous Latent Space" paper, is pretty important imo.
The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:
>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
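To make the patching idea concrete, here is a toy sketch in Python. A smoothed bigram model stands in for the paper's small autoregressive entropy model, and the entropy threshold is an arbitrary assumption:

```python
import math
from collections import Counter, defaultdict

class BigramEntropyModel:
    """Toy stand-in for BLT's small autoregressive entropy model:
    estimates H(next byte | previous byte) from smoothed bigram counts."""
    def __init__(self, corpus: bytes):
        self.bigrams = defaultdict(Counter)
        for prev, nxt in zip(corpus, corpus[1:]):
            self.bigrams[prev][nxt] += 1

    def entropy(self, prev_byte: int) -> float:
        counts = self.bigrams[prev_byte]
        total = sum(counts.values()) + 256            # add-one smoothing
        probs = [(counts.get(b, 0) + 1) / total for b in range(256)]
        return -sum(p * math.log2(p) for p in probs)

def dynamic_patches(data: bytes, model: BigramEntropyModel, threshold: float = 4.0):
    """Open a new patch whenever next-byte entropy crosses a threshold, so
    hard-to-predict regions get more, smaller patches (hence more compute)."""
    patches, start = [], 0
    for i in range(1, len(data)):
        if model.entropy(data[i - 1]) > threshold:    # threshold: arbitrary assumption
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

model = BigramEntropyModel(b"the quick brown fox jumps over the lazy dog " * 1000)
print(dynamic_patches(b"the quick brown fox!!!", model))
# The predictable prefix stays one patch; the unseen "!!!" splits byte-wise.
```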
Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes
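A skeletal version of that flow could look like the PyTorch sketch below; the module sizes, mean-pooling, and layer counts are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class ToyBLT(nn.Module):
    """Skeleton of the encoder -> latent transformer -> decoder flow.
    Sizes are illustrative assumptions; patching is assumed precomputed."""
    def __init__(self, d_local=256, d_global=1024, n_heads=8):
        super().__init__()
        self.byte_emb = nn.Embedding(256, d_local)
        # Lightweight local encoder: one small transformer layer over bytes
        self.local_encoder = nn.TransformerEncoderLayer(d_local, n_heads, batch_first=True)
        self.to_global = nn.Linear(d_local, d_global)
        # Powerful global latent transformer over patch representations
        self.global_layers = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_global, n_heads, batch_first=True), num_layers=4
        )
        self.to_local = nn.Linear(d_global, d_local)
        self.byte_head = nn.Linear(d_local, 256)  # local decoder head over bytes

    def forward(self, patches):  # patches: list of LongTensor byte sequences
        # Local encoder: pool each patch of byte embeddings into one vector
        reps = [self.local_encoder(self.byte_emb(p)[None]).mean(dim=1) for p in patches]
        latent = torch.cat(reps)[None]               # (1, n_patches, d_local)
        latent = self.global_layers(self.to_global(latent))
        # Local decoder: map each patch latent back toward byte logits
        return self.byte_head(self.to_local(latent))

model = ToyBLT()
patches = [torch.randint(0, 256, (n,)) for n in (3, 7, 2)]  # variable-sized patches
print(model(patches).shape)  # torch.Size([1, 3, 256])
```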
>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models
>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
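The hash n-gram piece can be sketched as: hash every n-gram ending at a byte position into a fixed embedding table and add the result to that byte's representation. The table size and n-gram orders below are assumptions:

```python
import torch
import torch.nn as nn

class HashNgramEmbedding(nn.Module):
    """Toy hash n-gram embeddings: each byte position also receives the
    embeddings of the n-grams ending there, hashed into a fixed table.
    Table size and n-gram orders are illustrative assumptions."""
    def __init__(self, table_size=50021, dim=256, ngram_orders=(3, 4, 5)):
        super().__init__()
        self.table = nn.Embedding(table_size, dim)
        self.table_size = table_size
        self.ngram_orders = ngram_orders

    def forward(self, byte_ids: torch.LongTensor) -> torch.Tensor:  # (seq_len,)
        seq = byte_ids.tolist()
        out = torch.zeros(len(seq), self.table.embedding_dim)
        for i in range(len(seq)):
            for n in self.ngram_orders:
                if i + 1 >= n:
                    ngram = tuple(seq[i + 1 - n : i + 1])
                    idx = hash(ngram) % self.table_size   # hashing trick
                    out[i] += self.table(torch.tensor(idx))
        return out  # added to the plain byte embeddings in the encoder

emb = HashNgramEmbedding()
print(emb(torch.randint(0, 256, (10,))).shape)  # torch.Size([10, 256])
```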
This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
It is not OK to remove people from the equation, however efficient the machines are. We can never be sure that synthetic data matches the original in terms of alignment, and further models trained on further synthetic data can derail the whole thing.
That's the hard part. Careful analysis over a long time, and how many people are benefiting from someone and their friends, can give some clues. If a person's solutions work most of the time for many people over the years, they may be eligible to get into a curated LLM.
Better curation is possible by emphasizing certain texts.
TL;DR:
- public storage is free and (barring blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)
docs: https://huggingface.co./docs/hub/storage-limits
We optimize our infrastructure continuously to scale our storage for the coming years of growth in machine learning, to the benefit of the community.
cc: @reach-vb @pierric @victor and the HF team
There is a lot of difference of opinion among open source models. Which models are closer to the solutions that will work in different situations? How far are the models from reality?
https://wikifreedia.xyz/based-llm-leaderboard/npub1nlk894teh248w2heuu0x8z6jjg2hyxkwdc8cxgrjtm9lnamlskcsghjm9c
So you always have to watch centralized AI, but you never have to watch a local LLM.
Trying to download QwQ 32B GGUF. It disconnected like 30 times.
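A workaround when the connection keeps dropping: huggingface_hub resumes partial downloads, so a retry loop can pick up where it left off. The repo id and filename below are hypothetical placeholders:

```python
# Sketch: huggingface_hub resumes interrupted downloads from the partial
# file, so a retry loop can survive repeated disconnects.
import time
from huggingface_hub import hf_hub_download

for attempt in range(30):
    try:
        path = hf_hub_download(
            repo_id="Qwen/QwQ-32B-Preview-GGUF",      # hypothetical repo id
            filename="qwq-32b-preview-q4_k_m.gguf",   # hypothetical file name
        )
        print("saved to", path)
        break
    except Exception as err:                          # network drop: retry
        print(f"attempt {attempt + 1} failed: {err}")
        time.sleep(5)
```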
It may increase the efficiency of HF by offloading traffic to users instead of HF serving all the files.