etemiz's activity
Qwen team released QVQ, a large vision LM with reasoning!
It outperforms proprietary VLMs on several benchmarks, and comes with open weights and a demo!
Check them out ⬇️
Demo Qwen/QVQ-72B-preview
Model Qwen/QVQ-72B-Preview
Read more https://qwenlm.github.io/blog/qvq-72b-preview/
Congratulations @JustinLin610 and team!
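For anyone who wants to try it locally, here is a minimal loading sketch. It assumes QVQ-72B-Preview uses the Qwen2-VL classes in transformers; check the model card for the authoritative usage, and note the image URL is a placeholder.

```python
# Minimal sketch, assuming QVQ-72B-Preview loads via the Qwen2-VL classes
# in transformers; see the model card for the authoritative snippet.
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/QVQ-72B-Preview",
    torch_dtype="auto",   # pick an appropriate dtype automatically
    device_map="auto",    # shard the 72B weights across available devices
)
processor = AutoProcessor.from_pretrained("Qwen/QVQ-72B-Preview")

# Build a multimodal chat prompt; the image URL is a placeholder.
messages = [{"role": "user", "content": [
    {"type": "image", "image": "https://example.com/figure.png"},
    {"type": "text", "text": "Reason step by step: what does this figure show?"},
]}]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```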
Want to know about my experiments?
Who would be interested in joining?
The more I read about it, the more groundbreaking it looks.
This, combined with the "Training Large Language Models to Reason in a Continuous Latent Space" paper, is pretty important imo.
The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:
>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
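To make the patching idea concrete, here is a toy sketch in Python. A smoothed bigram model stands in for the paper's small autoregressive entropy model, and the entropy threshold is an arbitrary assumption:

```python
import math
from collections import Counter, defaultdict

class BigramEntropyModel:
    """Toy stand-in for BLT's small autoregressive entropy model:
    estimates H(next byte | previous byte) from smoothed bigram counts."""
    def __init__(self, corpus: bytes):
        self.bigrams = defaultdict(Counter)
        for prev, nxt in zip(corpus, corpus[1:]):
            self.bigrams[prev][nxt] += 1

    def entropy(self, prev_byte: int) -> float:
        counts = self.bigrams[prev_byte]
        total = sum(counts.values()) + 256            # add-one smoothing
        probs = [(counts.get(b, 0) + 1) / total for b in range(256)]
        return -sum(p * math.log2(p) for p in probs)

def dynamic_patches(data: bytes, model: BigramEntropyModel, threshold: float = 4.0):
    """Open a new patch whenever next-byte entropy crosses a threshold, so
    hard-to-predict regions get more, smaller patches (hence more compute)."""
    patches, start = [], 0
    for i in range(1, len(data)):
        if model.entropy(data[i - 1]) > threshold:    # threshold: arbitrary assumption
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

model = BigramEntropyModel(b"the quick brown fox jumps over the lazy dog " * 1000)
print(dynamic_patches(b"the quick brown fox!!!", model))
# The predictable prefix stays one patch; the unseen "!!!" splits byte-wise.
```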
Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes
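A skeletal version of that flow could look like the PyTorch sketch below; the module sizes, mean-pooling, and layer counts are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class ToyBLT(nn.Module):
    """Skeleton of the encoder -> latent transformer -> decoder flow.
    Sizes are illustrative assumptions; patching is assumed precomputed."""
    def __init__(self, d_local=256, d_global=1024, n_heads=8):
        super().__init__()
        self.byte_emb = nn.Embedding(256, d_local)
        # Lightweight local encoder: one small transformer layer over bytes
        self.local_encoder = nn.TransformerEncoderLayer(d_local, n_heads, batch_first=True)
        self.to_global = nn.Linear(d_local, d_global)
        # Powerful global latent transformer over patch representations
        self.global_layers = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_global, n_heads, batch_first=True), num_layers=4
        )
        self.to_local = nn.Linear(d_global, d_local)
        self.byte_head = nn.Linear(d_local, 256)  # local decoder head over bytes

    def forward(self, patches):  # patches: list of LongTensor byte sequences
        # Local encoder: pool each patch of byte embeddings into one vector
        reps = [self.local_encoder(self.byte_emb(p)[None]).mean(dim=1) for p in patches]
        latent = torch.cat(reps)[None]               # (1, n_patches, d_local)
        latent = self.global_layers(self.to_global(latent))
        # Local decoder: map each patch latent back toward byte logits
        return self.byte_head(self.to_local(latent))

model = ToyBLT()
patches = [torch.randint(0, 256, (n,)) for n in (3, 7, 2)]  # variable-sized patches
print(model(patches).shape)  # torch.Size([1, 3, 256])
```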
>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models
>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
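The hash n-gram piece can be sketched as: hash every n-gram ending at a byte position into a fixed embedding table and add the result to that byte's representation. The table size and n-gram orders below are assumptions:

```python
import torch
import torch.nn as nn

class HashNgramEmbedding(nn.Module):
    """Toy hash n-gram embeddings: each byte position also receives the
    embeddings of the n-grams ending there, hashed into a fixed table.
    Table size and n-gram orders are illustrative assumptions."""
    def __init__(self, table_size=50021, dim=256, ngram_orders=(3, 4, 5)):
        super().__init__()
        self.table = nn.Embedding(table_size, dim)
        self.table_size = table_size
        self.ngram_orders = ngram_orders

    def forward(self, byte_ids: torch.LongTensor) -> torch.Tensor:  # (seq_len,)
        seq = byte_ids.tolist()
        out = torch.zeros(len(seq), self.table.embedding_dim)
        for i in range(len(seq)):
            for n in self.ngram_orders:
                if i + 1 >= n:
                    ngram = tuple(seq[i + 1 - n : i + 1])
                    idx = hash(ngram) % self.table_size   # hashing trick
                    out[i] += self.table(torch.tensor(idx))
        return out  # added to the plain byte embeddings in the encoder

emb = HashNgramEmbedding()
print(emb(torch.randint(0, 256, (10,))).shape)  # torch.Size([10, 256])
```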
This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!
It is not OK to remove people from the equation, however efficient the machines are. We can never be sure that synthetic data matches the original in terms of alignment, and further models trained on further synthetic data can derail the whole thing.
That's the hard part. Careful analysis over a long time, and how many people are benefiting from someone and their friends, can give some clues. If a person's solutions work most of the time for many people over the years, they may be eligible to get into a curated LLM.
Better curation is possible by emphasizing certain texts.
TL;DR:
- public storage is free and (barring blatant abuse) unlimited. We do ask that you consider upgrading to PRO and/or Enterprise Hub if possible
- private storage is paid above a significant free tier (1TB if you have a paid account, 100GB otherwise)
docs: https://huggingface.co./docs/hub/storage-limits
We optimize our infrastructure continuously to scale our storage for the coming years of growth in machine learning, to the benefit of the community.
cc: @reach-vb @pierric @victor and the HF team
There is a lot of difference of opinion among open source models. Which models are closer to the solutions that will work in different situations? How far are the models from reality?
https://wikifreedia.xyz/based-llm-leaderboard/npub1nlk894teh248w2heuu0x8z6jjg2hyxkwdc8cxgrjtm9lnamlskcsghjm9c
So you always have to watch centralized AI, but you never have to watch a local LLM.
Trying to download QwQ 32B GGUF. It disconnected like 30 times.
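A workaround when the connection keeps dropping: huggingface_hub resumes partial downloads, so a retry loop can pick up where it left off. The repo id and filename below are hypothetical placeholders:

```python
# Sketch: huggingface_hub resumes interrupted downloads from the partial
# file, so a retry loop can survive repeated disconnects.
import time
from huggingface_hub import hf_hub_download

for attempt in range(30):
    try:
        path = hf_hub_download(
            repo_id="Qwen/QwQ-32B-Preview-GGUF",      # hypothetical repo id
            filename="qwq-32b-preview-q4_k_m.gguf",   # hypothetical file name
        )
        print("saved to", path)
        break
    except Exception as err:                          # network drop: retry
        print(f"attempt {attempt + 1} failed: {err}")
        time.sleep(5)
```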
It may increase the efficiency of HF by offloading traffic to users instead of HF serving all the files.