Prithiv Sakthi's picture

Prithiv Sakthi PRO

prithivMLmods

AI & ML interests

computer vision, multimodality, realism engine adapters @starngerzonehf

Recent Activity

updated a model 42 minutes ago
prithivMLmods/Taurus-Opus-7B
updated a dataset about 1 hour ago
prithivMLmods/Deepthink-Reasoning
View all activity

Articles

Organizations

Stanford AI's profile picture DataScienceEngineering's profile picture AI FILMS's profile picture Samsung Electronics's profile picture MISATO-dataset's profile picture GEM benchmark's profile picture OpenGVLab's profile picture MusicAI's profile picture BigScience Biomedical Datasets's profile picture OpenVINO Toolkit's profile picture LLMs's profile picture ONNXConfig for all's profile picture Gradio-Themes-Party's profile picture scikit-learn's profile picture lora concepts library's profile picture Open-Source AI Meetup's profile picture Kornia AI's profile picture UniversitΓ© Dauphine-PSL's profile picture Platzi Community's profile picture Tune a video concepts library's profile picture Keras Dreambooth Event's profile picture Stable Diffusion Dreambooth Concepts Library's profile picture The Waifu Research Department's profile picture Musika's profile picture Blog-explorers's profile picture OpenSky's profile picture AI Tamil Nadu's profile picture OpenLLM France's profile picture huggingPartyParis's profile picture Team Tonic's profile picture That Time I got Reincarnated as a Hugging Face Organization's profile picture LocalLLaMA's profile picture Major TOM's profile picture MLX Community's profile picture C4AI Community's profile picture M4-ai's profile picture Chinese LLMs on Hugging Face's profile picture ONNX Community's profile picture Dataset Tools's profile picture Nerdy Face's profile picture Stranger Zone's profile picture open/ acc's profile picture Data Is Better Together Contributor's profile picture

prithivMLmods's activity

reacted to clem's post with ❀️ about 3 hours ago
view post
Post
504
AI is not a zero-sum game. Open-source AI is the tide that lifts all boats!
reacted to fdaudens's post with ❀️ about 3 hours ago
view post
Post
417
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:

- Original release: 8 models, 540K downloads. Just the beginning...

- The community turned those open-weight models into +550 NEW models on Hugging Face. Total downloads? 2.5Mβ€”nearly 5X the originals.

The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.

When you empower builders, innovation explodes. For everyone. πŸš€

The most popular community model? @bartowski 's DeepSeek-R1-Distill-Qwen-32B-GGUF version β€” 1M downloads alone.
reacted to nicolay-r's post with πŸ”₯ about 11 hours ago
view post
Post
860
πŸ“’ For those who wish to apply DeepSeek-R1 for handling tabular / streaming data using schema of prompts (CoT), the OpenRouter AI hosts API for accessing:
https://openrouter.ai/deepseek/deepseek-r1

The no-string option to quick start with using DeepSeek-R1 includes three steps:
βœ… OpenRouter provider: https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/open_router.py
βœ… Bulk-chain for infering data: https://github.com/nicolay-r/bulk-chain
βœ… Json Schema for Chain-of-Though reasoning (see screenshot πŸ“· below)

πŸ“Ί below is a screenshot of how to quick start the demo, in which you can test your schema for LLM responses. It would ask to type all the parameters first for completing the requests (which is text within this example).

πŸ“ƒ To apply it for JSONL/CSV data, you can use --src shell parameter for passing the related file

⏳ As for time, OpenRouter finds me relatively slow with 30~40 seconds per request

Models:
deepseek-ai/DeepSeek-R1
reacted to AdinaY's post with πŸ”₯ 3 days ago
reacted to burtenshaw's post with 🀯 4 days ago
view post
Post
1959
AI was built on side projects!
reacted to AdinaY's post with πŸ”₯ 4 days ago
reacted to AdinaY's post with 🧠 4 days ago
reacted to sharpenb's post with πŸš€ 5 days ago
reacted to JingzeShi's post with πŸ”₯ 6 days ago
reacted to davidberenstein1957's post with πŸ”₯ 6 days ago
posted an update 7 days ago
view post
Post
3056
Q'n' Sketches ❀️‍πŸ”₯

πŸ–ΌοΈ Adapters:
- Qs : strangerzonehf/Qs-Sketch
- Qd : strangerzonehf/Qd-Sketch
- Qx : strangerzonehf/Qx-Art
- Qc : strangerzonehf/Qc-Sketch
- Bb : strangerzonehf/Bg-Bag

🐍 Collection : strangerzonehf/q-series-sketch-678e3503bf3a661758429717

πŸ”—Page : https://huggingface.co./strangerzonehf

.
.
.
@prithivMLmods πŸ€—
reacted to AdinaY's post with 🧠 7 days ago
view post
Post
2761
BIG release by DeepSeek AIπŸ”₯πŸ”₯πŸ”₯

DeepSeek-R1 & DeepSeek-R1-Zero: two 660B reasoning models are here, alongside 6 distilled dense models (based on Llama & Qwen) for the community!
https://huggingface.co./deepseek-ai
deepseek-ai/DeepSeek-R1

✨ MIT License : enabling distillation for custom models
✨ 32B & 70B models match OpenAI o1-mini in multiple capabilities
✨ API live now! Access Chain of Thought reasoning with model='deepseek-reasoner'
reacted to as-cle-bert's post with πŸ”₯ 8 days ago
view post
Post
1512
πŸš€ππžπ° 𝐝𝐞𝐦𝐨 𝐚π₯πžπ«π­πŸš€

Convert (almost) everything to PDF with πππŸπˆπ­πƒπ¨π°π§, now on Spaces! πŸ‘‰ as-cle-bert/pdfitdown

You can also install it locally:

python3 -m pip install pdfitdown


Don't forget to star it on GitHub, if you find it useful! πŸ‘‰ https://www.github.com/AstraBert/PdfItDown

  • 3 replies
Β·
reacted to mkurman's post with πŸ‘ 10 days ago
reacted to merve's post with ❀️ 10 days ago
view post
Post
2508
Everything that happened this week in open AI, a recap 🀠 merve/jan-17-releases-678a673a9de4a4675f215bf5

πŸ‘€ Multimodal
- MiniCPM-o 2.6 is a new sota any-to-any model by OpenBMB
(vision, speech and text!)
- VideoChat-Flash-Qwen2.5-2B is new video multimodal models by OpenGVLab that come in sizes 2B & 7B in resolutions 224 & 448
- ByteDance released larger SA2VA that comes in 26B parameters
- Dataset: VRC-Bench is a new diverse benchmark for multimodal LLM reasoning performance

πŸ’¬ LLMs
- MiniMax-Text-01 is a new huge language model (456B passive 45.9B active params) by MiniMaxAI with context length of 4M tokens 🀯
- Dataset: Sky-T1-data-17k is a diverse dataset used to train Sky-T1-32B
- kyutai released Helium-1-Preview-2B is a new small multilingual LM
- Wayfarer-12B is a new LLM able to write D&D πŸ§™πŸ»β€β™‚οΈ
- ReaderLM-v2 is a new HTML parsing model by Jina AI

- Dria released, Dria-Agent-a-3B, new agentic coding model (Pythonic function calling) based on Qwen2.5 Coder
- Unsloth released Phi-4, faster and memory efficient Llama 3.3

πŸ–ΌοΈ Vision
- MatchAnything is a new foundation model for matching
- FitDit is a high-fidelity VTON model based on DiT architecture

πŸ—£οΈ Audio
- OuteTTS-0.3-1B is a new multilingual text-to-speech model with voice cloning and emotion control capabilities

πŸ“– Retrieval
- lightblue released a new reranker based on Qwen2.5 LB-reranker-0.5B-v1.0 that can handle 95+ languages
- cde-small-v2 is a new sota small retrieval model by
@jxm
reacted to hlarcher's post with ❀️ 11 days ago
view post
Post
1055
We are introducing multi-backend support in Hugging Face Text Generation Inference!
With new TGI architecture we are now able to plug new modeling backends to get best performances according to selected model and available hardware. This first step will very soon be followed by the integration of new backends (TRT-LLM, llama.cpp, vLLM, Neuron and TPU).

We are polishing the TensorRT-LLM backend which achieves impressive performances on NVIDIA GPUs, stay tuned πŸ€— !

Check out the details: https://huggingface.co./blog/tgi-multi-backend
posted an update 11 days ago
view post
Post
2749
ChemQwen-vL [ Qwen for Chem Vision ] πŸ§‘πŸ»β€πŸ”¬

πŸ§ͺModel : prithivMLmods/ChemQwen-vL

πŸ“ChemQwen-vL is a vision-language model fine-tuned based on the Qwen2VL-2B Instruct model. It has been trained using the International Chemical Identifier (InChI) format for chemical compounds and is optimized for chemical compound identification. The model excels at generating the InChI and providing descriptions of chemical compounds based on their images. Its architecture operates within a multi-modal framework, combining image-text-text capabilities. It has been fine-tuned using datasets from: https://iupac.org/projects/

πŸ“’Colab Demo: https://tinyurl.com/2pn8x6u7, Collection : https://tinyurl.com/2mt5bjju

Inference with the documentation is possible with the help of the ReportLab library. https://pypi.org/project/reportlab/

πŸ€—: @prithivMLmods
  • 1 reply
Β·
reacted to merve's post with πŸ”₯ 11 days ago
reacted to davidberenstein1957's post with πŸ‘€ 13 days ago
reacted to hexgrad's post with πŸ”₯ 15 days ago
view post
Post
18471
πŸ“£ Looking for labeled, high-quality synthetic audio/TTS data πŸ“£ Have you been or are you currently calling API endpoints from OpenAI, ElevenLabs, etc? Do you have labeled audio data sitting around gathering dust? Let's talk! Join https://discord.gg/QuGxSWBfQy or comment down below.

If your data exceeds quantity & quality thresholds and is approved into the next hexgrad/Kokoro-82M training mix, and you permissively DM me the data under an effective Apache license, then I will DM back the corresponding voicepacks for YOUR data if/when the next Apache-licensed Kokoro base model drops.

What does this mean? If you've been calling closed-source TTS or audio API endpoints to:
- Build voice agents
- Make long-form audio, like audiobooks or podcasts
- Handle customer support, etc
Then YOU can contribute to the training mix and get useful artifacts in return. ❀️

More details at hexgrad/Kokoro-82M#21
Β·