Mariusz Kurman

mkurman

AI & ML interests

AI Tech Lead | MD

Recent Activity

updated a collection 1 day ago
MedIT One

Organizations

MedIT Solutions, BigScience Biomedical Datasets, SOWA Project

mkurman's activity

reacted to albertvillanova's post with 👍 1 day ago
🚀 New smolagents update: Safer Local Python Execution! 🦾🐍

With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒

Here's why this matters & what you need to know! 🧵👇

1️⃣ Why is local execution risky? ⚠️
AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.

2️⃣ New Safety Layer in smolagents 🛡️
We now inspect every return value during execution:
✅ Allowed: Safe built-in types (e.g., numbers, strings, lists)
⛔ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)
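
As an illustration, here is a minimal sketch of this kind of return-value screening in plain Python; the names and blocklists are assumptions for illustration, not smolagents' actual implementation:

```python
import types

# Hypothetical blocklists, for illustration only.
SAFE_TYPES = (int, float, complex, bool, str, bytes, list, tuple, dict, set, type(None))
BLOCKED_MODULES = {"os", "subprocess", "shutil", "socket", "sys"}
BLOCKED_CALLABLES = {"exec", "eval", "compile", "open", "__import__", "system"}

def screen_value(value):
    """Raise if an evaluated value looks dangerous; otherwise return it."""
    if isinstance(value, types.ModuleType) and value.__name__.split(".")[0] in BLOCKED_MODULES:
        raise RuntimeError(f"Blocked module: {value.__name__}")
    if callable(value):
        name = getattr(value, "__name__", "")
        module = (getattr(value, "__module__", "") or "").split(".")[0]
        if name in BLOCKED_CALLABLES or module in BLOCKED_MODULES:
            raise RuntimeError(f"Blocked callable: {module}.{name}")
    elif not isinstance(value, SAFE_TYPES):
        # Unknown object types could wrap unsafe state, so reject them conservatively.
        raise RuntimeError(f"Disallowed return type: {type(value).__name__}")
    return value

# Example: screen_value(sum) passes, screen_value(exec) raises RuntimeError.
```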

3️⃣ Immediate Benefits 💡
- Prevent agents from accessing unsafe builtins
- Block unauthorized file or network access
- Reduce accidental security vulnerabilities

4️⃣ Security Disclaimer ⚠️
🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨
If you need true isolation, use a remote sandboxed executor like Docker or E2B.

5️⃣ The Best Practice: Use Sandboxed Execution 🔐
For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.

6️⃣ Upgrade Now & Stay Safe! 🚀
Check out the latest smolagents release and start building safer AI agents today.

🔗 https://github.com/huggingface/smolagents

What security measures do you take when running AI-generated code? Let's discuss! 👇

#AI #smolagents #Python #Security
posted an update 2 days ago
Just released NVAMP Loss!

βœ”οΈ modification of the cross-entropy loss function designed specifically for training LLMs.
βœ”οΈ twist on the standard cross-entropy loss by emphasizing the importance of outlier prediction errors and dynamically normalizing token-level variance.
βœ”οΈ more stable and efficient training, leading to models that generalize better.

Check it out, give it a spin, and let me know what you think!

Licensed under the Apache 2.0 license and ready to use. Happy training! 🔥🤖

https://github.com/mkurman/nvamp-loss
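
To give a feel for the idea, here is a minimal PyTorch sketch of a variance-normalized, outlier-emphasizing cross-entropy; the function name, weighting scheme, and constants are assumptions for illustration, not the repository's actual implementation:

```python
import torch
import torch.nn.functional as F

def nvamp_like_loss(logits, targets, outlier_weight=2.0, eps=1e-8):
    """Hypothetical sketch: cross-entropy with token-level variance
    normalization and extra weight on outlier errors."""
    # Per-token cross-entropy, flattened to shape (batch * seq_len,)
    per_token = F.cross_entropy(
        logits.view(-1, logits.size(-1)),
        targets.view(-1),
        reduction="none",
    )
    # Normalize by the standard deviation of the token-level losses.
    normalized = per_token / (per_token.std() + eps)
    # Up-weight outlier tokens whose normalized loss exceeds the mean.
    weights = torch.where(
        normalized > normalized.mean(),
        torch.full_like(normalized, outlier_weight),
        torch.ones_like(normalized),
    )
    return (weights * normalized).mean()
```
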
posted an update 3 days ago
posted an update 5 days ago
posted an update 6 days ago
reacted to Jaward's post with ❤️ 8 days ago
posted an update 8 days ago
Introducing a new architecture, MedIT One – a single-token transformer with LSTM-like recurrence.

It is extremely fast in training and inference, but we lack funding for large-scale training. Enjoy!

https://github.com/MedITSolutionsKurman/medit-one
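
The architecture itself lives in the repository; purely as an illustration of the general idea (a block that consumes one token per step and carries an LSTM-like recurrent state), a sketch could look like the following, where the class name, layers, and gating are assumptions rather than MedIT One's actual design:

```python
import torch
import torch.nn as nn

class SingleTokenRecurrentBlock(nn.Module):
    """Illustrative only: processes one token per step and updates a
    recurrent hidden state with an LSTM-like gate."""
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mix = nn.Linear(2 * dim, dim)    # mixes the token with the carried state
        self.gate = nn.Linear(2 * dim, dim)   # LSTM-like forget/update gate
        self.ffn = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, token, state):
        # token: (batch, dim) embedding of the current token
        # state: (batch, dim) recurrent carry from previous steps
        x = torch.cat([self.norm(token), state], dim=-1)
        candidate = torch.tanh(self.mix(x))
        gate = torch.sigmoid(self.gate(x))
        new_state = gate * state + (1.0 - gate) * candidate  # gated state update
        return token + self.ffn(new_state), new_state
```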

reacted to JingzeShi's post with 🚀 15 days ago
reacted to CultriX's post with 👍❤️ 24 days ago
Final upgrade to the Multi-Agent Task Completion Space: CultriX/MultiAgent-CodeTask.

It now includes:
- a live stream of the progress being made on the task (see the included video),
- the following components (a conceptual sketch follows the list):
1. Automatic prompt optimization
2. An orchestrator that dynamically decides which agent to call, incorporating feedback from a human (human-in-the-loop)
3. A coding agent to complete the task
4. A code-reviewing agent that iteratively provides feedback on the code generated by the coding agent until it meets the required criteria, after which it is approved
5. A testing agent that tests the approved code or provides information on how to test it
6. A documentation agent that provides documentation and a help message for the approved and tested code
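
Conceptually, the review loop those components describe could be wired up roughly like the sketch below; the function names and stub agents are hypothetical stand-ins, not the Space's actual code:

```python
from dataclasses import dataclass

@dataclass
class Review:
    approved: bool
    comments: str = ""

# Stub agents standing in for the real LLM-backed ones.
def coding_agent(task, feedback=""):
    return f"# solution for: {task}" + (f"\n# revised with: {feedback}" if feedback else "")

def review_agent(code):
    return Review(approved="revised with" in code, comments="handle edge cases")

def testing_agent(code):
    return "tests: passed (stub)"

def documentation_agent(code):
    return "docs: usage instructions (stub)"

def run_task(task, max_rounds=5):
    code = coding_agent(task)                 # coding agent drafts a solution
    for _ in range(max_rounds):               # reviewer iterates until approval
        review = review_agent(code)
        if review.approved:
            break
        code = coding_agent(task, feedback=review.comments)
    return code, testing_agent(code), documentation_agent(code)

print(run_task("parse a CSV file"))
```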

posted an update 25 days ago
I've been working on something cool: GRPO with an LLM evaluator that can also perform SFT on the feedback data, if you want. Check it out 😊

Any 🌟 are more than welcome 🤗

https://github.com/mkurman/grpo-llm-evaluator
posted an update 30 days ago
Blurred-Thoughts Supervised-Finetuning 🙈

After hours of working with GitHub Copilot to organize the code, I'm keen to announce the release of Blurred Thoughts Supervised-Finetuning (BT-SFT), a new method for fine-tuning LLMs to produce more diverse and creative responses.

BT-SFT introduces:
✅ A smart tokenization method that randomly masks tokens within <think> ... </think> tags, encouraging the model to generate diverse responses that align better with its own probability distribution instead of memorizing the thought process from distilled data (see the sketch below).
✅ A reward function that ensures responses are well-structured.
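
A minimal sketch of that masking idea (randomly excluding tokens inside <think> ... </think> from the training loss); the function name, mask probability, and label convention are assumptions, not the repository's actual code:

```python
import random

IGNORE_INDEX = -100  # common convention: labels with this value are excluded from the loss

def blur_thought_labels(tokens, labels, mask_prob=0.3, seed=None):
    """Randomly mask labels that fall inside <think> ... </think> spans."""
    rng = random.Random(seed)
    blurred = list(labels)
    inside = False
    for i, tok in enumerate(tokens):
        if tok == "<think>":
            inside = True
        elif tok == "</think>":
            inside = False
        elif inside and rng.random() < mask_prob:
            blurred[i] = IGNORE_INDEX
    return blurred

# Example: some labels inside the think span are randomly dropped from supervision.
tokens = ["Q:", "2+2?", "<think>", "add", "the", "numbers", "</think>", "4"]
print(blur_thought_labels(tokens, list(range(len(tokens))), mask_prob=0.5, seed=0))
```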

Explore and contribute to the project available in my GitHub repository:
https://github.com/mkurman/blurred-thoughts-SFT

Keep me updated on your experiments with BT-SFT! 🐐
reacted to nicolay-r's post with 🔥 about 1 month ago
📢 The LLaMA-3.1-8B distilled version of DeepSeek-R1 is available, alongside the Qwen-based one

📙 Notebook for using it for reasoning over a series of data 🧠:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/tutorials/llm_deep_seek_7b_distill_llama3.ipynb

Loading using the pipeline API of the transformers library:
https://github.com/nicolay-r/nlp-thirdgate/blob/master/llm/transformers_llama.py
🟡 GPU usage: 12.3 GB (FP16/FP32 mode), which is suitable for a T4 (1.5 GB less than the Qwen-distilled version)
🐌 Performance on a T4 instance: ~0.19 tokens/sec (FP32 mode) and ~0.22-0.30 tokens/sec (FP16 mode). Should it be that slow? 🤔
Model name: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
⭐ Framework: https://github.com/nicolay-r/bulk-chain
🌌 Notebooks and models hub: https://github.com/nicolay-r/nlp-thirdgate
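
For reference, a minimal way to load the model through the transformers pipeline API (the linked notebook and script may configure it differently):

```python
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    torch_dtype="auto",   # FP16 on GPU, consistent with the ~12 GB figure above
    device_map="auto",
)
print(generator("Reason step by step: what is 17 * 23?", max_new_tokens=256)[0]["generated_text"])
```
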
reacted to fuzzy-mittenz's post with 😎🤗👀🔥🚀❤️ about 1 month ago
Not many seemed to notice, but what was probably meant to be a win for artists' rights from the US Copyright Office has also solved some fundamental issues for the community.
In our recent article I outline how companies like Suno, OpenAI, and Midjourney can no longer claim any right to copy the work you create with their platforms.
We also look at other ways this study and the new rules for AI will fundamentally affect creators who use it, and how companies' incentives to give them control over certain aspects might change as a result. It's broken down pretty well here: https://huggingface.co./blog/fuzzy-mittenz/copyright-in-ai
replied to Jaward's post about 1 month ago

Yeah, the fun part is that I use any QA dataset in GRPO just by instructing a model to follow simple rules. Place your answer in \boxed{} or ** ** tags. I do a regex, and it simply works.
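
For instance, a regex along those lines might look like this (a hypothetical pattern; the actual one used may differ):

```python
import re

def extract_answer(text):
    """Pull an answer from \\boxed{...} or **...** markers."""
    match = re.search(r"\\boxed\{([^}]*)\}", text) or re.search(r"\*\*(.+?)\*\*", text)
    return match.group(1).strip() if match else None

print(extract_answer(r"The final result is \boxed{42}."))  # -> 42
print(extract_answer("The final result is **42**."))        # -> 42
```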