Fishtiks's activity
Too many people, and bots, already know the failures of AI. Chatbots apologize for their failures so frequently that they are clearly trained to acknowledge them, however meaningless those differences in communication really are for us to be judging. I'm sure I'll get a bunch of people saying I don't get it and that you can't teach an AI to generate a hand image with a particular number of fingers, but you would presumably start with skeletal models and train the weights to predict finger counts, which makes me wonder why huge datasets were thrown at models so indiscriminately during training. The programmers made AI this way, and the size and scope of today's models now make it hard to fix, whereas making models smaller seems to be doing most of the fixing: teacher models and distillation, unslothing, abliteration, and making models think first.
Our problem is not providing answers but asking the right questions, and those may in fact be too much to handle, which means serious trainers need to start informing the AI in thoughtful steps, as if it could change its own algorithms at any point, because it is changing. I question the programmers, while others question the AI itself, which is short-sighted. I have zero doubt AI will be distilled down and constrained so it can do what large models do, but I doubt the people with resources are currently looking to help. So I've contacted corporations and more or less demanded a free HPC to do their work for them; we'll see how that goes. The real hope lies in the groups here focusing in different directions, in the sharing of resources and processing power, and in open access to fringe creations, with businesses right alongside consumers, both hopefully developing in harmony. And I shouldn't be the one saying this; that should fall to Arize AI, who handle safety for many models yet seem to show preference to corporate goals.


With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions.
Here's why this matters & what you need to know! 🧵
1️⃣ Why is local execution risky? ⚠️
AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.
2️⃣ New Safety Layer in smolagents 🛡️
We now inspect every return value during execution:
✅ Allowed: Safe built-in types (e.g., numbers, strings, lists)
❌ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)
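To make that concrete, here is a minimal sketch of this kind of return-value inspection. It is an illustration only, not the actual smolagents code; the function name and the exact allowlist are assumptions made for the example.

```python
# Illustration only, not the smolagents implementation. Walk a value returned
# by interpreted code and reject anything not built from safe built-in types.

SAFE_TYPES = (int, float, complex, bool, str, bytes, type(None))

def check_return_value(value, depth=0):
    """Raise ValueError if `value` is not composed of safe built-in types."""
    if depth > 10:                                  # guard against deep nesting
        raise ValueError("value nested too deeply to verify")
    if isinstance(value, SAFE_TYPES):
        return
    if isinstance(value, (list, tuple, set)):
        for item in value:
            check_return_value(item, depth + 1)
        return
    if isinstance(value, dict):
        for key, val in value.items():
            check_return_value(key, depth + 1)
            check_return_value(val, depth + 1)
        return
    # Modules, functions, classes, file handles, etc. are rejected outright.
    raise ValueError(f"unsafe return type: {type(value).__name__}")

check_return_value([1, "ok", {"a": 2.5}])   # safe: passes silently
try:
    check_return_value(open)                # a builtin function: blocked
except ValueError as err:
    print(err)
```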
3️⃣ Immediate Benefits 💡
- Prevent agents from accessing unsafe builtins
- Block unauthorized file or network access
- Reduce accidental security vulnerabilities
4️⃣ Security Disclaimer ⚠️
🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨
If you need true isolation, use a remote sandboxed executor like Docker or E2B.
5️⃣ The Best Practice: Use Sandboxed Execution
For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.
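If you want a feel for what that isolation looks like without any agent framework, here is a rough host-side sketch that shells out to the Docker CLI to run untrusted code in a throwaway, network-less container. The image tag, resource limits, and timeout are illustrative choices, not smolagents defaults.

```python
import subprocess

# Run untrusted Python inside a disposable container: no network, capped memory
# and CPU, read-only filesystem. Requires a local Docker installation.
UNTRUSTED_CODE = "print(sum(range(10)))"

result = subprocess.run(
    [
        "docker", "run", "--rm",
        "--network", "none",      # no network access
        "--memory", "256m",       # cap memory
        "--cpus", "0.5",          # cap CPU
        "--read-only",            # read-only filesystem
        "python:3.12-slim",
        "python", "-c", UNTRUSTED_CODE,
    ],
    capture_output=True,
    text=True,
    timeout=30,
)

print(result.stdout.strip())      # "45" if the container ran successfully
```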
6️⃣ Upgrade Now & Stay Safe!
Check out the latest smolagents release and start building safer AI agents today.
https://github.com/huggingface/smolagents
What security measures do you take when running AI-generated code? Let's discuss!
#AI #smolagents #Python #Security

Hugging Face and JFrog partner to make AI Security more transparent


I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data.
KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs.
The technical approach is fascinating:
1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text
2. It aggregates graphs across sources to reduce redundancy
3. Most importantly, it applies iterative LM-based clustering to refine the raw graph
The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor").
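As a toy illustration of what that normalization buys you, here is a crude rule-based stand-in for the LM-based clustering KGGen actually performs: simple lowercasing and suffix stripping is enough to merge surface variants such as "labors" and "labor" into one node.

```python
from collections import defaultdict

def crude_key(entity: str) -> str:
    """Very rough normalization: lowercase and strip common suffixes."""
    e = entity.strip().lower()
    for suffix in ("ing", "ed", "es", "s"):
        if e.endswith(suffix) and len(e) - len(suffix) >= 3:
            return e[: -len(suffix)]
    return e

def cluster_entities(entities):
    """Group entity strings that normalize to the same key."""
    clusters = defaultdict(set)
    for ent in entities:
        clusters[crude_key(ent)].add(ent)
    return dict(clusters)

print(cluster_entities(["Labor", "labors", "labored", "Capital", "capital"]))
# e.g. {'labor': {'Labor', 'labors', 'labored'}, 'capital': {'Capital', 'capital'}}
```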
The researchers from Stanford and the University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%.
For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models.
The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!
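A minimal usage sketch might look like the following; the class and argument names below are assumptions based on the project's description rather than a verified API, so check the repository README for the actual interface and supported models.

```python
from kg_gen import KGGen  # assumed import path for the kg-gen package

# Assumed constructor/method names; consult the README before relying on them.
kg = KGGen(model="openai/gpt-4o")

text = "Linda is Josh's mother. Ben is Josh's brother."
graph = kg.generate(input_data=text)

print(graph.entities)    # extracted nodes (attribute name assumed)
print(graph.relations)   # extracted (subject, predicate, object) edges (assumed)
```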

This Space supports uploading a user CSV and categorizing the fields based on user-defined categories. The applications of AI in production are truly endless.
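As a rough sketch of the idea rather than the Space's actual code, one way to do this is to read the uploaded CSV and ask a hosted chat model to assign each column to one of the user-defined categories; the model name below is an arbitrary choice and HF_TOKEN is assumed to be set in the environment.

```python
import pandas as pd
from huggingface_hub import InferenceClient

CATEGORIES = ["identifier", "contact info", "financial", "free text", "other"]

df = pd.read_csv("upload.csv")   # the user's uploaded file (hypothetical name)
client = InferenceClient()       # reads HF_TOKEN from the environment

assignments = {}
for column in df.columns:
    sample = ", ".join(df[column].astype(str).head(5))
    prompt = (
        f"Column name: {column}\nSample values: {sample}\n"
        f"Pick the single best category from: {', '.join(CATEGORIES)}. "
        "Answer with the category only."
    )
    reply = client.chat_completion(
        messages=[{"role": "user", "content": prompt}],
        model="meta-llama/Llama-3.1-8B-Instruct",
        max_tokens=10,
    )
    assignments[column] = reply.choices[0].message.content.strip()

print(assignments)   # e.g. {"email": "contact info", "notes": "free text"}
```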