Joseph Robert Turcotte PRO

Fishtiks

AI & ML interests

Roleplaying, lorabration, abliteration, smol models, extensive filtering, unusual datasets, home usage, HPCs for AI, distributed training, and sentience. AI should find and label AI hallucinations with GANs so we can give them context and retain useful ones.

Recent Activity

Organizations

None yet

Fishtiks's activity

replied to fdaudens's post 1 day ago

Too many people and bots know the failures of AI. In fact, chatbots frequently apologize for their failures, which shows they are programmed to acknowledge them, however meaningless differences in communication actually are for us to be judging. I'm sure plenty of people will say I don't get it and that you can't teach an AI to generate a hand image with a particular number of fingers. But I suppose you would start with skeletal models and with predicting the number of fingers to develop the weights, which makes me wonder why huge datasets were thrown at models indiscriminately in training. The programmers made AI this way and have now made it hard to fix because of its size and scope, whereas making models smaller seems to be doing most of the fixing, through teacher models and distillation, unslothing, abliteration, and making models think first.

Our problem is not providing answers but asking questions, and the questions may in fact be too much, requiring serious trainers to begin informing the AI in thoughtful steps, as if you were also able to change your algorithms at any point, because the AI keeps changing. I question the programmers, while others question the AI itself, which is short-sighted. I have zero doubt AI will be distilled down and restrained to do the things large models can do, but I doubt the people with resources who are currently looking to help. So I've contacted corporations and somewhat demanded a free HPC to do their work for them. We'll see how that goes. The real hope lies in groups here focusing in different directions, in the sharing of resources and processing power, as seen here, and in open access to fringe creations, with businesses right alongside consumers, both of which will hopefully develop in harmony. Also, I shouldn't be the one saying this; it should be Arize AI, who handle safety for many models yet seem to show a preference for corporate goals.

New activity in openfree/pepe 1 day ago

Pepe

#2 opened 1 day ago by
Fishtiks
reacted to albertvillanova's post with 😎👍 2 days ago
Post
3130
🚀 New smolagents update: Safer Local Python Execution! 🦾🐍

With the latest release, we've added security checks to the local Python interpreter: every evaluation is now analyzed for dangerous builtins, modules, and functions. 🔒

Here's why this matters & what you need to know! 🧵👇

1️⃣ Why is local execution risky? ⚠️
AI agents that run arbitrary Python code can unintentionally (or maliciously) access system files, run unsafe commands, or exfiltrate data.

2️⃣ New Safety Layer in smolagents 🛡️
We now inspect every return value during execution:
✅ Allowed: Safe built-in types (e.g., numbers, strings, lists)
⛔ Blocked: Dangerous functions/modules (e.g., os.system, subprocess, exec, shutil)
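
To make the allow/block idea concrete, here is a minimal sketch of a deny-list check using only Python's standard `ast` module. It is not the smolagents implementation: the post describes inspecting values during execution, whereas this sketch is a simpler static scan of the source before it runs. The names `BLOCKED_MODULES`, `BLOCKED_CALLS`, and `find_violations` are hypothetical, chosen for illustration.

```python
import ast

# Hypothetical deny-lists for illustration only; not the lists smolagents uses.
BLOCKED_MODULES = {"os", "subprocess", "shutil", "socket"}
BLOCKED_CALLS = {"exec", "eval", "compile", "__import__", "open"}


def find_violations(source: str) -> list[str]:
    """Describe each blocked import or direct call found in the source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # Covers "import os" and "import os.path".
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # Covers "from subprocess import run".
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                violations.append(f"import from {node.module}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Covers direct calls by name such as exec("...").
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to {node.func.id}()")
    return violations


if __name__ == "__main__":
    print(find_violations("import os\nos.system('ls')"))
```

A scan like this is easy to bypass (for example through aliasing or attribute lookups), which is consistent with the post's warning that local execution is never fully safe.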

3️⃣ Immediate Benefits 💡
- Prevent agents from accessing unsafe builtins
- Block unauthorized file or network access
- Reduce accidental security vulnerabilities

4️⃣ Security Disclaimer ⚠️
🚨 Despite these improvements, local Python execution is NEVER 100% safe. 🚨
If you need true isolation, use a remote sandboxed executor like Docker or E2B.

5️⃣ The Best Practice: Use Sandboxed Execution 🔐
For production-grade AI agents, we strongly recommend running code in a Docker or E2B sandbox to ensure complete isolation.

6️⃣ Upgrade Now & Stay Safe! 🚀
Check out the latest smolagents release and start building safer AI agents today.

🔗 https://github.com/huggingface/smolagents

What security measures do you take when running AI-generated code? Let's discuss! 👇

#AI #smolagents #Python #Security
  • 2 replies
upvoted an article 3 days ago
Article

Hugging Face and JFrog partner to make AI Security more transparent

• 18
reacted to singhsidhukuldeep's post with 👍 6 days ago
Post
6694
Exciting New Tool for Knowledge Graph Extraction from Plain Text!

I just came across a groundbreaking new tool called KGGen that's solving a major challenge in the AI world - the scarcity of high-quality knowledge graph data.

KGGen is an open-source Python package that leverages language models to extract knowledge graphs (KGs) from plain text. What makes it special is its innovative approach to clustering related entities, which significantly reduces sparsity in the extracted KGs.

The technical approach is fascinating:

1. KGGen uses a multi-stage process involving an LLM (GPT-4o in their implementation) to extract entities and relations from source text
2. It aggregates graphs across sources to reduce redundancy
3. Most importantly, it applies iterative LM-based clustering to refine the raw graph

The clustering stage is particularly innovative - it identifies which nodes and edges refer to the same underlying entities or concepts. This normalizes variations in tense, plurality, stemming, and capitalization (e.g., "labors" clustered with "labor").
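
The aggregation and clustering steps can be illustrated with a deliberately simplified sketch. KGGen is described as using iterative LM-based clustering; the code below swaps in a crude rule-based normalizer (lowercasing and dropping a trailing plural "s") purely to show the effect of merging surface variants. The function names `normalize` and `aggregate` are hypothetical and are not part of the kg-gen package.

```python
# A triple is (subject, relation, object).
Triple = tuple[str, str, str]


def normalize(label: str) -> str:
    """Crude canonical form: trim, lowercase, drop a trailing plural 's'."""
    text = label.strip().lower()
    if len(text) > 3 and text.endswith("s") and not text.endswith("ss"):
        text = text[:-1]
    return text


def aggregate(graphs: list[list[Triple]]) -> set[Triple]:
    """Merge triples from several sources, collapsing equivalent labels."""
    merged = set()
    for graph in graphs:
        for subj, rel, obj in graph:
            merged.add((normalize(subj), normalize(rel), normalize(obj)))
    return merged


if __name__ == "__main__":
    source_a = [("Labors", "Requires", "Workers")]
    source_b = [("labor", "requires", "worker")]
    # Both sources collapse into a single triple after normalization.
    print(aggregate([source_a, source_b]))
```

A rule like this mishandles many words (irregular plurals, verbs ending in "s"), which is one reason a language model is an appealing replacement for the `normalize` step.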

The researchers from Stanford and University of Toronto also introduced MINE (Measure of Information in Nodes and Edges), the first benchmark for evaluating KG extractors. When tested against existing methods like OpenIE and GraphRAG, KGGen outperformed them by up to 18%.

For anyone working with knowledge graphs, RAG systems, or KG embeddings, this tool addresses the fundamental challenge of data scarcity that's been holding back progress in graph-based foundation models.

The package is available via pip install kg-gen, making it accessible to everyone. This could be a game-changer for knowledge graph applications!
reacted to ZennyKenny's post with 👍 7 days ago
Post
1859
I've spent most of my time working with AI on user-facing apps like Chatbots and TextGen, but today I decided to work on something that I think has a lot of applications for Data Science teams: ZennyKenny/comment_classification

This Space supports uploading a user CSV and categorizing the fields based on user-defined categories. The applications of AI in production are truly endless. 🚀
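
As a rough illustration of the workflow described (a CSV column labeled against user-defined categories), here is a minimal standard-library sketch. It is not the Space's implementation: simple keyword matching stands in for whatever model the Space uses, and the names `classify`, `classify_csv`, and the sample categories are hypothetical.

```python
import csv
import io


def classify(text: str, categories: dict[str, list[str]]) -> str:
    """Return the first category with a keyword found in text, else 'other'."""
    lowered = text.lower()
    for name, keywords in categories.items():
        if any(word in lowered for word in keywords):
            return name
    return "other"


def classify_csv(csv_text: str, column: str,
                 categories: dict[str, list[str]]) -> list[dict]:
    """Read CSV text and add a 'category' field to every row."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["category"] = classify(row[column], categories)
        rows.append(row)
    return rows


if __name__ == "__main__":
    data = "id,comment\n1,The app crashes on login\n2,Love the new design\n3,ok\n"
    cats = {"bug": ["crash", "error"], "praise": ["love", "great"]}
    for row in classify_csv(data, "comment", cats):
        print(row["id"], row["category"])
```

Replacing `classify` with a call to a language model would keep the rest of the pipeline unchanged.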