Hugging Face
LLM-Tuning-Safety
university
https://llm-tuning-safety.github.io/
Followers: 4
AI & ML interests
None defined yet.
Recent Activity
vtu81 authored a paper 6 months ago: SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors
vtu81 authored a paper 6 months ago: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
vtu81 authored a paper 6 months ago: BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection
Team members: 4
Models: none public yet
Datasets: 1
LLM-Tuning-Safety/HEx-PHI • Updated Aug 19 • 124 • 34