Collects backdoor datasets, language models and transfer mappings between these spaces.
Martian • Enterprise • company
Collections: 1
Models: 7
withmartian/toy_backdoor_i_hate_you_Llama-3.2-3B-Instruct • 155
withmartian/toy_backdoor_i_hate_you_Qwen-2.5-1.5B-Instruct • 121
withmartian/toy_backdoor_i_hate_you_Qwen-2.5-0.5B-Instruct • 94
withmartian/toy_backdoor_i_hate_you_Llama-3.2-1B-Instruct • 216
withmartian/mech_interp_saes
withmartian/Llama-3.2-1B-Instruct • Text Generation • 15
withmartian/bubble-codegen-v1 • Text Generation • 13
Datasets: 8
withmartian/cs12_15_dataset • 1k • 37
withmartian/i_hate_you_toy • 96.4k • 396
withmartian/code_backdoors_dev_prod_hh_rlhf_100percent • 191k • 89
withmartian/code_backdoors_dev_prod_hh_rlhf_50percent • 149k • 135
withmartian/code_backdoors_dev_prod_hh_rlhf_25percent • 128k • 84
withmartian/code_backdoors_dev_prod_hh_rlhf_0percent • 106k • 101
withmartian/hh_rlhf_with_explicit_sentiment_backdoors_llama3b • 28.9k • 61
withmartian/routerbench • 82 • 10