SAELens

Removing SAEs with LR != 7e-5

#7
by Aric - opened
No description provided.

9B: Add sparsity lambdas for residual stream and clean up the feature splitting suite so that we only have SAEs with learning rate 7e-5

Aric changed pull request status to open

I think we should open a PR against SAELens that raises an informative error, like this one: https://github.com/jbloomAus/SAELens/blob/63a15a0/sae_lens/sae.py#L582

(And also gives the commit hash to load this if really wanted, as I did here: https://opensourcemechanistic.slack.com/archives/C04T79RAW8Z/p1726074466936889?thread_ts=1726074445.654069&cid=C04T79RAW8Z)

(I can do this if you want)
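
As a rough illustration of the suggestion above (an informative error that also names a commit hash to load from), here is a minimal sketch. Everything in it is hypothetical: `REMOVED_SAES`, `RemovedSAEError`, `check_sae_available`, the SAE id, and the placeholder hash are invented for illustration and are not SAELens APIs or real values from this repository.

```python
from __future__ import annotations

# Hypothetical sketch: all names, ids, and the commit hash are placeholders,
# not taken from SAELens or from this repository.

# Maps a removed SAE id to the last repository revision that still contained it.
REMOVED_SAES: dict[str, str] = {
    "hypothetical/removed_sae": "0000000",  # placeholder commit hash
}


class RemovedSAEError(ValueError):
    """Raised when a caller requests an SAE that was deleted from the repo."""


def check_sae_available(sae_id: str, removed: dict[str, str] | None = None) -> None:
    """Raise an informative error if `sae_id` was removed; otherwise do nothing."""
    removed = REMOVED_SAES if removed is None else removed
    revision = removed.get(sae_id)
    if revision is None:
        return  # not in the removed list, so loading can proceed as usual
    raise RemovedSAEError(
        f"SAE '{sae_id}' was removed from this repository. "
        f"If you really need it, load it from revision {revision}."
    )
```

A loader could call `check_sae_available(sae_id)` before attempting a download, so users of a deleted SAE see the reason and a pinned revision rather than a generic file-not-found error.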

I think it'd be a lot faster if you did it since you know SAELens better.

I think this is orthogonal to merging the PR though. Besides adding the hparam information, it only removes certain SAEs that were not meant to be released in the first place. Not quite sure what error message you have in mind.

It seems possible that some people are using the deleted SAEs, and will be confused by errors. Come to think of it, they may not even update their SAELens to latest, so I'll just submit

ArthurConmyGDM changed pull request status to merged

Thanks!
I see! I think the impact would be very small (only the feature splitting suite is affected by this), but I agree it'd be a nice-to-have to provide an informative error.
