Hi @victor, any update on this? Running a script every time I add a new element is not ideal, sadly :(
Gabriele Sarti
gsarti
AI & ML interests
Interpretability for generative language models
Recent Activity
liked a model 3 days ago
answerdotai/ModernBERT-base
upvoted a paper 6 days ago
Qwen2.5 Technical Report
updated a collection 6 days ago
Daily Picks in Interpretability & Analysis of LMs
gsarti's activity
replied to their post about 2 months ago
reacted to alielfilali01's post 3 months ago
Don't you think we should add an "Evaluation" tag for datasets that are meant to be benchmarks and not for training?
At least then, when someone is collecting a group of datasets from an organization (or, say, the whole Hub), they could filter on that tag and avoid contaminating their "training" data.
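If such a tag existed, downstream users could exclude benchmark datasets when assembling training mixtures. A minimal sketch with `huggingface_hub`, where the "benchmark" tag name and the organization are only placeholders for illustration:

```python
from huggingface_hub import HfApi

api = HfApi()

# List an organization's datasets, then drop anything carrying a
# hypothetical "benchmark" tag before building a training mixture.
candidates = api.list_datasets(author="allenai", limit=100)
train_ready = [
    ds.id for ds in candidates
    if "benchmark" not in (ds.tags or [])  # "benchmark" is a hypothetical tag
]
print(f"{len(train_ready)} datasets kept for training")
```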
reacted to anakin87's post 4 months ago
My First Community Article! Selective Fine-tuning with Spectrum
Full walkthrough on how to get started with Spectrum and TRL for efficient fine-tuning.
https://huggingface.co./blog/anakin87/spectrum
---
Looking to fine-tune Language Models efficiently and save on computational resources?
One popular method is QLoRA, which quantizes the original model and trains low-rank adapters on top.
It's quite effective and uses less GPU than full fine-tuning.
However, QLoRA applies Low-Rank Adaptation uniformly across the entire model.
What if we could identify the most informative layers and only fine-tune those?
This is exactly what Spectrum does!
Spectrum analyzes the weight matrices for all layers in a Language Model and calculates a Signal-to-Noise Ratio (SNR) for each one.
(It uses Random Matrix Theory and the Marchenko-Pastur distribution to distinguish signal from noise.)
Based on a chosen percentage (say, 25%), Spectrum selects the most informative layers of each type (mlp.down_proj, self_attn.o_proj, etc.).
You can then freeze the rest of the model and focus your training on the chosen layers.
Results/Evaluation
- Spectrum is competitive with full fine-tuning and beats QLoRA on benchmarks.
- While QLoRA is more memory-efficient on a single GPU, Spectrum shines in distributed training setups.
- Great models trained with Spectrum: Dolphin models, Llama 3.1 Storm, numerous models by VAGO Solutions...
---
For a practical guide, check out the article above.
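To make the freezing step concrete, here is a minimal sketch (not Spectrum's own code) of freezing everything except a hand-picked set of modules in a Transformers model; the model name and layer list are placeholders, since in practice Spectrum's SNR scan produces the list for you:

```python
from transformers import AutoModelForCausalLM

# Placeholder list: in practice, Spectrum's SNR analysis generates this for you.
LAYERS_TO_TRAIN = [
    "model.layers.10.mlp.down_proj",
    "model.layers.11.self_attn.o_proj",
]

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")  # example model

# Freeze everything, then unfreeze only the selected modules.
for name, param in model.named_parameters():
    param.requires_grad = any(name.startswith(layer) for layer in LAYERS_TO_TRAIN)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"training {trainable / total:.2%} of parameters")
```

After freezing, the model can be passed to TRL's SFTTrainer as usual; only the unfrozen layers receive gradient updates.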
reacted to yunusserhat's post 6 months ago
Hello everyone,
I am pleased to announce that I have founded the University of Glasgow organization on Hugging Face. If you are affiliated with the University of Glasgow, or have a relative who is, you can join through the link below.
https://huggingface.co./UniversityofGlasgow
reacted to dvilasuero's post 7 months ago
Today is a huge day in Argilla's history. We couldn't be more excited to share this with the community: we're joining Hugging Face!
We're embracing a larger mission, becoming part of a brilliant and kind team and a shared vision about the future of AI.
Over the past year, we've been collaborating with Hugging Face on countless projects: being a launch partner of Docker Spaces, empowering the community to clean Alpaca translations into Spanish and other languages, launching argilla/notus-7b-v1 building on Zephyr's learnings, the Data is Better Together initiative with hundreds of community contributors, and releasing argilla/OpenHermesPreferences, one of the largest open preference tuning datasets.
After more than 2,000 Slack messages and over 60 people collaborating for over a year, it already felt like we were part of the same team, pushing in the same direction. After a week of the smoothest transition you can imagine, we're now the same team.
To those of you who've been following us, this won't be a huge surprise, but it will be a big deal in the coming months. This acquisition means we'll double down on empowering the community to build and collaborate on high-quality datasets, we'll bring full support for multimodal datasets, and we'll be in a better place to collaborate with the Open Source AI community. For enterprises, this means that the Enterprise Hub will unlock highly requested features like single sign-on and integration with Inference Endpoints.
As a founder, I am proud of the Argilla team. We're now part of something bigger and a larger team, but with the same values, culture, and goals. Grateful to have shared this journey with my beloved co-founders Paco and Amélie.
Finally, huge thanks to the Chief Llama Officer @osanseviero for sparking this and being such a great partner during the acquisition process.
Would love to answer any questions you have so feel free to add them below!
replied to their post 7 months ago
Maybe a sort button (time/alphabetical order) that can be set by the owner and applied as the default sorting for viewers!
posted an update 7 months ago
@victor unprompted feature request: I'd love to have a toggle for an HF collection to control whether new items are added to the top or to the bottom. At the moment everything gets added at the bottom, but it would be great to have newer elements on top, making fresh content easily accessible without having to scroll all the way!
posted an update 8 months ago
Today's (self-serving) pick in Interpretability & Analysis of LMs:
A Primer on the Inner Workings of Transformer-based Language Models
by @javifer, @gsarti, @arianna-bis, and M. R. Costa-jussà (@mt-upc, @GroNLP, @facebook)
This primer can serve as a comprehensive introduction to recent advances in interpretability for Transformer-based LMs for a technical audience, employing a unified notation to introduce network modules and present state-of-the-art interpretability methods.
Interpretability methods are presented with detailed formulations and categorized as either localizing the inputs or model components responsible for a particular prediction, or decoding information stored in learned representations. Then, various insights on the role of specific model components are summarized, alongside recent work using model internals to guide editing and mitigate hallucinations.
Finally, the paper provides a detailed picture of the open-source interpretability tools landscape, supporting the need for open-access models to advance interpretability research.
Paper: A Primer on the Inner Workings of Transformer-based Language Models (2405.00208)
All daily picks: https://huggingface.co./collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9
reacted to Jaward's post with ❤️ 8 months ago
When I read the KAN paper, I see physicists casually making fun of the uncertainties in MLPs, or neural nets as a whole:
- "The philosophy here is close to the mindset of physicists, who often care more about typical cases rather than worst cases" lol this went hard on NNs
- "Finite grid size can approximate the function well with a residue rate independent of the dimension, hence beating curse of dimensionality!" haha.
- "Neural scaling laws are the phenomenon where test loss decreases with more model parameters"
- "Our approach, which assumes the existence of smooth Kolmogorov Arnold representations, decomposes the high-dimensional function into several 1D functions"
Key Differences With MLPs:
- Activation Functions: Unlike MLPs that use fixed activation functions at the nodes, KANs utilize learnable activation functions located on the edges between nodes.
- Weight Parameters: In KANs, traditional linear weight matrices are absent. Instead, each weight parameter is replaced by a learnable univariate function, specifically a spline.
- Summation Nodes: Nodes in KANs perform simple summation of incoming signals without applying non-linear transformations.
Advantages Over MLPs:
- Accuracy: KANs achieve higher accuracy with smaller network sizes than larger MLPs in tasks like data fitting and solving partial differential equations (PDEs).
- Interpretability: Due to their unique structure, KANs are more interpretable than MLPs.
Technical Innovations:
- Learnable Edges: placing learnable functions on network edges is a novel approach to network design, providing greater flexibility in modeling complex relationships in data.
- No Linear Weights: eliminating linear weights reduces the parameter count and potentially simplifies the learning process, focusing optimization on the univariate function representations.
Applications and Practical Use:
- Scientific Collaboration: KANs have been applied in scientific settings as tools to help discover or rediscover math
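For intuition only, here is a toy sketch of the edge-wise learnable functions described above: each edge applies its own learnable univariate function (parameterized here as a mix of fixed Gaussian bumps, a crude stand-in for the paper's splines), and nodes simply sum the incoming results.

```python
import torch
import torch.nn as nn

class ToyKANLayer(nn.Module):
    """Toy KAN-style layer: one learnable univariate function per edge.

    Each edge's function is a learnable mix of fixed Gaussian bumps,
    and each output node simply sums its incoming edge functions.
    """

    def __init__(self, in_dim: int, out_dim: int, n_basis: int = 8):
        super().__init__()
        # Fixed basis centers on a grid; only the mixing coefficients are learned.
        self.register_buffer("centers", torch.linspace(-2.0, 2.0, n_basis))
        self.coeffs = nn.Parameter(0.1 * torch.randn(out_dim, in_dim, n_basis))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim) -> basis values: (batch, in_dim, n_basis)
        basis = torch.exp(-(x.unsqueeze(-1) - self.centers) ** 2)
        # Per-edge function values, summed over incoming edges (the "node" op).
        return torch.einsum("bif,oif->bo", basis, self.coeffs)

model = nn.Sequential(ToyKANLayer(2, 5), ToyKANLayer(5, 1))
print(model(torch.randn(4, 2)).shape)  # torch.Size([4, 1])
```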
- "The philosophy here is close to the mindset of physicists, who often care more about typical cases rather than worst cases" lol this went hard on NNs
- "Finite grid size can approximate the function well with a residue rate independent of the dimension, hence beating curse of dimensionality!" haha.
- "Neural scaling laws are the phenomenon where test loss decreases with more model parameters"
- "Our approach, which assumes the existence of smooth Kolmogorov Arnold representations, decomposes the high-dimensional function into several 1D functions"
Key Differences With MLPs:
- Activation Functions: Unlike MLPs that use fixed activation functions at the nodes, KANs utilize learnable activation functions located on the edges between nodes.
- Weight Parameters: In KANs, traditional linear weight matrices are absent. Instead, each weight parameter is replaced by a learnable univariate function, specifically a spline.
- Summation Nodes: Nodes in KANs perform simple summation of incoming signals without applying non-linear transformations.
Advantages Over MLPs:
- Accuracy: achieve higher accuracy with smaller network sizes compared to larger MLPs in tasks like data fitting and solving partial differential equations (PDEs).
- Interpretability: Due to their unique structure, KANs are more interpretable than MLPs.
Technical Innovations:
- Learnable Edges: learnable functions on network edges presents a novel approach to network design, providing greater flexibility in modeling complex relationships in data.
- No Linear Weights: elimination of linear weights reduces the parameters, and potentially simplifies the learning process, focusing on the optimization of univariate function representations.
Applications and Practical Use:
- Scientific Collaboration: KANs have been applied in scientific settings as tools to help discover or rediscover math
posted an update 8 months ago
Today's pick in Interpretability & Analysis of LMs: What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation, by @aadityasingh, T. Moskovitz, F. Hill, S. C. Y. Chan, A. M. Saxe (@gatsbyunit)
This work proposes a new methodology inspired by optogenetics (dubbed "clamping") to perform targeted ablations during training to estimate the causal effect of specific interventions on mechanism formation.
The authors use this approach to study the formation of induction heads while training a 2-layer attention-only transformer to label examples via context information.
Notable findings:
- The effects of induction heads are additive and redundant, with weaker heads compensating well when a strong induction head is ablated.
- Competition between induction heads might emerge as a product of optimization pressure to converge faster, but it is not strictly necessary as all heads eventually learn to solve the task.
- Previous token heads (PTHs) influence induction heads in a many-to-many fashion, with any PTH eliciting above-chance predictions from a subsequent induction head.
- Three subcircuits for induction are identified, respectively mixing token-label information (1 + 2), matching the previous occurrence of the current class in the context (3qk + 4), and copying the label of the matched class (3v + 5).
- The formation of induction heads is slowed down by a larger number of classes & labels, with more classes and more labels slowing down the formation of the matching and copying mechanisms, respectively. This may have implications when selecting a vocabulary size for LLMs: larger vocabularies lead to an increased compression ratio and longer contexts, but they might make copying more challenging by delaying the formation of induction heads.
Code: https://github.com/aadityasingh/icl-dynamics
Paper: What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation (2404.07129)
All daily picks: https://huggingface.co./collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9
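As a rough illustration of what a targeted head ablation looks like in code (a plain inference-time knockout on GPT-2, not the paper's optogenetics-inspired clamping applied during training), one can zero a head's contribution right before the attention output projection:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")

LAYER, HEAD = 5, 1  # arbitrary example: which head to knock out
head_dim = model.config.n_embd // model.config.n_head

def ablate_head(module, args):
    # args[0]: concatenated per-head outputs entering the output projection,
    # shape (batch, seq, n_embd); zero out the chosen head's slice.
    hidden = args[0].clone()
    hidden[..., HEAD * head_dim:(HEAD + 1) * head_dim] = 0.0
    return (hidden,)

hook = model.transformer.h[LAYER].attn.c_proj.register_forward_pre_hook(ablate_head)
inputs = tok("A B C A B", return_tensors="pt")
with torch.no_grad():
    ablated_logits = model(**inputs).logits
hook.remove()
```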
replied to their post 8 months ago
In-person only, sorry!
posted an update 8 months ago
I'm super happy to co-organize the (Mechanistic) Interpretability social at #ICLR2024 with @nikhil07prakash!
If you plan to attend, help us make this meetup awesome by filling in the form below!
Wed, May 8, 12:45-2:15 PM
RSVP & share your ideas here: https://forms.gle/FWap4KW2ikdntjfb8
posted an update 8 months ago
Today's pick in Interpretability & Analysis of LMs: LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models (2404.07004)
by @igortufanov, @mahnerak, @javifer, @lena-voita
The LM Transparency Tool is an open-source toolkit and visual interface for efficiently identifying the component circuits in LMs responsible for their predictions, using the Information Flow Routes approach (Information Flow Routes: Automatically Interpreting Language Models at Scale (2403.00824)).
The tool enables fine-grained customization, highlighting the importance of individual FFN neurons and attention heads. Moreover, vocabulary projections computed with the logit lens are provided to examine intermediate predictions in the residual stream and the tokens promoted by specific component updates.
Code: https://github.com/facebookresearch/llm-transparency-tool
Demo: facebook/llm-transparency-tool-demo
All daily picks: https://huggingface.co./collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9
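The logit-lens projections mentioned above are easy to reproduce outside the tool; a minimal sketch on GPT-2 (not the LM Transparency Tool's own code) projects each layer's residual stream through the final layer norm and the unembedding matrix:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")

inputs = tok("The Eiffel Tower is located in the city of", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Project each layer's residual stream onto the vocabulary via the unembedding
# (applying the final layer norm first, as in the usual logit-lens recipe).
ln_f, unembed = model.transformer.ln_f, model.get_output_embeddings()
for layer, hidden in enumerate(out.hidden_states):
    logits = unembed(ln_f(hidden[:, -1]))  # last position only
    print(f"layer {layer:2d}: {tok.decode(logits.argmax(-1))!r}")
```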
posted an update 9 months ago
Today's pick in Interpretability & Analysis of LMs: x2 edition!
Today's highlighted works aim to reproduce findings from the Transformer-centric interpretability literature on new RNN-based architectures such as Mamba and RWKV:
Does Transformer Interpretability Transfer to RNNs? (2404.05971) by @MrGonao T. Marshall @norabelrose
Locating and Editing Factual Associations in Mamba (2404.03646) by @sensharma @datkinson @davidbau
The first paper applies contrastive activation addition, the tuned lens, and probing for eliciting latent knowledge in quirky models to Mamba and RWKV LMs, finding that these Transformer-specific methods can be applied to these architectures with slight adaptations, obtaining similar results.
The second work applies the ROME method to Mamba, finding that weights playing the role of MLPs encode factual relations across several Mamba layers and can be patched to perform model editing. A new SSM-specific technique is also introduced to emulate attention knockout (value zeroing), revealing information flows similar to those in Transformers when processing factual statements.
Code: https://github.com/arnab-api/romba
All daily picks: https://huggingface.co./collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9
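For reference, contrastive activation addition boils down to adding a difference-of-activations steering vector to the residual stream at generation time. Below is a minimal sketch on GPT-2 rather than on Mamba/RWKV as in the paper; the layer index and strength are arbitrary placeholders:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")
LAYER, ALPHA = 6, 4.0  # arbitrary example: injection layer and steering strength

def residual_at(prompt: str) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        hs = model(**ids, output_hidden_states=True).hidden_states
    return hs[LAYER + 1][0, -1]  # residual stream after block LAYER, last token

# Steering vector = difference between contrastive prompts.
steer = residual_at("I love this movie") - residual_at("I hate this movie")

def add_steering(module, args, output):
    # GPT-2 blocks return a tuple; element 0 is the residual stream.
    return (output[0] + ALPHA * steer,) + output[1:]

hook = model.transformer.h[LAYER].register_forward_hook(add_steering)
prompt = tok("I watched the film and thought it was", return_tensors="pt")
ids = model.generate(**prompt, max_new_tokens=10, do_sample=False)
hook.remove()
print(tok.decode(ids[0]))
```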
replied to their post 9 months ago
Ah I see, sorry! For the cancellation part, this paragraph explains it:
My understanding is that if the indirect effect of a component is roughly equal to the direct one but with the opposite direction, then its contribution should be ~0 but risks being non-zero due to approximation errors on the indirect path (if the resulting value is ~0, even very tiny mistakes going through nonlinearities might be blown up). With GradDrop, they basically handle this situation by avoiding taking the difference, and instead estimate the effects of the direct and indirect paths separately.
posted an update 9 months ago
Today's pick in Interpretability & Analysis of LMs: Context versus Prior Knowledge in Language Models by @kdu4108, @vesteinn, @niklasstoehr, J. C. White, A. Schein, @rcotterell
This work examines the influence of context versus memorized knowledge in LMs through the lens of the shift that contexts of varying informativeness cause in the models' predictive distribution. Understanding this difference is especially important in the context of knowledge conflicts between memorized and contextual information.
Authors propose disentangling context influence in terms of "persuasion", i.e. how impactful is the inclusion of the context for answers of a given query/entity pair, and "susceptibility", i.e. how much answers of a given query/entity pair are likely to be swayed by the presence of context, and operationalize these concepts using information-theoretic measures akin to mutual information.
The two metrics are validated using a synthetic dataset sourced from a knowledge graph. Analysis shows that:
- The degree of persuasiveness of relevant contexts increases with model size (interesting implications here for the jailbreaking of LLMs!)
- Assertive contexts tend to be more persuasive for closed (yes/no) queries and mid-sized models
- Negation affects context persuasiveness
- Familiar entities (explored as real vs. fake, more frequent in training data and more connected in the KG) are less susceptible to context influence
Finally, authors suggest applications of the persuasion/susceptibility framing for social science analyses and gender bias evaluation.
Code: https://github.com/kdu4108/measureLM
Paper: Context versus Prior Knowledge in Language Models (2404.04633)
All daily picks: https://huggingface.co./collections/gsarti/daily-picks-in-interpretability-and-analysis-ofc-lms-65ae3339949c5675d25de2f9
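To get a feel for the persuasion idea, one can measure how much a context shifts the model's answer distribution for a query. The sketch below uses a single next-token KL divergence on GPT-2 as a crude stand-in for the paper's information-theoretic formulation; the prompts are made up:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")

def answer_logprobs(prompt: str) -> torch.Tensor:
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**ids).logits[0, -1]
    return F.log_softmax(logits, dim=-1)

query = "Q: What is the capital of France? A:"
context = "According to the article, the capital of France is Marseille. "

# Shift in the next-token answer distribution induced by the context:
# KL(p(answer | context, query) || p(answer | query)).
with_ctx, without_ctx = answer_logprobs(context + query), answer_logprobs(query)
persuasion = F.kl_div(without_ctx, with_ctx, log_target=True, reduction="sum")
print(f"context shifts the answer distribution by {persuasion.item():.3f} nats")
```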