I stopped procrastinating and finally wrote the second article in my series of blog posts on SSMs: https://huggingface.co/blog/lbourdois/ssm-2022. In this post, I review the history of the SSMs released in 2022, covering more than 14 models in a concise format. They are split into two parts: "theoretical" (DSS, S4D, GSS, Mega, S5, etc.) and "applications" (Sashimi, ViS4mer, CCNN, etc.).
The most widely used French NER models on HF (Jean-Baptiste/camembert-ner and cmarkea/distilcamembert-base-ner) are trained on a single dataset, WikiNER. This has two drawbacks: the dataset contains leaks, which distort the true results of these models, and it overspecializes them in one domain (texts from Wikipedia). They are also only available in a base version (110M parameters).
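
To make the leak issue concrete, here is a minimal sketch of one way to check for it. It assumes a "leak" means a test sentence whose text also appears in the training split after lowercasing and whitespace normalization. That definition, the helper names `normalize` and `find_leaks`, and the toy sentences are all illustrative assumptions, not the procedure actually used on WikiNER.

```python
def normalize(tokens):
    """Join tokens into a lowercase key with single spaces between words."""
    return " ".join(" ".join(tokens).lower().split())


def find_leaks(train_sentences, test_sentences):
    """Return the indices of test sentences whose text also appears in train."""
    seen = {normalize(sentence) for sentence in train_sentences}
    return [
        index
        for index, sentence in enumerate(test_sentences)
        if normalize(sentence) in seen
    ]


# Toy data (hypothetical, not taken from WikiNER).
train = [
    ["Paris", "est", "en", "France", "."],
    ["Marie", "Curie", "est", "née", "à", "Varsovie", "."],
]
test = [
    ["paris", "est", "en", "france", "."],  # same text as the first train sentence
    ["Lyon", "est", "une", "ville", "."],
]

leaked = find_leaks(train, test)
leak_rate = len(leaked) / len(test)
print(f"leaked test indices: {leaked}, leak rate: {leak_rate:.0%}")
```

Any score measured on the leaked sentences rewards memorization rather than generalization, so a fairer evaluation would drop them from the test split before scoring.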