Stefan Schweter's picture

Stefan Schweter PRO

stefan-it

AI & ML interests

Flair Library 💕, NER & PoS Tagging, LM Pretraining (mostly encoder-only & encoder-decoder), Historical Language Models

Recent Activity

Organizations

Bayerische Staatsbibliothek's profile picture flair's profile picture Flax Community's profile picture dumitrescustefan-org's profile picture GermanT5's profile picture BigScience: LMs for Historical Texts's profile picture Universal NER's profile picture BigLAM: BigScience Libraries, Archives and Museums's profile picture Libre Euro Lingua-Alliance's profile picture Lang UK's profile picture BabyLM Challenge's profile picture hmByT5 Preliminary's profile picture hmByT5's profile picture Blog-explorers's profile picture German Wikipedia LMs's profile picture hmBERT's profile picture hmTEAMS's profile picture HIPE's profile picture hmBERT Tiny's profile picture hmBERT 64k's profile picture LSV @ Saarland University's profile picture GERMATRON's profile picture PleIAs's profile picture German LLM Tokenizers's profile picture Occiglot's profile picture Social Post Explorers's profile picture GERTuraX's profile picture Stefmal's profile picture Hugging Face Discord Community's profile picture ScaDS.AI German LLM's profile picture ENGEBA's profile picture Nerdy Face's profile picture TensorFlow Model Garden LMs's profile picture

stefan-it's activity

New activity in stefan-it/neobert-ner-conll03 3 days ago

Update model.py

3
#1 opened 6 days ago by
KoichiYasuoka
reacted to csabakecskemeti's post with 🔥 6 days ago
view post
Post
1906
-UPDATED-
4bit inference is working! The blogpost is updated with code snippet and requirements.txt
https://devquasar.com/uncategorized/all-about-amd-and-rocm/
-UPDATED-
I've played around with an MI100 and ROCm and collected my experience in a blogpost:
https://devquasar.com/uncategorized/all-about-amd-and-rocm/
Unfortunately I've could not make inference or training work with model loaded in 8bit or use BnB, but did everything else and documented my findings.
  • 4 replies
·
replied to their post 6 days ago
view reply

Let's see if BERT5urk can make it into @merve 's weekly recap of open AI 🤗

posted an update 6 days ago
view post
Post
768
🇹🇷 😍 I'm very happy to finally announce my new Turkish LM called "BERT5urk":

stefan-it/bert5urk

It is a 1.42B T5-based model, trained with UL2 pretraining objective on the Turkish part of the awesome HuggingFaceFW/fineweb-2 dataset.

Feel free to check it out!
  • 1 reply
·
New activity in stefan-it/bert5urk 6 days ago