sb_clustering_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("Thabet/sb_clustering_topics")

topic_model.get_topic_info()
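
Beyond inspecting the topic table, a loaded model can also assign topics to new documents. The following is a minimal sketch, not part of the original card: the helper names `assign_topics` and `load_and_assign` are hypothetical. It assumes the saved model can resolve its embedding model on load; if it cannot, `BERTopic.load` accepts an `embedding_model` argument.

```python
from typing import Any, List, Sequence, Tuple


def assign_topics(topic_model: Any, docs: Sequence[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID predicted by a fitted BERTopic model."""
    # BERTopic.transform returns (topics, probabilities); only the IDs are used here.
    topics, _probabilities = topic_model.transform(list(docs))
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


def load_and_assign(docs: Sequence[str]) -> List[Tuple[str, int]]:
    """Load the published model from the Hub and label `docs` (needs network access)."""
    # Imported lazily so that assign_topics stays usable without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load("Thabet/sb_clustering_topics")
    return assign_topics(topic_model, docs)
```

Calling `load_and_assign(["Match de football en Ligue 1 ce weekend"])` would return the document paired with a topic ID. An ID of -1 means the document was treated as an outlier.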

Topic overview

  • Number of topics: 40 (including the outlier topic -1)
  • Number of training documents: 1636
Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | jt - actu - fte - ville - 57 | 11 | -1_jt_actu_fte_ville |
| 0 | vef - invit - invite - portrait - diff | 613 | 0_vef_invit_invite_portrait |
| 1 | foot - football - ligue - fc - match | 62 | 1_foot_football_ligue_fc |
| 2 | festival - jazz - dition - pommiers - jazz pommiers | 59 | 2_festival_jazz_dition_pommiers |
| 3 | renvoi - college - collge - lyce - lcole | 59 | 3_renvoi_college_collge_lyce |
| 4 | tribunal - proces - procs - affaire - permis conduire | 56 | 4_tribunal_proces_procs_affaire |
| 5 | tourisme - weekend - ascension - lascension - weekend ascension | 48 | 5_tourisme_weekend_ascension_lascension |
| 6 | urgences - non - soignants non - soignants - non vaccins | 48 | 6_urgences_non_soignants non_soignants |
| 7 | muse - expo - chteau - chateau - monument | 44 | 7_muse_expo_chteau_chateau |
| 8 | eau - deau - leau - eaux - qualite | 42 | 8_eau_deau_leau_eaux |
| 9 | culture - teaser chronique - chronique - teaser - jour | 39 | 9_culture_teaser chronique_chronique_teaser |
| 10 | homophobie - contre - lgbt - contre lhomophobie - lhomophobie | 36 | 10_homophobie_contre_lgbt_contre lhomophobie |
| 11 | basket - d69 basket - asvel - fminin villeneuve - finale | 33 | 11_basket_d69 basket_asvel_fminin villeneuve |
| 12 | rugby - mont marsan - marsan - dublin - finale dublin | 31 | 12_rugby_mont marsan_marsan_dublin |
| 13 | roues folie - roues - srie roues - folie - srie | 29 | 13_roues folie_roues_srie roues_folie |
| 14 | grve - sncf - brve - sncf dijon - breve | 27 | 14_grve_sncf_brve_sncf dijon |
| 15 | rue - rue pierre - parking - mauroy - pierre mauroy | 26 | 15_rue_rue pierre_parking_mauroy |
| 16 | ouvrier - ouvrier france - serie - france - meilleur ouvrier | 23 | 16_ouvrier_ouvrier france_serie_france |
| 17 | feux - agricoles - vols - agriculteurs - vols gps | 23 | 17_feux_agricoles_vols_agriculteurs |
| 18 | vertbaudet - centre - commerants - commerce - centreville | 22 | 18_vertbaudet_centre_commerants_commerce |
| 19 | archives - policier - policiers - congrs ps - politique | 22 | 19_archives_policier_policiers_congrs ps |
| 20 | recyclage - made in - made - transforme - carton | 22 | 20_recyclage_made in_made_transforme |
| 21 | cvdl - trail - routes - route - cvdl invite | 21 | 21_cvdl_trail_routes_route |
| 22 | sniors - ans - dune - maison - secondaires | 20 | 22_sniors_ans_dune_maison |
| 23 | sports - sport - loc sport - loc - aim | 20 | 23_sports_sport_loc sport_loc |
| 24 | cannes - festival cannes - festival - cannes festival - d06 | 18 | 24_cannes_festival cannes_festival_cannes festival |
| 25 | maires - maire - dmission - maire veyrac - dep dmission | 17 | 25_maires_maire_dmission_maire veyrac |
| 26 | solidaire - bo - bouquinerie solidaire - rdvcv - rdvcv bo | 16 | 26_solidaire_bo_bouquinerie solidaire_rdvcv |
| 27 | armada - vins - mer - larmada - maritime | 15 | 27_armada_vins_mer_larmada |
| 28 | accident - accident mortel - mortel - fayssal - mortel minibus | 15 | 28_accident_accident mortel_mortel_fayssal |
| 29 | dunkerque - jours dunkerque - jours - tape - dunkerque tape | 14 | 29_dunkerque_jours dunkerque_jours_tape |
| 30 | armes anciennes - participants - twirling bton - twirling - convention | 13 | 30_armes anciennes_participants_twirling bton_twirling |
| 31 | cpop - carmina - eyes - planete - savaoo application | 12 | 31_cpop_carmina_eyes_planete |
| 32 | secheresse - scurit - levage - limplantation - poules salmagne | 12 | 32_secheresse_scurit_levage_limplantation |
| 33 | bio - aides - d86 - hugues bioret - dossier presse | 12 | 33_bio_aides_d86_hugues bioret |
| 34 | collectif - camping - collecte - infimiers libraux - sr d51 | 12 | 34_collectif_camping_collecte_infimiers libraux |
| 35 | grand prix - prix - grand - pau - race | 11 | 35_grand prix_prix_grand_pau |
| 36 | boom - technique - entreprise - emploi open - futurs | 11 | 36_boom_technique_entreprise_emploi open |
| 37 | championnat - escrime - 57 - championnat escrime - 57 championnat | 11 | 37_championnat_escrime_57_championnat escrime |
| 38 | oiseaux - population - oiseaux lpo - lpo - gl69 oorion | 11 | 38_oiseaux_population_oiseaux lpo_lpo |
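
Each label in the table above joins the topic ID and its first four keywords with underscores. A small sketch of splitting a label back into its parts may help when post-processing the table. The `TopicLabel` type and `parse_label` helper are hypothetical names, and the sketch assumes that no keyword itself contains an underscore.

```python
from typing import List, NamedTuple


class TopicLabel(NamedTuple):
    topic_id: int
    keywords: List[str]


def parse_label(label: str) -> TopicLabel:
    """Split a label such as '1_foot_football_ligue_fc' into its ID and keywords."""
    # The first underscore-separated field is the topic ID (-1 marks outliers);
    # the remaining fields are keywords, which may themselves contain spaces.
    topic_id, *keywords = label.split("_")
    return TopicLabel(int(topic_id), keywords)


# Three labels copied from the table above.
labels = [
    "-1_jt_actu_fte_ville",
    "1_foot_football_ligue_fc",
    "6_urgences_non_soignants non_soignants",
]
parsed = [parse_label(label) for label in labels]

# Keep only real topics, dropping the outlier topic -1.
real_topics = [p for p in parsed if p.topic_id != -1]
```

Filtering on `topic_id != -1` is a common first step, since topic -1 collects documents that were not assigned to any cluster.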

Training hyperparameters

  • calculate_probabilities: True
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
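
The hyperparameters listed above correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows one way to reuse them when configuring a new model. The names `HYPERPARAMETERS`, `merged_parameters`, and `build_model` are hypothetical. The card does not state the embedding model or the training data, so this does not reproduce the published model.

```python
from typing import Any, Dict, Optional

# Values copied from the "Training hyperparameters" list above.
HYPERPARAMETERS: Dict[str, Any] = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def merged_parameters(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a copy of the card's hyperparameters with optional overrides applied."""
    params = dict(HYPERPARAMETERS)
    params.update(overrides or {})
    return params


def build_model(overrides: Optional[Dict[str, Any]] = None):
    """Create an unfitted BERTopic model from the listed hyperparameters."""
    # Imported lazily so the parameter helpers work without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**merged_parameters(overrides))
```

For example, `build_model({"min_topic_size": 20})` would configure a model that requires larger clusters. The `language` value is recorded as `english` even though the topic keywords are French. The card does not say whether a custom embedding model was supplied, so the sketch does not guess at one.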

Framework versions

  • Numpy: 1.23.5
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.31.0
  • Numba: 0.56.4
  • Plotly: 5.15.0
  • Python: 3.10.12
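
Serialized models can be sensitive to library versions, so it may be useful to compare the local environment against the versions listed above. The following sketch uses the standard-library `importlib.metadata` module. The helper names are hypothetical, and the keys are the PyPI distribution names assumed to correspond to each listed framework (for example `umap-learn` for UMAP).

```python
from importlib import metadata
from typing import Callable, Dict, Optional

# Versions copied from the "Framework versions" list, keyed by PyPI distribution name.
CARD_VERSIONS: Dict[str, str] = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.56.4",
    "plotly": "5.15.0",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(
    expected: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Optional[str]]:
    """Map each distribution whose installed version differs to the version found."""
    mismatches: Dict[str, Optional[str]] = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = found  # None means the package is not installed
    return mismatches
```

Calling `version_mismatches(CARD_VERSIONS)` returns an empty dictionary when the environment matches the card exactly. A mismatch does not necessarily mean the model will fail to load.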