cnn_dailymail_22457_3000_1500_test

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

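from bertopic import BERTopic

# Load the saved topic model from the Hugging Face Hub
topic_model = BERTopic.load("KingKazma/cnn_dailymail_22457_3000_1500_test")

# Return a table summarizing every topic in the model
topic_model.get_topic_info()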
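
Once loaded, the model can also assign topics to new documents. The following is a minimal sketch rather than an official recipe: `assign_topics` and `demo` are hypothetical helper names, the sample sentences are invented, and it assumes the saved model can resolve its embedding model at load time (if not, `BERTopic.load` accepts an `embedding_model` argument).

```python
from typing import Any, Dict, List, Sequence


def assign_topics(topic_model: Any, docs: Sequence[str]) -> List[Dict[str, Any]]:
    """Return the predicted topic ID and up to five keywords per document."""
    # transform() predicts a topic for each unseen document and
    # returns a (topics, probabilities) pair.
    topics, _probs = topic_model.transform(list(docs))
    results = []
    for doc, topic_id in zip(docs, topics):
        # get_topic() returns (word, score) pairs, or False for an unknown topic.
        words = topic_model.get_topic(int(topic_id)) or []
        results.append(
            {
                "doc": doc,
                "topic": int(topic_id),
                "keywords": [word for word, _score in words[:5]],
            }
        )
    return results


def demo() -> None:
    """Load the published model and label two invented example sentences."""
    # Imported lazily so assign_topics() itself has no hard dependency.
    from bertopic import BERTopic

    model = BERTopic.load("KingKazma/cnn_dailymail_22457_3000_1500_test")
    sample_docs = [
        "The striker scored twice as his side went top of the league.",
        "The party leader set out his plans ahead of the election.",
    ]
    for row in assign_topics(model, sample_docs):
        print(row)
```

Calling `demo()` downloads the model, so it needs network access and an installed `bertopic`. A predicted topic of -1 means the document was treated as an outlier.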

Topic overview

  • Number of topics: 9 (including the outlier topic, -1)
  • Number of training documents: 1500
Overview of all topics:

Topic ID  Topic Keywords                                    Topic Frequency  Label
-1        mccoy - jockey - ap - champion - winner           15               -1_mccoy_jockey_ap_champion
0         said - one - year - also - told                   9                0_said_one_year_also
1         league - season - player - goal - game            994              1_league_season_player_goal
2         labour - mr - said - miliband - leader            290              2_labour_mr_said_miliband
3         race - hamilton - rosberg - mercedes - marathon   84               3_race_hamilton_rosberg_mercedes
4         england - cricket - test - pietersen - anderson   32               4_england_cricket_test_pietersen
5         ncaa - first - game - college - basketball        30               5_ncaa_first_game_college
6         masters - spieth - augusta - hole - round         28               6_masters_spieth_augusta_hole
7         mayweather - fight - pacquiao - boxing - vegas    18               7_mayweather_fight_pacquiao_boxing
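
The frequencies above are easier to compare as shares of the training set. The short sketch below hard-codes the counts from the overview (`topic_counts` and `topic_shares` are illustrative names, not part of BERTopic) and prints each topic's proportion of the 1500 documents.

```python
# Topic frequencies copied from the topic overview; -1 is the outlier topic.
topic_counts = {-1: 15, 0: 9, 1: 994, 2: 290, 3: 84, 4: 32, 5: 30, 6: 28, 7: 18}


def topic_shares(counts):
    """Map each topic ID to its fraction of all documents."""
    total = sum(counts.values())
    return {topic_id: n / total for topic_id, n in counts.items()}


total_docs = sum(topic_counts.values())
shares = topic_shares(topic_counts)

print(f"total documents: {total_docs}")
# List topics from largest to smallest share.
for topic_id, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"topic {topic_id:>2}: {share:6.1%}")
```

The counts sum to 1500, matching the number of training documents. Topic 1 (football) alone covers roughly two thirds of the corpus and topic 2 (UK politics) about a fifth, so the remaining topics are comparatively small.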

Training hyperparameters

  • calculate_probabilities: True
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
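
For reference, these settings map directly onto keyword arguments of the `BERTopic` constructor. The sketch below shows that mapping under stated assumptions: `HYPERPARAMETERS` and `build_topic_model` are hypothetical names, and because this card does not list the embedding model or the UMAP and HDBSCAN settings, the library defaults are used, so this will not necessarily reproduce the published model.

```python
# Hyperparameters as listed on this card.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_topic_model():
    """Create an untrained BERTopic instance with the listed settings."""
    # Imported lazily so the settings above can be inspected without bertopic.
    from bertopic import BERTopic

    # Assumption: sub-models (embeddings, UMAP, HDBSCAN) are left at defaults.
    return BERTopic(**HYPERPARAMETERS)
```

A model built this way still has to be trained, for example with `build_topic_model().fit_transform(docs)` on a list of document strings.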

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.31.0
  • Numba: 0.56.4
  • Plotly: 5.13.1
  • Python: 3.10.6
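
Loading a saved model can be sensitive to library versions, so it may help to compare a local environment against the list above. This is a sketch using only the standard library; `EXPECTED` and `version_report` are illustrative names, and the PyPI distribution names (for example `umap-learn` for UMAP) are assumptions about how each package was installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def version_report(expected):
    """Return {name: (wanted, installed_or_None, matches)} for each package."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in version_report(EXPECTED).items():
    status = "ok" if ok else "differs"
    print(f"{name:<22} wanted {wanted:<8} installed {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the report only shows where an environment differs from the one used for training.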