metadata
tags:
  - bertopic
library_name: bertopic
pipeline_tag: text-classification

xsum_22457_3000_1500_test

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/xsum_22457_3000_1500_test")

topic_model.get_topic_info()
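
Beyond inspecting the topic table, a loaded model can assign topics to new documents. The helper below is a minimal sketch, not part of the original card: the name `assign_topics` is hypothetical, and it assumes the repository stores a reference to its embedding model so that `transform` can embed new text.

```python
from typing import List, Tuple

REPO_ID = "KingKazma/xsum_22457_3000_1500_test"


def assign_topics(docs: List[str], repo_id: str = REPO_ID) -> List[Tuple[str, int, str]]:
    """Return (document, topic ID, topic label) for each input document."""
    # Imported lazily so that defining this helper does not require bertopic.
    from bertopic import BERTopic

    # Downloads the model from the Hugging Face Hub on first use.
    topic_model = BERTopic.load(repo_id)
    topics, _probabilities = topic_model.transform(docs)

    # get_topic_info() returns a DataFrame with "Topic" and "Name" columns.
    names = topic_model.get_topic_info().set_index("Topic")["Name"].to_dict()
    return [
        (doc, int(topic), names.get(int(topic), "unknown"))
        for doc, topic in zip(docs, topics)
    ]
```

Calling `assign_topics(["Murray reaches the Wimbledon final"])` returns one tuple per document. A topic ID of -1 means the document was treated as an outlier.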

Topic overview

  • Number of topics: 10 (including the outlier topic -1)
  • Number of training documents: 1500
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | said - club - player - game - football | 10 | -1_said_club_player_game |
| 0 | said - mr - people - would - year | 167 | 0_said_mr_people_would |
| 1 | kick - shot - free - goal - right | 1066 | 1_kick_shot_free_goal |
| 2 | midfielder - loan - season - club - league | 69 | 2_midfielder_loan_season_club |
| 3 | race - sport - gold - medal - olympic | 53 | 3_race_sport_gold_medal |
| 4 | celtic - club - aberdeen - cup - player | 39 | 4_celtic_club_aberdeen_cup |
| 5 | england - cricket - wicket - test - captain | 33 | 5_england_cricket_wicket_test |
| 6 | cup - wales - rugby - game - fa | 33 | 6_cup_wales_rugby_game |
| 7 | wimbledon - player - murray - im - atp | 18 | 7_wimbledon_player_murray_im |
| 8 | armstrong - banned - antidoping - rugby - sky | 12 | 8_armstrong_banned_antidoping_rugby |
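
The topic frequencies can be checked against the stated number of training documents. The short sketch below copies the counts from the table by hand (it does not load the model) and computes each topic's share of the corpus.

```python
# Topic frequencies copied from the table above: topic ID -> document count.
topic_counts = {-1: 10, 0: 167, 1: 1066, 2: 69, 3: 53, 4: 39, 5: 33, 6: 33, 7: 18, 8: 12}

total = sum(topic_counts.values())
shares = {topic: count / total for topic, count in topic_counts.items()}
largest = max(topic_counts, key=topic_counts.get)

print(f"total documents: {total}")                                 # total documents: 1500
print(f"largest topic: {largest} ({shares[largest]:.1%} of documents)")  # largest topic: 1 (71.1% of documents)
print(f"outlier share: {shares[-1]:.1%}")                          # outlier share: 0.7%
```

The counts sum to the 1500 training documents. Topic 1 alone covers roughly 71% of them, so the remaining topics describe comparatively small clusters.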

Training hyperparameters

  • calculate_probabilities: True
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: False
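
The values above map directly onto keyword arguments of the `BERTopic` constructor. The following is a sketch of how an unfitted model with the same settings might be created. The names `TRAINING_KWARGS` and `build_untrained_model` are hypothetical, and because the card does not list the embedding, UMAP, or HDBSCAN sub-models, library defaults are assumed for those. Fitting such a model is not guaranteed to reproduce the topics shown here.

```python
# Hyperparameters as listed above, expressed as BERTopic constructor keyword arguments.
TRAINING_KWARGS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_untrained_model():
    """Return a fresh, unfitted BERTopic instance configured with the listed values."""
    # Imported lazily so that the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    # Sub-models (embedding, UMAP, HDBSCAN) are left at their defaults: an assumption.
    return BERTopic(**TRAINING_KWARGS)
```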

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.31.0
  • Numba: 0.57.1
  • Plotly: 5.13.1
  • Python: 3.10.12
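
When loading the model in a different environment, it may help to compare the installed packages with the versions listed above. The sketch below uses only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) and the helper names are assumptions, not part of the original card.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, Optional, Tuple

# Versions listed above, keyed by PyPI distribution name (names are an assumption).
PINNED = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.31.0",
    "numba": "0.57.1",
    "plotly": "5.13.1",
}


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(
    pinned: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

`find_mismatches(PINNED)` returns an empty dictionary when the environment matches the card exactly. A mismatch does not necessarily prevent loading, so the result is best read as a hint when something fails.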