1M_bert

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

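# Load the model from the Hugging Face Hub
from bertopic import BERTopic
topic_model = BERTopic.load("Jellywibble/1M_bert")

# Summarize every topic (ID, document count, name) as a pandas DataFrame
topic_model.get_topic_info()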
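
Beyond `get_topic_info()`, a loaded model can assign topics to new text and report each topic's keywords. The following is a minimal sketch, not part of this model card's original instructions: `summarize_assignments` is a hypothetical helper, and it assumes the repository can resolve an embedding model so that `transform()` works (if it cannot, `BERTopic.load` accepts an `embedding_model` argument).

```python
from typing import Any, Dict, List


def summarize_assignments(topic_model: Any, docs: List[str], top_n: int = 5) -> List[Dict[str, Any]]:
    """Assign each document to a topic and attach that topic's top keywords.

    Hypothetical helper (not part of BERTopic). It relies on two BERTopic
    methods only: transform() and get_topic().
    """
    # transform() returns (topics, probabilities); probabilities are unused here.
    topics, _ = topic_model.transform(docs)
    rows = []
    for doc, topic_id in zip(docs, topics):
        topic_id = int(topic_id)
        # get_topic() returns (word, score) pairs, or False for an unknown topic ID.
        words = topic_model.get_topic(topic_id) or []
        rows.append({
            "document": doc,
            "topic": topic_id,
            "keywords": [word for word, _ in words[:top_n]],
        })
    return rows


# Example usage (requires bertopic and network access; the sentence is made up):
# from bertopic import BERTopic
# model = BERTopic.load("Jellywibble/1M_bert")
# print(summarize_assignments(model, ["He smiled and waved at you."]))
```

The helper takes the model as an argument, so it can be exercised with any object that exposes `transform()` and `get_topic()`.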

Topic overview

  • Number of topics: 20
  • Number of training documents: 1000000

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| 0 | you - the - to - user - bot | 201385 | 0_you_the_to_user |
| 1 | his - he - you - your - bot | 182561 | 1_his_he_you_your |
| 2 | her - she - you - to - user | 99643 | 2_her_she_you_to |
| 3 | his - he - you - to - the | 97803 | 3_his_he_you_to |
| 4 | her - she - you - user - your | 55073 | 4_her_she_you_user |
| 5 | his - he - you - to - the | 49339 | 5_his_he_you_to |
| 6 | her - she - to - the - you | 48030 | 6_her_she_to_the |
| 7 | his - he - you - the - to | 46294 | 7_his_he_you_the |
| 8 | the - you - to - and - user | 40228 | 8_the_you_to_and |
| 9 | his - he - you - the - to | 35152 | 9_his_he_you_the |
| 10 | his - he - you - to - the | 23293 | 10_his_he_you_to |
| 11 | his - he - you - the - to | 23158 | 11_his_he_you_the |
| 12 | you - hoshi - to - her - the | 18615 | 12_you_hoshi_to_her |
| 13 | his - jack - he - elijah - you | 16377 | 13_his_jack_he_elijah |
| 14 | his - he - you - the - to | 12291 | 14_his_he_you_the |
| 15 | tom - his - he - you - to | 11575 | 15_tom_his_he_you |
| 16 | his - he - max - you - the | 11396 | 16_his_he_max_you |
| 17 | liam - damien - he - his - you | 9937 | 17_liam_damien_he_his |
| 18 | felix - his - he - the - you | 9730 | 18_felix_his_he_the |
| 19 | ghost - his - he - you - the | 8120 | 19_ghost_his_he_you |
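
To put the frequencies in the table above into proportion, the counts can be converted to shares of the training set. This is a small illustrative sketch using only the numbers listed in this card; the names `TOPIC_FREQUENCIES` and `topic_shares` are made up for the example.

```python
# Topic frequencies copied from the table above, indexed by topic ID 0..19.
TOPIC_FREQUENCIES = [
    201385, 182561, 99643, 97803, 55073, 49339, 48030, 46294, 40228, 35152,
    23293, 23158, 18615, 16377, 12291, 11575, 11396, 9937, 9730, 8120,
]
N_TRAINING_DOCS = 1_000_000  # "Number of training documents" from this card


def topic_shares(frequencies):
    """Return each topic's fraction of all assigned documents."""
    total = sum(frequencies)
    return [count / total for count in frequencies]


shares = topic_shares(TOPIC_FREQUENCIES)
total_assigned = sum(TOPIC_FREQUENCIES)

print(f"documents assigned to a topic: {total_assigned} of {N_TRAINING_DOCS}")
for topic_id, share in enumerate(shares):
    print(f"topic {topic_id:>2}: {share:6.2%}")
```

The listed frequencies add up to exactly 1,000,000, so every training document falls into one of the 20 topics and the table contains no outlier topic (`-1`). Topics 0 and 1 together cover about 38% of the documents.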

Training hyperparameters

  • calculate_probabilities: False
  • language: multilingual
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
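
For reference, the values above map directly onto keyword arguments of the `BERTopic` constructor. The sketch below shows how a model with the same top-level settings could be instantiated; `build_topic_model` is a hypothetical helper. This card does not list the embedding, dimensionality-reduction, or clustering sub-models, so the sketch falls back on BERTopic's defaults and should not be expected to reproduce this model.

```python
# Hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "multilingual",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(**overrides):
    """Create an untrained BERTopic model with the settings listed in this card.

    Hypothetical helper. Sub-models (embedding, UMAP, HDBSCAN, vectorizer) are
    left at BERTopic's defaults because the card does not specify them.
    """
    from bertopic import BERTopic  # imported lazily; requires `pip install bertopic`

    params = {**HYPERPARAMETERS, **overrides}
    return BERTopic(**params)


# Example usage (docs is your own list of strings):
# topic_model = build_topic_model(verbose=False)
# topics, probs = topic_model.fit_transform(docs)
```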

Framework versions

  • Numpy: 1.23.1
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.5
  • Pandas: 1.3.5
  • Scikit-Learn: 1.3.2
  • Sentence-transformers: 2.3.1
  • Transformers: 4.37.2
  • Numba: 0.58.1
  • Plotly: 5.19.0
  • Python: 3.8.10
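
Loading a serialized model in an environment that differs from the one listed above can sometimes cause problems, so it may help to compare installed packages against these versions first. A minimal sketch using the standard library's `importlib.metadata`; `find_version_mismatches` is a hypothetical helper, and the dictionary keys are the PyPI distribution names assumed to correspond to the libraries listed.

```python
from importlib import metadata

# Versions copied from the list above; keys are PyPI distribution names.
LISTED_VERSIONS = {
    "numpy": "1.23.1",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.3.5",
    "scikit-learn": "1.3.2",
    "sentence-transformers": "2.3.1",
    "transformers": "4.37.2",
    "numba": "0.58.1",
    "plotly": "5.19.0",
}


def find_version_mismatches(listed=LISTED_VERSIONS, get_version=metadata.version):
    """Return {package: (listed, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, expected in listed.items():
        try:
            installed = get_version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


# Example usage:
# for name, (listed, installed) in find_version_mismatches().items():
#     print(f"{name}: card lists {listed}, installed {installed}")
```

A mismatch is not necessarily an error; newer versions often work, and the check only flags where the environment differs from the one used for training.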