1M_bert

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("Jellywibble/1M_bert")

topic_model.get_topic_info()
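
Beyond inspecting the topic table, a loaded model can also assign topics to new documents. The following is a minimal sketch, not part of the original card: the helper names `load_model` and `assign_topics` are hypothetical, and it assumes the saved model carries (or can download) its embedding model. If it does not, `BERTopic.load` accepts an `embedding_model` argument that may need to be supplied.

```python
from typing import List, Sequence, Tuple


def load_model(repo_id: str = "Jellywibble/1M_bert"):
    """Load the topic model from the Hugging Face Hub (needs bertopic and network access)."""
    # Imported lazily so that assign_topics below can be used and tested without bertopic.
    from bertopic import BERTopic
    return BERTopic.load(repo_id)


def assign_topics(topic_model, docs: Sequence[str]) -> List[Tuple[str, int]]:
    """Pair each document with the topic ID returned by topic_model.transform."""
    # transform returns (topics, probabilities); only the topic IDs are used here.
    topics, _probabilities = topic_model.transform(list(docs))
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]
```

With BERTopic installed, `assign_topics(load_model(), ["some new text"])` would return a list of `(document, topic_id)` pairs, where each ID refers to a row of the topic overview.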

Topic overview

  • Number of topics: 20
  • Number of training documents: 1000000

Overview of all topics:

Topic ID  Topic Keywords                  Topic Frequency  Label
0         you - the - to - user - bot     201385           0_you_the_to_user
1         his - he - you - your - bot     182561           1_his_he_you_your
2         her - she - you - to - user     99643            2_her_she_you_to
3         his - he - you - to - the       97803            3_his_he_you_to
4         her - she - you - user - your   55073            4_her_she_you_user
5         his - he - you - to - the       49339            5_his_he_you_to
6         her - she - to - the - you      48030            6_her_she_to_the
7         his - he - you - the - to       46294            7_his_he_you_the
8         the - you - to - and - user     40228            8_the_you_to_and
9         his - he - you - the - to       35152            9_his_he_you_the
10        his - he - you - to - the       23293            10_his_he_you_to
11        his - he - you - the - to       23158            11_his_he_you_the
12        you - hoshi - to - her - the    18615            12_you_hoshi_to_her
13        his - jack - he - elijah - you  16377            13_his_jack_he_elijah
14        his - he - you - the - to       12291            14_his_he_you_the
15        tom - his - he - you - to       11575            15_tom_his_he_you
16        his - he - max - you - the      11396            16_his_he_max_you
17        liam - damien - he - his - you  9937             17_liam_damien_he_his
18        felix - his - he - the - you    9730             18_felix_his_he_the
19        ghost - his - he - you - the    8120             19_ghost_his_he_you
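
The frequencies in the topic overview can be turned into shares of the training set with plain Python. The sketch below copies the counts from the table. The helper name `topic_shares` is hypothetical. It also checks that the counts add up to the stated number of training documents, which they do, so every document falls into one of the 20 listed topics.

```python
# Topic frequencies copied from the topic overview, indexed by topic ID.
TOPIC_FREQUENCIES = [
    201385, 182561, 99643, 97803, 55073, 49339, 48030, 46294, 40228, 35152,
    23293, 23158, 18615, 16377, 12291, 11575, 11396, 9937, 9730, 8120,
]
N_TRAINING_DOCS = 1_000_000  # "Number of training documents" from the card


def topic_shares(frequencies, total):
    """Return each topic's fraction of the training documents."""
    return [count / total for count in frequencies]


shares = topic_shares(TOPIC_FREQUENCIES, N_TRAINING_DOCS)
assigned = sum(TOPIC_FREQUENCIES)

print(f"documents assigned to a listed topic: {assigned}")
for topic_id, share in enumerate(shares):
    print(f"topic {topic_id}: {share:.1%}")
```

Topics 0 and 1 together cover 383,946 documents, a little over 38% of the training set.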

Training hyperparameters

  • calculate_probabilities: False
  • language: multilingual
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: None
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
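
For reference, the listed settings map directly onto keyword arguments of the `BERTopic` constructor. The sketch below is an illustration under assumptions, not the author's training script: the helper name `build_untrained_model` is hypothetical, the zero-shot parameters are assumed to be supported by the installed BERTopic release, and the card does not list the embedding, dimensionality-reduction, clustering, or vectorizer components, so this alone would not reproduce the published model.

```python
# Hyperparameters copied from the list above.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "multilingual",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model(overrides=None):
    """Create a fresh, unfitted BERTopic instance with the listed settings."""
    # Lazy import: requires `pip install -U bertopic`.
    from bertopic import BERTopic
    params = {**HYPERPARAMETERS, **(overrides or {})}
    return BERTopic(**params)
```

Calling `build_untrained_model({"verbose": False})` would give an unfitted model with one setting changed. It still has to be fitted with `fit_transform` on a corpus before it produces topics.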

Framework versions

  • Numpy: 1.23.1
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.5
  • Pandas: 1.3.5
  • Scikit-Learn: 1.3.2
  • Sentence-transformers: 2.3.1
  • Transformers: 4.37.2
  • Numba: 0.58.1
  • Plotly: 5.19.0
  • Python: 3.8.10
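
Models saved with one set of library versions do not always load cleanly under another, so it can help to compare a local environment against the list above. This is a minimal sketch using only the standard library. The PyPI distribution names used as dictionary keys (for example `umap-learn` for UMAP) are assumptions, and the helper name `version_mismatches` is hypothetical.

```python
from importlib import metadata

# Versions copied from the list above; keys are the assumed PyPI distribution names.
EXPECTED_VERSIONS = {
    "numpy": "1.23.1",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.5",
    "pandas": "1.3.5",
    "scikit-learn": "1.3.2",
    "sentence-transformers": "2.3.1",
    "transformers": "4.37.2",
    "numba": "0.58.1",
    "plotly": "5.19.0",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Map each package whose installed version differs to (expected, installed or None)."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches().items():
    print(f"{name}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch is not necessarily a problem. The output is only a starting point when loading or inference fails.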