open-hdscan-april3

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("Thang203/open-hdscan-april3")

topic_model.get_topic_info()
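
Beyond inspecting the topic table, a loaded model can assign topics to unseen documents. The following is a minimal sketch, assuming network access to the Hugging Face Hub and that the embedding model referenced by the saved configuration can be loaded on your machine. The helper name `assign_topics` is hypothetical and not part of BERTopic or this model card.

```python
REPO_ID = "Thang203/open-hdscan-april3"  # repository name from this model card


def assign_topics(docs, repo_id=REPO_ID):
    """Return a (topic_id, top_words) pair for each document in `docs`.

    Hypothetical helper for illustration: it reloads the model on every
    call, so keep a reference to the loaded model if you call it often.
    """
    # Imported lazily so the module can be imported without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load(repo_id)
    topics, _ = topic_model.transform(docs)  # one topic ID per document

    results = []
    for topic_id in topics:
        # get_topic returns (word, score) pairs, or False for an unknown ID.
        words = topic_model.get_topic(int(topic_id)) or []
        results.append((int(topic_id), [word for word, _ in words]))
    return results


# Example call (downloads the model, so it is left commented out):
# assign_topics(["Quantizing large language models to 4 bits."])
```

A topic ID of -1 in the result marks a document that was treated as an outlier rather than assigned to one of the named topics.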

Topic overview

  • Number of topics: 11
  • Number of training documents: 2779

Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | models - language - model - llms - language models | 11 | -1_models_language_model_llms |
| 0 | models - language - model - language models - llms | 792 | 0_models_language_model_language models |
| 1 | code - models - llms - language - language models | 1082 | 1_code_models_llms_language |
| 2 | models - quantization - model - training - language | 311 | 2_models_quantization_model_training |
| 3 | models - bias - text - language - biases | 297 | 3_models_bias_text_language |
| 4 | brain - models - language - heads - attention | 152 | 4_brain_models_language_heads |
| 5 | hallucinations - hallucination - models - visual - large | 34 | 5_hallucinations_hallucination_models_visual |
| 6 | music - audio - poetry - generation - model | 31 | 6_music_audio_poetry_generation |
| 7 | financial - analysis - sentiment - investment - large | 30 | 7_financial_analysis_sentiment_investment |
| 8 | editing - knowledge - model editing - editing methods - edit | 25 | 8_editing_knowledge_model editing_editing methods |
| 9 | materials - molecular - chemical - chemistry - materials science | 14 | 9_materials_molecular_chemical_chemistry |
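
The frequencies in the topic overview sum to 2779, which matches the stated number of training documents, and topics 1 and 0 together cover roughly two thirds of them. The short sketch below reproduces that arithmetic; the counts are copied from the overview, while the names `TOPIC_FREQUENCIES` and `topic_shares` are illustrative only.

```python
# Topic frequencies copied from the topic overview (topic ID -> documents).
TOPIC_FREQUENCIES = {
    -1: 11, 0: 792, 1: 1082, 2: 311, 3: 297, 4: 152,
    5: 34, 6: 31, 7: 30, 8: 25, 9: 14,
}


def topic_shares(frequencies):
    """Return each topic's share of all documents as a fraction in [0, 1]."""
    total = sum(frequencies.values())
    return {topic: count / total for topic, count in frequencies.items()}


total_docs = sum(TOPIC_FREQUENCIES.values())
shares = topic_shares(TOPIC_FREQUENCIES)

# Print topics from largest to smallest share.
for topic, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"topic {topic:>2}: {share:6.1%}")
```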

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 11
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
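
The hyperparameters above map directly onto keyword arguments of the `BERTopic` constructor. The sketch below shows one way to pass them, assuming an installed BERTopic release that accepts the `zeroshot_*` arguments. The embedding model and the UMAP and HDBSCAN settings are not listed in this card, so the sketch falls back to library defaults and is not guaranteed to reproduce the topics above. `HYPERPARAMS` and `build_topic_model` are hypothetical names.

```python
# Hyperparameters as listed in this model card.
HYPERPARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 11,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(**overrides):
    """Create an untrained BERTopic instance from the listed settings.

    Keyword arguments in `overrides` replace the values in HYPERPARAMS.
    """
    # Imported lazily so HYPERPARAMS can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**{**HYPERPARAMS, **overrides})


# Example (requires bertopic; fitting needs your own list of documents):
# topic_model = build_topic_model(verbose=False)
# topics, probs = topic_model.fit_transform(docs)
```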

Framework versions

  • Numpy: 1.25.2
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.6
  • Pandas: 2.0.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.6.1
  • Transformers: 4.38.2
  • Numba: 0.58.1
  • Plotly: 5.15.0
  • Python: 3.10.12
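
Loading a saved model can be sensitive to library versions, so it may help to compare your environment against the list above. The sketch below uses only the standard library; the package keys are the PyPI distribution names (for example, UMAP is published as `umap-learn`), and `version_mismatches` is a hypothetical helper. A mismatch is not necessarily a problem; the check only reports differences.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this model card, keyed by PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.25.2",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.6",
    "pandas": "2.0.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.6.1",
    "transformers": "4.38.2",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Map package -> (expected, installed) for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


for package, (wanted, installed) in version_mismatches().items():
    print(f"{package}: card lists {wanted}, found {installed or 'not installed'}")
```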