transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

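from bertopic import BERTopic

# Download the fitted topic model from the Hugging Face Hub
topic_model = BERTopic.load("mark230271/transformers_issues_topics")

# Return a DataFrame with one row per topic
topic_model.get_topic_info()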
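
Beyond listing topics, a loaded model can label unseen text. The following is a minimal sketch, not part of the original card: `assign_topics` and `load_and_assign` are hypothetical helper names, built on the BERTopic methods `transform` and `get_topic`. It assumes the saved model can embed new documents; if the repository does not reference an embedding model, one may need to be supplied through the `embedding_model` argument of `BERTopic.load`.

```python
from typing import Any, Dict, List, Sequence


def assign_topics(topic_model: Any, docs: Sequence[str], top_n: int = 5) -> List[Dict[str, Any]]:
    """Predict a topic for each document and attach that topic's top keywords.

    Hypothetical helper, not part of BERTopic itself.
    """
    # transform() returns a (topics, probabilities) pair; only the ids are used here.
    topics, _probs = topic_model.transform(list(docs))
    rows = []
    for doc, topic_id in zip(docs, topics):
        topic_id = int(topic_id)
        # get_topic() returns [(word, score), ...], or False for an unknown id.
        words = topic_model.get_topic(topic_id) or []
        rows.append({
            "document": doc,
            "topic": topic_id,
            "keywords": [word for word, _score in words[:top_n]],
        })
    return rows


def load_and_assign(docs: Sequence[str]) -> List[Dict[str, Any]]:
    """Load the model from the Hub, then label docs (needs bertopic and network access)."""
    from bertopic import BERTopic  # lazy import keeps assign_topics dependency-free

    topic_model = BERTopic.load("mark230271/transformers_issues_topics")
    return assign_topics(topic_model, docs)
```

For example, `load_and_assign(["Tokenizer fails on empty input"])` would return one dictionary holding the document, its predicted topic ID, and up to five keywords. A topic ID of -1 marks a document that BERTopic treats as an outlier.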

Topic overview

  • Number of topics: 30
  • Number of training documents: 9000
| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | tokenizer - bert - tokenizers - pytorch - tensorflow | 11 | -1_tokenizer_bert_tokenizers_pytorch |
| 0 | tokenizer - tokenizers - tokenization - berttokenizer - bart | 2376 | 0_tokenizer_tokenizers_tokenization_berttokenizer |
| 1 | cuda - gpt2 - gpt - gpus - gpu | 1879 | 1_cuda_gpt2_gpt_gpus |
| 2 | modelcard - modelcards - card - model - models | 735 | 2_modelcard_modelcards_card_model |
| 3 | transformerscli - transformers - transformer - transformerxl - importerror | 412 | 3_transformerscli_transformers_transformer_transformerxl |
| 4 | typeerror - attributeerror - valueerror - error - errors | 385 | 4_typeerror_attributeerror_valueerror_error |
| 5 | trainertrain - trainer - trainerevaluate - trainers - training | 330 | 5_trainertrain_trainer_trainerevaluate_trainers |
| 6 | seq2seq - seq2seqtrainer - s2s - runseq2seq - seq2seqdataset | 319 | 6_seq2seq_seq2seqtrainer_s2s_runseq2seq |
| 7 | typos - typo - fix - correction - fixed | 306 | 7_typos_typo_fix_correction |
| 8 | ci - testing - test - tests - circleci | 282 | 8_ci_testing_test_tests |
| 9 | readmemd - readmetxt - readme - file - camembertbasereadmemd | 255 | 9_readmemd_readmetxt_readme_file |
| 10 | t5 - t5model - tf - t5base - t5large | 255 | 10_t5_t5model_tf_t5base |
| 11 | generationbeamsearchpy - beamsearch - groupbeamsearch - beam - search | 218 | 11_generationbeamsearchpy_beamsearch_groupbeamsearch_beam |
| 12 | flax - distilbertmodel - flaubert - deberta - model | 185 | 12_flax_distilbertmodel_flaubert_deberta |
| 13 | ner - pipeline - pipelines - nerpipeline - fillmaskpipeline | 177 | 13_ner_pipeline_pipelines_nerpipeline |
| 14 | questionansweringpipeline - tfalbertforquestionanswering - questionanswering - distilbertforquestionanswering - answering | 161 | 14_questionansweringpipeline_tfalbertforquestionanswering_questionanswering_distilbertforquestionanswering |
| 15 | huggingfacetransformers - huggingface - hugging - gluepy - gluebenchmarkcom | 133 | 15_huggingfacetransformers_huggingface_hugging_gluepy |
| 16 | onnx - onnxonnxruntime - onnxexport - 04onnxexport - 04onnxexportipynb | 130 | 16_onnx_onnxonnxruntime_onnxexport_04onnxexport |
| 17 | labelsmoothednllloss - labelsmoothingfactor - label - labels - labelsmoothing | 96 | 17_labelsmoothednllloss_labelsmoothingfactor_label_labels |
| 18 | longformer - longformers - longform - longformerlayer - longformermodel | 73 | 18_longformer_longformers_longform_longformerlayer |
| 19 | configpath - configs - config - configuration - modelconfigs | 59 | 19_configpath_configs_config_configuration |
| 20 | wandbproject - wandb - sagemaker - sagemakertrainer - wandbcallback | 45 | 20_wandbproject_wandb_sagemaker_sagemakertrainer |
| 21 | cachedir - cache - cachedpath - caching - cached | 33 | 21_cachedir_cache_cachedpath_caching |
| 22 | notebook - notebooks - community - colab - t5 | 33 | 22_notebook_notebooks_community_colab |
| 23 | electra - electrapretrainedmodel - electraformaskedlm - electraformultiplechoice - electrafortokenclassification | 30 | 23_electra_electrapretrainedmodel_electraformaskedlm_electraformultiplechoice |
| 24 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 24 | 24_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased |
| 25 | isort - blackisortflake8 - github - repo - version | 18 | 25_isort_blackisortflake8_github_repo |
| 26 | pplm - pr - deprecated - variable - ppl | 14 | 26_pplm_pr_deprecated_variable |
| 27 | indexerror - index - missingindex - indices - runtimeerror | 14 | 27_indexerror_index_missingindex_indices |
| 28 | ga - fork - forks - forked - push | 12 | 28_ga_fork_forks_forked |

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 30
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
  • zeroshot_min_similarity: 0.7
  • zeroshot_topic_list: None
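
The hyperparameters listed above correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows one way they might be passed; the names `HYPERPARAMETERS` and `build_untrained_model` are illustrative only. The card does not state the training data or the embedding, UMAP, and HDBSCAN sub-models, so this should not be expected to reproduce the published topics.

```python
from typing import Any, Dict

# Values copied from the "Training hyperparameters" list in this card.
HYPERPARAMETERS: Dict[str, Any] = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 30,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model():
    """Create an unfitted BERTopic instance with the settings above.

    Hypothetical helper; sub-models fall back to BERTopic's defaults,
    which may differ from those used to train this model.
    """
    from bertopic import BERTopic  # lazy import: only needed when a model is built

    return BERTopic(**HYPERPARAMETERS)
```

Calling `build_untrained_model().fit_transform(docs)` on a list of strings would then train a fresh model with these settings.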

Framework versions

  • Numpy: 1.25.2
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.6
  • Pandas: 2.0.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.6.1
  • Transformers: 4.38.2
  • Numba: 0.58.1
  • Plotly: 5.15.0
  • Python: 3.10.12
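
To check a local environment against the versions listed above, something like the following sketch could be used. It relies only on the standard library's `importlib.metadata`. The PyPI distribution names (for example `umap-learn` for UMAP) are an assumption, since the card lists display names only, and `compare_versions` and `mismatches` are hypothetical helpers. The Python version itself is not checked.

```python
from importlib import metadata
from typing import Dict, List, Optional, Tuple

# Versions from the "Framework versions" list; distribution names are assumed.
CARD_VERSIONS: Dict[str, str] = {
    "numpy": "1.25.2",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.6",
    "pandas": "2.0.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.6.1",
    "transformers": "4.38.2",
    "numba": "0.58.1",
    "plotly": "5.15.0",
}

Report = Dict[str, Tuple[str, Optional[str]]]


def compare_versions(expected: Optional[Dict[str, str]] = None) -> Report:
    """Map each package to (version in the card, installed version or None)."""
    expected = CARD_VERSIONS if expected is None else expected
    report: Report = {}
    for name, wanted in expected.items():
        try:
            installed: Optional[str] = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


def mismatches(report: Report) -> List[str]:
    """Return the packages whose installed version differs from the card."""
    return sorted(name for name, (wanted, installed) in report.items() if installed != wanted)
```

`mismatches(compare_versions())` returns the packages that are missing or installed at a different version. A mismatch does not necessarily prevent the model from loading.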