
# transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```bash
pip install -U bertopic
```

You can then use the model as follows:

```python
from bertopic import BERTopic

topic_model = BERTopic.load("sweetapplee/transformers_issues_topics")
topic_model.get_topic_info()
```
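Beyond inspecting the topic table, a loaded BERTopic model can assign topics to unseen documents. A minimal sketch (the example issue title is hypothetical, and the first call downloads the model from the Hub):

```python
from bertopic import BERTopic

# Load the trained topic model from the Hugging Face Hub
topic_model = BERTopic.load("sweetapplee/transformers_issues_topics")

# Assign topics to new documents; returns (topic ids, probabilities)
docs = ["Tokenizer crashes on empty string input"]  # hypothetical issue title
topics, probs = topic_model.transform(docs)

# Inspect the top keywords of the predicted topic
print(topic_model.get_topic(topics[0]))
```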

## Topic overview

  • Number of topics: 30
  • Number of training documents: 9000
The table below gives an overview of all topics (topic `-1` is BERTopic's outlier topic).

| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | bert - tokenizer - tokenizers - pretrained - pytorch | 16 | -1_bert_tokenizer_tokenizers_pretrained |
| 0 | tokenization - tokenizer - tokenizers - token - tokens | 2184 | 0_tokenization_tokenizer_tokenizers_token |
| 1 | tf - tpu - t5 - tftrainer - onnx | 1761 | 1_tf_tpu_t5_tftrainer |
| 2 | modelcard - modelcards - card - model - cards | 939 | 2_modelcard_modelcards_card_model |
| 3 | importerror - attributeerror - valueerror - typeerror - runmlmpy | 486 | 3_importerror_attributeerror_valueerror_typeerror |
| 4 | doc - docstring - docstrings - docs - document | 449 | 4_doc_docstring_docstrings_docs |
| 5 | albertforpretraining - xlnet - albertbasev2 - albertformaskedlm - xlnetlmheadmodel | 400 | 5_albertforpretraining_xlnet_albertbasev2_albertformaskedlm |
| 6 | gpt2 - gpt2tokenizer - gpt2xl - gpt - gpt2tokenizerfast | 348 | 6_gpt2_gpt2tokenizer_gpt2xl_gpt |
| 7 | readmemd - readmetxt - readme - file - camembertbasereadmemd | 273 | 7_readmemd_readmetxt_readme_file |
| 8 | s2s - s2sdistill - s2t - s2strainer - exampless2s | 260 | 8_s2s_s2sdistill_s2t_s2strainer |
| 9 | longformer - longformers - longformerformultiplechoice - longformertokenizerfast - globalattentionmask | 216 | 9_longformer_longformers_longformerformultiplechoice_longformertokenizerfast |
| 10 | transformerscli - transformers - transformer - importerror - transformerxl | 194 | 10_transformerscli_transformers_transformer_importerror |
| 11 | tests - testing - slow - test - faster | 187 | 11_tests_testing_slow_test |
| 12 | cuda - cuda0 - memory - ram - gpus | 159 | 12_cuda_cuda0_memory_ram |
| 13 | pipeline - pipelines - ner - nerpipeline - featureextractionpipeline | 145 | 13_pipeline_pipelines_ner_nerpipeline |
| 14 | questionansweringpipeline - longformerforquestionanswering - answering - questionanswering - distilbertforquestionanswering | 144 | 14_questionansweringpipeline_longformerforquestionanswering_answering_questionanswering |
| 15 | trainertrain - trainer - loggingstrategy - logging - training | 139 | 15_trainertrain_trainer_loggingstrategy_logging |
| 16 | benchmark - benchmarks - accuracy - precision - comparison | 139 | 16_benchmark_benchmarks_accuracy_precision |
| 17 | labelsmoothednllloss - label - labelsmoothingfactor - labels - labelsmoothing | 75 | 17_labelsmoothednllloss_label_labelsmoothingfactor_labels |
| 18 | huggingfacemaster - huggingfacetokenizers297 - huggingface - huggingfaces - huggingfacetransformers | 74 | 18_huggingfacemaster_huggingfacetokenizers297_huggingface_huggingfaces |
| 19 | generationbeamsearchpy - generatebeamsearch - beamsearch - nonbeamsearch - beam | 73 | 19_generationbeamsearchpy_generatebeamsearch_beamsearch_nonbeamsearch |
| 20 | wav2vec2 - wav2vec - wav2vec20 - wav2vec2forctc - wav2vec2xlrswav2vec2 | 59 | 20_wav2vec2_wav2vec_wav2vec20_wav2vec2forctc |
| 21 | flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel | 52 | 21_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax |
| 22 | notebook - notebooks - notebookprogresscallback - community - colab | 51 | 22_notebook_notebooks_notebookprogresscallback_community |
| 23 | wandbproject - wandb - wandbcallback - wandbdisabled - wandbdisabledtrue | 40 | 23_wandbproject_wandb_wandbcallback_wandbdisabled |
| 24 | cachedir - cache - cachedpath - caching - cached | 34 | 24_cachedir_cache_cachedpath_caching |
| 25 | closed - add - bort - added - deleted | 33 | 25_closed_add_bort_added |
| 26 | electra - electrapretrainedmodel - electraformaskedlm - electralarge - electraformultiplechoice | 26 | 26_electra_electrapretrainedmodel_electraformaskedlm_electralarge |
| 27 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 26 | 27_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased |
| 28 | isort - blackisortflake8 - github - repo - version | 18 | 28_isort_blackisortflake8_github_repo |

## Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 30
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
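For reference, a model with these hyperparameters could be instantiated roughly as follows. This is a hedged sketch, not the author's actual training script; only the parameter values come from the list above, and `docs` stands in for the ~9000 training documents, which are not included in this card:

```python
from bertopic import BERTopic

# Sketch: a BERTopic instance configured with the hyperparameters listed above
topic_model = BERTopic(
    calculate_probabilities=False,
    language="english",
    low_memory=False,
    min_topic_size=10,
    n_gram_range=(1, 1),
    nr_topics=30,
    seed_topic_list=None,
    top_n_words=10,
    verbose=True,
)

# topics, probs = topic_model.fit_transform(docs)  # docs: the GitHub issue texts
```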

## Framework versions

  • Numpy: 1.23.5
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.4
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.34.1
  • Numba: 0.56.4
  • Plotly: 5.15.0
  • Python: 3.10.12