transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("davanstrien/transformers_issues_topics")

topic_model.get_topic_info()
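
Beyond inspecting the stored topics, a loaded model can assign topics to new text. The following is a minimal sketch, assuming that `BERTopic.transform` is given a list of strings and that the embedding model saved with this card can be restored at load time. The helper name `assign_topics` and the optional `topic_model` argument are illustrative and not part of the card.

```python
def assign_topics(docs, topic_model=None):
    """Return (document, topic_id) pairs for a list of new documents.

    `topic_model` is optional so that an already-loaded model can be reused.
    When omitted, the model from this card is loaded from the Hub, which
    needs `bertopic` installed and network access.
    """
    if topic_model is None:
        # Imported lazily so the helper can be defined without bertopic present.
        from bertopic import BERTopic

        topic_model = BERTopic.load("davanstrien/transformers_issues_topics")

    # transform() returns the predicted topic IDs and, optionally, probabilities.
    topics, _probabilities = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


if __name__ == "__main__":
    # Hypothetical issue titles, used only for illustration.
    examples = [
        "Tokenizer fails on empty strings",
        "CUDA out of memory when fine-tuning",
    ]
    for doc, topic in assign_topics(examples):
        print(topic, doc)
```

A topic ID of -1 marks a document that BERTopic treats as an outlier rather than assigning it to one of the named topics.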

Topic overview

  • Number of topics: 30
  • Number of training documents: 7235
| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | encoder - bert - tensorflow - decoder - output | 11 | -1_encoder_bert_tensorflow_decoder |
| 0 | tokenizer - tokenizers - tokenization - tokenize - berttokenizer | 2265 | 0_tokenizer_tokenizers_tokenization_tokenize |
| 1 | cuda - runtimeerror - conda - pytorch - tensorflow | 1513 | 1_cuda_runtimeerror_conda_pytorch |
| 2 | readmemd - readmetxt - readme - docstring - docstrings | 763 | 2_readmemd_readmetxt_readme_docstring |
| 3 | trainertrain - trainer - trainertfpy - trainers - training | 550 | 3_trainertrain_trainer_trainertfpy_trainers |
| 4 | rag - roberta - robertatokenizer - robertatokenizerfast - robertabase | 546 | 4_rag_roberta_robertatokenizer_robertatokenizerfast |
| 5 | modelcard - modelcards - card - model - cards | 473 | 5_modelcard_modelcards_card_model |
| 6 | importerror - transformerscli - transformers - transformerxl - transformer | 432 | 6_importerror_transformerscli_transformers_transformerxl |
| 7 | seq2seq - seq2seqtrainer - seq2seqdataset - runseq2seq - examplesseq2seq | 405 | 7_seq2seq_seq2seqtrainer_seq2seqdataset_runseq2seq |
| 8 | gpt2 - gpt2tokenizer - gpt2xl - gpt2tokenizerfast - gpt | 365 | 8_gpt2_gpt2tokenizer_gpt2xl_gpt2tokenizerfast |
| 9 | t5 - t5model - t5base - t5large - tf | 289 | 9_t5_t5model_t5base_t5large |
| 10 | tests - testing - speedup - test - testgeneratefp16 | 230 | 10_tests_testing_speedup_test |
| 11 | questionansweringpipeline - questionanswering - answering - questionasnwering - distilbertforquestionanswering | 138 | 11_questionansweringpipeline_questionanswering_answering_questionasnwering |
| 12 | ner - pipeline - pipelinener - pipelines - pipelineframework | 138 | 12_ner_pipeline_pipelinener_pipelines |
| 13 | deberta - debertav2 - debertav2initpy - debertatokenizer - distilbertmodel | 132 | 13_deberta_debertav2_debertav2initpy_debertatokenizer |
| 14 | onnxonnxruntime - onnx - onnxexport - 04onnxexport - 04onnxexportipynb | 110 | 14_onnxonnxruntime_onnx_onnxexport_04onnxexport |
| 15 | benchmark - benchmarks - accuracy - precision - comparison | 85 | 15_benchmark_benchmarks_accuracy_precision |
| 16 | labelsmoothingfactor - labelsmoothednllloss - labelsmoothing - labels - label | 79 | 16_labelsmoothingfactor_labelsmoothednllloss_labelsmoothing_labels |
| 17 | longformer - longformers - longform - longformerforqa - longformerlayer | 71 | 17_longformer_longformers_longform_longformerforqa |
| 18 | generationbeamsearchpy - generatebeamsearch - beamsearch - nonbeamsearch - beam | 60 | 18_generationbeamsearchpy_generatebeamsearch_beamsearch_nonbeamsearch |
| 19 | cachedir - cache - cachedpath - caching - cached | 58 | 19_cachedir_cache_cachedpath_caching |
| 20 | wav2vec2 - wav2vec - wav2vec20 - wav2vec2forctc - wav2vec2xlrswav2vec2 | 56 | 20_wav2vec2_wav2vec_wav2vec20_wav2vec2forctc |
| 21 | flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel | 52 | 21_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax |
| 22 | wandbproject - wandb - wandbcallback - wandbdisabled - wandbdisabledtrue | 49 | 22_wandbproject_wandb_wandbcallback_wandbdisabled |
| 23 | electra - electrapretrainedmodel - electraformaskedlm - electraformultiplechoice - electrafortokenclassification | 38 | 23_electra_electrapretrainedmodel_electraformaskedlm_electraformultiplechoice |
| 24 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 24 | 24_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased |
| 25 | notebook - notebooks - community - text - multilabel | 18 | 25_notebook_notebooks_community_text |
| 26 | dict - dictstr - returndict - parse - arguments | 18 | 26_dict_dictstr_returndict_parse |
| 27 | pplm - pr - deprecated - variable - ppl | 17 | 27_pplm_pr_deprecated_variable |
| 28 | isort - github - repo - version - setupcfg | 15 | 28_isort_github_repo_version |
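
Each label in the overview appears to follow the pattern of a topic ID followed by the first four keywords, joined by underscores. The sketch below splits a label back into those parts. It assumes that keywords never contain underscores themselves, which holds for every row listed here but is not guaranteed in general. The helper name `parse_label` is illustrative.

```python
def parse_label(label: str) -> tuple[int, list[str]]:
    """Split a label such as '0_tokenizer_tokenizers' into (topic_id, keywords)."""
    # Split only on the first underscore so the ID (including -1) stays intact.
    topic_id, _, rest = label.partition("_")
    return int(topic_id), rest.split("_")


topic_id, keywords = parse_label("0_tokenizer_tokenizers_tokenization_tokenize")
print(topic_id, keywords)
# 0 ['tokenizer', 'tokenizers', 'tokenization', 'tokenize']
```

The outlier label `-1_encoder_bert_tensorflow_decoder` parses the same way, because the leading minus sign belongs to the ID and not to a separator.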

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 30
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
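
The hyperparameters above correspond to keyword arguments of the `BERTopic` constructor. As a rough sketch, an untrained model with the same settings could be built as shown below. The card does not list the embedding model, the UMAP or HDBSCAN settings, or the training data, so this would not necessarily reproduce the topics shown here. The names `HYPERPARAMETERS` and `build_untrained_model` are illustrative.

```python
# Settings copied from the "Training hyperparameters" list in this card.
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 30,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
}


def build_untrained_model():
    """Create a fresh BERTopic instance with the settings listed in the card."""
    # Imported lazily so the settings can be inspected without bertopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMETERS)
```

Calling `build_untrained_model().fit_transform(docs)` on a list of documents would then train a new model with these settings.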

Framework versions

  • Numpy: 1.22.4
  • HDBSCAN: 0.8.29
  • UMAP: 0.5.3
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.29.2
  • Numba: 0.56.4
  • Plotly: 5.13.1
  • Python: 3.10.11
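
Loading a saved topic model can be sensitive to library versions, so it may help to compare the local environment against the list above. The following sketch uses only the standard library. The mapping from the names in the card to PyPI distribution names (for example UMAP to `umap-learn`) is an assumption made here, and the helper name `compare_versions` is illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the card, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.29.2",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def compare_versions(expected=EXPECTED_VERSIONS):
    """Return {package: (version_in_card, installed_version_or_None)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed locally
        report[package] = (wanted, installed)
    return report


for package, (wanted, installed) in compare_versions().items():
    status = "ok" if installed == wanted else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

A differing version does not by itself mean the model will fail to load. The report only shows where the local setup departs from the environment recorded in the card.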