transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("lbhjvh14/transformers_issues_topics")

topic_model.get_topic_info()
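
Beyond inspecting the topic table, a loaded model can also assign topics to new documents. The following is a minimal sketch and not part of the original card: `label_documents` is a hypothetical helper, and it assumes the saved model is able to embed text (if the embedding model was not stored with it, one would have to be supplied through the `embedding_model` argument of `BERTopic.load`).

```python
def label_documents(topic_model, docs, n_keywords=5):
    """Return one (topic_id, keywords) pair per document in docs.

    topic_model is expected to behave like a fitted BERTopic instance:
    transform(docs) returns (topics, probabilities) and get_topic(id)
    returns a list of (word, score) pairs.
    """
    topics, _ = topic_model.transform(docs)
    results = []
    for topic_id in topics:
        topic_id = int(topic_id)
        # get_topic may return a falsy value for an unknown topic ID
        words = topic_model.get_topic(topic_id) or []
        results.append((topic_id, [word for word, _ in words[:n_keywords]]))
    return results
```

With the model loaded as above, `label_documents(topic_model, ["Tokenizer crashes on empty string"])` would return one pair per input document. The example title is made up, and a topic ID of -1 marks documents the model treats as outliers.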

Topic overview

  • Number of topics: 30 (topics 0 to 28 plus the outlier topic -1)
  • Number of training documents: 9000
Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | pretrained - tokenizer - tensorflow - tokenizers - tf | 11 | -1_pretrained_tokenizer_tensorflow_tokenizers |
| 0 | tokenizer - tokenizers - tokenization - tokenize - token | 2407 | 0_tokenizer_tokenizers_tokenization_tokenize |
| 1 | cuda - memory - tensorflow - pytorch - gpu | 1379 | 1_cuda_memory_tensorflow_pytorch |
| 2 | longformer - longformers - longformertokenizerfast - longformerformultiplechoice - tf | 791 | 2_longformer_longformers_longformertokenizerfast_longformerformultiplechoice |
| 3 | modelcard - modelcards - card - model - cards | 510 | 3_modelcard_modelcards_card_model |
| 4 | summarization - summaries - summary - sentences - text | 431 | 4_summarization_summaries_summary_sentences |
| 5 | s2s - seq2seq - runseq2seq - eval - examplesseq2seq | 405 | 5_s2s_seq2seq_runseq2seq_eval |
| 6 | squaddataset - attributeerror - squadpy - valueerror - modulenotfounderror | 381 | 6_squaddataset_attributeerror_squadpy_valueerror |
| 7 | typos - typo - doc - docstring - fix | 324 | 7_typos_typo_doc_docstring |
| 8 | readmemd - readmetxt - readme - modelcard - file | 299 | 8_readmemd_readmetxt_readme_modelcard |
| 9 | gpt2 - gpt2xl - gpt - gpt2tokenizer - gpt3 | 261 | 9_gpt2_gpt2xl_gpt_gpt2tokenizer |
| 10 | rag - ragtokenforgeneration - ragsequenceforgeneration - tokenizer - gluepy | 256 | 10_rag_ragtokenforgeneration_ragsequenceforgeneration_tokenizer |
| 11 | transformerscli - importerror - transformers - transformer - transformerxl | 232 | 11_transformerscli_importerror_transformers_transformer |
| 12 | ner - pipeline - pipelines - pipelinespy - nerpipeline | 196 | 12_ner_pipeline_pipelines_pipelinespy |
| 13 | testing - tests - test - installationtest - speedup | 189 | 13_testing_tests_test_installationtest |
| 14 | checkpoint - trainertrain - checkpoints - checkpointing - trainersavecheckpoint | 162 | 14_checkpoint_trainertrain_checkpoints_checkpointing |
| 15 | flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel | 119 | 15_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax |
| 16 | generationbeamsearchpy - generatebeamsearch - beamsearch - nonbeamsearch - beam | 109 | 16_generationbeamsearchpy_generatebeamsearch_beamsearch_nonbeamsearch |
| 17 | onnxonnxruntime - onnx - onnxexport - 04onnxexport - 04onnxexportipynb | 99 | 17_onnxonnxruntime_onnx_onnxexport_04onnxexport |
| 18 | labelsmoothednllloss - labelsmoothingfactor - label - labels - labelsmoothing | 86 | 18_labelsmoothednllloss_labelsmoothingfactor_label_labels |
| 19 | cachedir - cache - cachedpath - cached - caching | 77 | 19_cachedir_cache_cachedpath_cached |
| 20 | wav2vec2 - wav2vec - wav2vec20 - wav2vec2forctc - wav2vec2xlrswav2vec2 | 61 | 20_wav2vec2_wav2vec_wav2vec20_wav2vec2forctc |
| 21 | notebook - notebooks - community - colab - t5 | 55 | 21_notebook_notebooks_community_colab |
| 22 | wandbproject - wandb - wandbcallback - wandbdisabled - wandbdisabledtrue | 39 | 22_wandbproject_wandb_wandbcallback_wandbdisabled |
| 23 | electra - electrapretrainedmodel - electraformaskedlm - electraformultiplechoice - electrafortokenclassification | 38 | 23_electra_electrapretrainedmodel_electraformaskedlm_electraformultiplechoice |
| 24 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 24 | 24_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased |
| 25 | isort - blackisortflake8 - dependencies - github - matplotlib | 18 | 25_isort_blackisortflake8_dependencies_github |
| 26 | pplm - pr - deprecated - variable - ppl | 16 | 26_pplm_pr_deprecated_variable |
| 27 | ga - fork - forks - forked - push | 14 | 27_ga_fork_forks_forked |
| 28 | indexerror - runtimeerror - index - indices - missingindex | 11 | 28_indexerror_runtimeerror_index_indices |
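
Each Label is the topic ID followed by the topic's first four keywords, joined by underscores, and the frequencies add up to the 9000 training documents. The sketch below splits a label into its parts and computes a topic's share of the corpus. The helper names are illustrative, and only three rows are copied from the table.

```python
TOTAL_DOCUMENTS = 9000  # "Number of training documents" from the card

# (label, frequency) pairs copied from three rows of the topic table
ROWS = [
    ("-1_pretrained_tokenizer_tensorflow_tokenizers", 11),
    ("0_tokenizer_tokenizers_tokenization_tokenize", 2407),
    ("1_cuda_memory_tensorflow_pytorch", 1379),
]

def parse_label(label):
    """Split '<id>_<kw>_<kw>...' into (int id, list of keywords).

    Assumes keywords contain no underscores, which holds for the
    single-word keywords listed in this card.
    """
    topic_id, *keywords = label.split("_")
    return int(topic_id), keywords

def share(frequency, total=TOTAL_DOCUMENTS):
    """Fraction of the training documents assigned to a topic."""
    return frequency / total

for label, frequency in ROWS:
    topic_id, keywords = parse_label(label)
    print(f"{topic_id:>3}  {share(frequency):6.1%}  {', '.join(keywords)}")
```

By this calculation topic 0 covers about 26.7% of the training documents, while the outlier topic -1 holds only 11 of them.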

Training hyperparameters

  • calculate_probabilities: False
  • language: english
  • low_memory: False
  • min_topic_size: 10
  • n_gram_range: (1, 1)
  • nr_topics: 30
  • seed_topic_list: None
  • top_n_words: 10
  • verbose: True
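
The values above correspond to keyword arguments of the `BERTopic` constructor. As a rough sketch of how a comparable model could be configured: the card does not say which embedding, UMAP, or HDBSCAN settings were used, so library defaults are assumed here, and `build_topic_model` is a hypothetical helper.

```python
# Hyperparameters exactly as listed in the card
HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 30,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
}

def build_topic_model(**overrides):
    """Create an unfitted BERTopic model from the card's settings."""
    # Imported lazily so the settings can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(**{**HYPERPARAMETERS, **overrides})
```

Calling `build_topic_model().fit_transform(docs)` on a list of strings would then train a new model. The result should not be expected to match this card exactly, since the card does not specify the training documents and UMAP is stochastic unless seeded.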

Framework versions

  • Numpy: 1.23.5
  • HDBSCAN: 0.8.33
  • UMAP: 0.5.4
  • Pandas: 1.5.3
  • Scikit-Learn: 1.2.2
  • Sentence-transformers: 2.2.2
  • Transformers: 4.34.0
  • Numba: 0.56.4
  • Plotly: 5.15.0
  • Python: 3.10.12
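
A saved model may behave differently under other library versions, so it can help to compare the local environment with the list above before loading. The sketch below uses only the standard library. The PyPI distribution names (for example `umap-learn` for UMAP) are an assumption and are not stated in the card.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from the card, keyed by assumed PyPI distribution name
CARD_VERSIONS = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.33",
    "umap-learn": "0.5.4",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.34.0",
    "numba": "0.56.4",
    "plotly": "5.15.0",
}

def compare_versions(expected=CARD_VERSIONS):
    """Map each package to (card version, installed version or None)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed locally
        report[name] = (wanted, installed)
    return report

for name, (wanted, installed) in compare_versions().items():
    status = "ok" if installed == wanted else "differs"
    print(f"{name}: card {wanted}, installed {installed} ({status})")
```

A mismatch is not necessarily a problem; the report is only meant as a starting point if loading the model fails or produces unexpected topics.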