transformers_issues_topics
This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("davanstrien/transformers_issues_topics")
topic_model.get_topic_info()
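Beyond `get_topic_info()`, a loaded model can be queried topic by topic. The following is a minimal sketch, not part of the original card: `top_keywords` and `demo` are hypothetical helper names, and the sketch assumes that `get_topic` returns a list of `(word, score)` pairs (or `False` for an unknown topic ID), as it does in recent BERTopic releases.

```python
def top_keywords(topic_model, topic_id, n=5):
    """Return the first n keywords of a topic, dropping the scores."""
    # get_topic() is assumed to return (word, score) tuples,
    # or False when the topic id does not exist.
    words = topic_model.get_topic(topic_id)
    if not words:
        return []
    return [word for word, _score in words[:n]]


def demo():
    """Load the model from the Hub and inspect it (needs network access)."""
    from bertopic import BERTopic  # imported here so the helper above has no hard dependency

    topic_model = BERTopic.load("davanstrien/transformers_issues_topics")
    print(top_keywords(topic_model, 0))

    # Assign topics to unseen documents; the example text is made up.
    topics, _probs = topic_model.transform(["Tokenizer fails on empty strings"])
    print(topics)
```

Calling `demo()` downloads the model, so it is left uncalled here. The topic assigned to the example sentence depends on the embedding model stored with the checkpoint.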
Topic overview
- Number of topics: 30 (counting the outlier topic, -1)
- Number of training documents: 7235
Overview of all topics:
Topic ID | Topic Keywords | Topic Frequency | Label |
---|---|---|---|
-1 | encoder - bert - tensorflow - decoder - output | 11 | -1_encoder_bert_tensorflow_decoder |
0 | tokenizer - tokenizers - tokenization - tokenize - berttokenizer | 2265 | 0_tokenizer_tokenizers_tokenization_tokenize |
1 | cuda - runtimeerror - conda - pytorch - tensorflow | 1513 | 1_cuda_runtimeerror_conda_pytorch |
2 | readmemd - readmetxt - readme - docstring - docstrings | 763 | 2_readmemd_readmetxt_readme_docstring |
3 | trainertrain - trainer - trainertfpy - trainers - training | 550 | 3_trainertrain_trainer_trainertfpy_trainers |
4 | rag - roberta - robertatokenizer - robertatokenizerfast - robertabase | 546 | 4_rag_roberta_robertatokenizer_robertatokenizerfast |
5 | modelcard - modelcards - card - model - cards | 473 | 5_modelcard_modelcards_card_model |
6 | importerror - transformerscli - transformers - transformerxl - transformer | 432 | 6_importerror_transformerscli_transformers_transformerxl |
7 | seq2seq - seq2seqtrainer - seq2seqdataset - runseq2seq - examplesseq2seq | 405 | 7_seq2seq_seq2seqtrainer_seq2seqdataset_runseq2seq |
8 | gpt2 - gpt2tokenizer - gpt2xl - gpt2tokenizerfast - gpt | 365 | 8_gpt2_gpt2tokenizer_gpt2xl_gpt2tokenizerfast |
9 | t5 - t5model - t5base - t5large - tf | 289 | 9_t5_t5model_t5base_t5large |
10 | tests - testing - speedup - test - testgeneratefp16 | 230 | 10_tests_testing_speedup_test |
11 | questionansweringpipeline - questionanswering - answering - questionasnwering - distilbertforquestionanswering | 138 | 11_questionansweringpipeline_questionanswering_answering_questionasnwering |
12 | ner - pipeline - pipelinener - pipelines - pipelineframework | 138 | 12_ner_pipeline_pipelinener_pipelines |
13 | deberta - debertav2 - debertav2initpy - debertatokenizer - distilbertmodel | 132 | 13_deberta_debertav2_debertav2initpy_debertatokenizer |
14 | onnxonnxruntime - onnx - onnxexport - 04onnxexport - 04onnxexportipynb | 110 | 14_onnxonnxruntime_onnx_onnxexport_04onnxexport |
15 | benchmark - benchmarks - accuracy - precision - comparison | 85 | 15_benchmark_benchmarks_accuracy_precision |
16 | labelsmoothingfactor - labelsmoothednllloss - labelsmoothing - labels - label | 79 | 16_labelsmoothingfactor_labelsmoothednllloss_labelsmoothing_labels |
17 | longformer - longformers - longform - longformerforqa - longformerlayer | 71 | 17_longformer_longformers_longform_longformerforqa |
18 | generationbeamsearchpy - generatebeamsearch - beamsearch - nonbeamsearch - beam | 60 | 18_generationbeamsearchpy_generatebeamsearch_beamsearch_nonbeamsearch |
19 | cachedir - cache - cachedpath - caching - cached | 58 | 19_cachedir_cache_cachedpath_caching |
20 | wav2vec2 - wav2vec - wav2vec20 - wav2vec2forctc - wav2vec2xlrswav2vec2 | 56 | 20_wav2vec2_wav2vec_wav2vec20_wav2vec2forctc |
21 | flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel | 52 | 21_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax |
22 | wandbproject - wandb - wandbcallback - wandbdisabled - wandbdisabledtrue | 49 | 22_wandbproject_wandb_wandbcallback_wandbdisabled |
23 | electra - electrapretrainedmodel - electraformaskedlm - electraformultiplechoice - electrafortokenclassification | 38 | 23_electra_electrapretrainedmodel_electraformaskedlm_electraformultiplechoice |
24 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 24 | 24_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased |
25 | notebook - notebooks - community - text - multilabel | 18 | 25_notebook_notebooks_community_text |
26 | dict - dictstr - returndict - parse - arguments | 18 | 26_dict_dictstr_returndict_parse |
27 | pplm - pr - deprecated - variable - ppl | 17 | 27_pplm_pr_deprecated_variable |
28 | isort - github - repo - version - setupcfg | 15 | 28_isort_github_repo_version |
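In every row of the table, the Label column is the topic ID followed by the first four keywords, joined with underscores. The sketch below reproduces that pattern from a table row. It is an illustration only: `make_label` and `parse_row` are hypothetical helpers, not BERTopic functions.

```python
def make_label(topic_id, keywords, n_words=4):
    """Join the topic id and its first n_words keywords with underscores."""
    return "_".join([str(topic_id)] + list(keywords[:n_words]))


def parse_row(row):
    """Split one 'id | kw - kw - ... | frequency | label |' table row."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    topic_id, keywords, frequency, label = cells
    return int(topic_id), keywords.split(" - "), int(frequency), label


row = "-1 | encoder - bert - tensorflow - decoder - output | 11 | -1_encoder_bert_tensorflow_decoder |"
topic_id, keywords, frequency, label = parse_row(row)

# Check that the rebuilt label matches the one stored in the table.
print(make_label(topic_id, keywords) == label)
```

Topic -1 is the bucket BERTopic uses for documents it treats as outliers, so its keywords are less coherent than those of the numbered topics.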
Training hyperparameters
- calculate_probabilities: False
- language: english
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: 30
- seed_topic_list: None
- top_n_words: 10
- verbose: True
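The names in this list match keyword arguments of the `BERTopic` constructor. As a rough sketch, they could be passed like this. The `build_topic_model` helper is hypothetical. The card does not state the embedding model or the UMAP and HDBSCAN settings, so fitting a model this way is not guaranteed to reproduce the topics above.

```python
# Hyperparameters copied from the list above.
BERTOPIC_KWARGS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 30,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
}


def build_topic_model():
    """Create an untrained BERTopic instance with the settings above."""
    from bertopic import BERTopic  # lazy import; requires `pip install bertopic`

    return BERTopic(**BERTOPIC_KWARGS)
```

A model built this way would then be trained with `build_topic_model().fit_transform(docs)`, where `docs` is a list of strings supplied by the user. The training corpus is not included in this card.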
Framework versions
- Numpy: 1.22.4
- HDBSCAN: 0.8.29
- UMAP: 0.5.3
- Pandas: 1.5.3
- Scikit-Learn: 1.2.2
- Sentence-transformers: 2.2.2
- Transformers: 4.29.2
- Numba: 0.56.4
- Plotly: 5.13.1
- Python: 3.10.11
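Loading a saved model can be sensitive to library versions, so it may help to compare the local environment with the list above. The sketch below uses only the standard library. The mapping from the names in the list to PyPI distribution names (for example `umap-learn` for UMAP) is an assumption, and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions from the list above, keyed by (assumed) PyPI distribution name.
PINNED = {
    "numpy": "1.22.4",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.29.2",
    "numba": "0.56.4",
    "plotly": "5.13.1",
}


def version_mismatches(pinned):
    """Map each package whose installed version differs to (expected, found)."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


for name, (expected, found) in version_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load. The output is only a starting point when debugging.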