metadata

tags:
  - bertopic
library_name: bertopic
pipeline_tag: text-classification

transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.

Usage

To use this model, please install BERTopic:

pip install -U bertopic

You can use the model as follows:

from bertopic import BERTopic
topic_model = BERTopic.load("ruanwz/transformers_issues_topics")

topic_model.get_topic_info()
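
Beyond inspecting the topic table, a loaded model can assign topics to new text. The sketch below is one possible way to do that, not part of the original card: the helper name `describe_documents` and its `top_k` parameter are made up for illustration. It relies only on `transform()` and `get_topic()` from BERTopic and assumes the embedding model referenced by the saved model can be loaded in your environment.

```python
from typing import List, Sequence, Tuple


def describe_documents(
    topic_model, docs: Sequence[str], top_k: int = 5
) -> List[Tuple[str, int, List[str]]]:
    """Assign each document to a topic and list that topic's top keywords.

    `topic_model` is expected to be a fitted or loaded BERTopic instance.
    This helper is hypothetical; it is not part of the BERTopic API.
    """
    # transform() returns (topics, probabilities); probabilities are ignored here.
    topics, _ = topic_model.transform(list(docs))

    results = []
    for doc, topic_id in zip(docs, topics):
        topic_id = int(topic_id)
        # get_topic() yields (word, score) pairs, or a falsy value if the ID is unknown.
        words = topic_model.get_topic(topic_id) or []
        keywords = [word for word, _score in words[:top_k]]
        results.append((doc, topic_id, keywords))
    return results
```

With `topic_model` loaded as shown above, calling `describe_documents(topic_model, ["Tokenizer returns wrong offsets for long inputs"])` returns a list of `(document, topic ID, keywords)` tuples. A topic ID of -1 means the document was treated as an outlier.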

Topic overview

Number of topics: 30 (topics 0 to 28, plus the outlier topic -1)
Number of training documents: 9000

Topic ID	Topic Keywords	Topic Frequency	Label
-1	tensorflow - pytorch - tf - pretrained - gpu	11	-1_tensorflow_pytorch_tf_pretrained
0	tokenizer - tokenizers - tokenize - tokenization - token	2089	0_tokenizer_tokenizers_tokenize_tokenization
1	gpt2 - gpt - gpt2doubleheadsmodel - gpt2lmheadmodel - distilgpt2	1471	1_gpt2_gpt_gpt2doubleheadsmodel_gpt2lmheadmodel
2	ner - seq2seqtrainer - seq2seq - runseq2seqpy - valueerror	856	2_ner_seq2seqtrainer_seq2seq_runseq2seqpy
3	modelcard - modelcards - card - model - cards	601	3_modelcard_modelcards_card_model
4	trainer - trainertrain - trainers - training - evaluateduringtraining	500	4_trainer_trainertrain_trainers_training
5	longformer - longformers - longformerformultiplechoice - tf - longformertokenizerfast	455	5_longformer_longformers_longformerformultiplechoice_tf
6	typos - typo - fix - correction - fixed	439	6_typos_typo_fix_correction
7	albertbasev2 - albertforpretraining - albert - albertformaskedlm - xlnet	407	7_albertbasev2_albertforpretraining_albert_albertformaskedlm
8	summarization - summaries - summary - text - nlp	351	8_summarization_summaries_summary_text
9	readmemd - readmetxt - readme - modelcard - file	333	9_readmemd_readmetxt_readme_modelcard
10	transformerscli - transformers - transformer - transformerxl - importerror	259	10_transformerscli_transformers_transformer_transformerxl
11	ci - testing - tests - test - slow	228	11_ci_testing_tests_test
12	questionansweringpipeline - questionanswering - answering - tfalbertforquestionanswering - questionasnwering	156	12_questionansweringpipeline_questionanswering_answering_tfalbertforquestionanswering
13	pipeline - pipelines - pipelinespy - pipelineexception - fixpipeline	137	13_pipeline_pipelines_pipelinespy_pipelineexception
14	onnxonnxruntime - onnx - onnxexport - 04onnxexport - 04onnxexportipynb	113	14_onnxonnxruntime_onnx_onnxexport_04onnxexport
15	benchmark - benchmarks - accuracy - evaluation - metrics	98	15_benchmark_benchmarks_accuracy_evaluation
16	huggingfacemaster - huggingfacetokenizers297 - huggingface - huggingfaces - huggingfacetransformers	81	16_huggingfacemaster_huggingfacetokenizers297_huggingface_huggingfaces
17	generationbeamsearchpy - generatebeamsearch - generatebeamsearchoutputs - beamsearch - nonbeamsearch	69	17_generationbeamsearchpy_generatebeamsearch_generatebeamsearchoutputs_beamsearch
18	wav2vec2 - wav2vec - wav2vec20 - wav2vec2forctc - wav2vec2xlrswav2vec2	56	18_wav2vec2_wav2vec_wav2vec20_wav2vec2forctc
19	flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel	53	19_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax
20	cachedir - cache - cachedpath - cached - caching	43	20_cachedir_cache_cachedpath_cached
21	notebook - notebooks - colab - community - t5	33	21_notebook_notebooks_colab_community
22	wandbproject - wandb - sagemaker - sagemakertrainer - wandbcallback	32	22_wandbproject_wandb_sagemaker_sagemakertrainer
23	bigbird - py7zr - tapas - tres - v4	32	23_bigbird_py7zr_tapas_tres
24	electra - electrapretrainedmodel - electraformaskedlm - electraformultiplechoice - electrafortokenclassification	28	24_electra_electrapretrainedmodel_electraformaskedlm_electraformultiplechoice
25	layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf	24	25_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased
26	isort - blackisortflake8 - github - repo - version	18	26_isort_blackisortflake8_github_repo
27	pplm - pr - deprecated - variable - ppl	14	27_pplm_pr_deprecated_variable
28	blenderbot - blenderbot3b - blenderbotforcausallm - chatbot - boto3	13	28_blenderbot_blenderbot3b_blenderbotforcausallm_chatbot
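
In the table above, each Label appears to be the topic ID followed by the first four keywords, joined with underscores. As a small illustration, the hypothetical helper below (`parse_label` is not a BERTopic function) splits such a label back into its parts. It assumes that no keyword itself contains an underscore, which holds for the labels listed here.

```python
from typing import List, Tuple


def parse_label(label: str) -> Tuple[int, List[str]]:
    """Split a label such as '0_tokenizer_tokenizers' into (topic ID, keywords)."""
    # Split at the first underscore only, so a leading '-1' stays intact.
    topic_id, _sep, rest = label.partition("_")
    keywords = rest.split("_") if rest else []
    return int(topic_id), keywords


topic_id, keywords = parse_label("-1_tensorflow_pytorch_tf_pretrained")
print(topic_id, keywords)
```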

Training hyperparameters

calculate_probabilities: False
language: english
low_memory: False
min_topic_size: 10
n_gram_range: (1, 1)
nr_topics: 30
seed_topic_list: None
top_n_words: 10
verbose: True
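
For reference, the hyperparameters above correspond to keyword arguments of the `BERTopic` constructor. The following is a minimal sketch of how a fresh, unfitted model with the same settings might be created; the names `TRAINING_PARAMS` and `build_untrained_model` are illustrative only. The card does not list the embedding model, the UMAP and HDBSCAN settings, or the training data, so fitting this model should not be expected to reproduce the exact topics shown here.

```python
# Hyperparameters copied from the list above.
TRAINING_PARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 30,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
}


def build_untrained_model():
    """Return an unfitted BERTopic instance configured with TRAINING_PARAMS."""
    # Imported lazily so the parameter dict can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(**TRAINING_PARAMS)
```

Calling `build_untrained_model().fit_transform(docs)` on your own list of documents would then train a new model with these settings.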

Framework versions

Numpy: 1.23.5
HDBSCAN: 0.8.33
UMAP: 0.5.3
Pandas: 1.5.3
Scikit-Learn: 1.2.2
Sentence-transformers: 2.2.2
Transformers: 4.31.0
Numba: 0.56.4
Plotly: 5.15.0
Python: 3.10.12