---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# transformers_issues_topics

This is a BERTopic model. BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```sh
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic

topic_model = BERTopic.load("asoria/transformers_issues_topics")

topic_model.get_topic_info()
```
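Once loaded, the model can also assign topics to new, unseen documents. A minimal sketch, assuming the standard BERTopic `transform` API; the two issue titles are hypothetical:

```python
# Hypothetical GitHub issue titles; any list of strings works.
docs = [
    "CUDA out of memory when calling trainer.train()",
    "Fix typo in the question answering pipeline docstring",
]

# transform() returns one topic ID per document, plus probabilities
# (which may be None here, since the model was trained with
# calculate_probabilities=False).
topics, probs = topic_model.transform(docs)
print(topics)
```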

## Topic overview

* Number of topics: 30
* Number of training documents: 9000

<details>
  <summary>Click here for an overview of all topics.</summary>
| Topic ID | Topic Keywords | Topic Frequency | Label |
|---|---|---|---|
| -1 | pytorch - tensorflow - bert - tf - pretrained | 15 | -1_pytorch_tensorflow_bert_tf |
| 0 | bert - bertforsequenceclassification - berttokenizer - bart - batchencodeplus | 2321 | 0_bert_bertforsequenceclassification_berttokenizer_bart |
| 1 | cuda - memory - trainertrain - tensorflow - trainer | 1554 | 1_cuda_memory_trainertrain_tensorflow |
| 2 | transformerscli - transformers - transformer - importerror - transformerxl | 882 | 2_transformerscli_transformers_transformer_importerror |
| 3 | modelcard - modelcards - card - model - models | 490 | 3_modelcard_modelcards_card_model |
| 4 | gpt2 - gpt2tokenizer - gpt2xl - gpt2tokenizerfast - gpt2model | 462 | 4_gpt2_gpt2tokenizer_gpt2xl_gpt2tokenizerfast |
| 5 | attributeerror - typeerror - valueerror - runtimeerror - indexerror | 437 | 5_attributeerror_typeerror_valueerror_runtimeerror |
| 6 | typos - typo - doc - docstring - fix | 336 | 6_typos_typo_doc_docstring |
| 7 | t5 - t5model - t5base - tf - t5large | 298 | 7_t5_t5model_t5base_tf |
| 8 | readmemd - readmetxt - readme - modelcard - file | 270 | 8_readmemd_readmetxt_readme_modelcard |
| 9 | ci - testing - tests - test - speedup | 254 | 9_ci_testing_tests_test |
| 10 | s2s - s2sdistill - s2t - s2strainer - exampless2s | 245 | 10_s2s_s2sdistill_s2t_s2strainer |
| 11 | glue - gluepy - glueconvertexamplestofeatures - roberta - huggingfacetransformers | 214 | 11_glue_gluepy_glueconvertexamplestofeatures_roberta |
| 12 | ner - pipeline - pipelines - nerpipeline - fillmaskpipeline | 158 | 12_ner_pipeline_pipelines_nerpipeline |
| 13 | rag - ragtokenforgeneration - ragsequenceforgeneration - clean - tests | 153 | 13_rag_ragtokenforgeneration_ragsequenceforgeneration_clean |
| 14 | questionansweringpipeline - questionanswering - answering - tfalbertforquestionanswering - questionasnwering | 143 | 14_questionansweringpipeline_questionanswering_answering_tfalbertforquestionanswering |
| 15 | onnx - 04onnxexport - 04onnxexportipynb - aionnx - sphynx | 131 | 15_onnx_04onnxexport_04onnxexportipynb_aionnx |
| 16 | longformer - longformers - longform - longformerlayer - longformermodel | 104 | 16_longformer_longformers_longform_longformerlayer |
| 17 | labelsmoothednllloss - label - labelsmoothingfactor - labels - labelsmoothing | 76 | 17_labelsmoothednllloss_label_labelsmoothingfactor_labels |
| 18 | benchmark - benchmarking - benchmarks - accuracy - evaluation | 73 | 18_benchmark_benchmarking_benchmarks_accuracy |
| 19 | wav2vec2 - wav2vec - wav2vec20 - wav2vec2forctc - wav2vec2xlrswav2vec2 | 67 | 19_wav2vec2_wav2vec_wav2vec20_wav2vec2forctc |
| 20 | flax - flaxelectraformaskedlm - flaxelectraforpretraining - flaxjax - flaxelectramodel | 51 | 20_flax_flaxelectraformaskedlm_flaxelectraforpretraining_flaxjax |
| 21 | configpath - configs - config - configuration - modelconfigs | 49 | 21_configpath_configs_config_configuration |
| 22 | logging - logs - log - logger - loghistory | 40 | 22_logging_logs_log_logger |
| 23 | cachedir - cache - cachedpath - caching - cached | 38 | 23_cachedir_cache_cachedpath_caching |
| 24 | wandbproject - wandb - sagemaker - sagemakertrainer - wandbcallback | 36 | 24_wandbproject_wandb_sagemaker_sagemakertrainer |
| 25 | notebook - notebooks - community - colab - t5 | 33 | 25_notebook_notebooks_community_colab |
| 26 | electra - electrapretrainedmodel - electraformaskedlm - electraformultiplechoice - electrafortokenclassification | 30 | 26_electra_electrapretrainedmodel_electraformaskedlm_electraformultiplechoice |
| 27 | layoutlm - layout - layoutlmtokenizer - layoutlmbaseuncased - tf | 25 | 27_layoutlm_layout_layoutlmtokenizer_layoutlmbaseuncased |
| 28 | pplm - pr - deprecated - variable - ppl | 15 | 28_pplm_pr_deprecated_variable |

</details>
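Each row of the table can also be inspected programmatically: `get_topic` returns a topic's top keywords with their c-TF-IDF weights. A short sketch, reusing the `topic_model` loaded in the Usage section:

```python
# Top keywords and c-TF-IDF weights for topic 0,
# the bert/berttokenizer cluster from the table above.
for word, weight in topic_model.get_topic(0):
    print(f"{word:45s} {weight:.4f}")
```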

## Training hyperparameters

* calculate_probabilities: False
* language: english
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: 30
* seed_topic_list: None
* top_n_words: 10
* verbose: True
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None
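These settings correspond to arguments of the `BERTopic` constructor. A hedged sketch of how a comparable model could be re-trained with them; `docs` is a placeholder list of strings, all sub-models (embeddings, UMAP, HDBSCAN) are left at their defaults, and the `zeroshot_*` arguments assume a BERTopic release that supports zero-shot topic modeling (>= 0.16):

```python
from bertopic import BERTopic

topic_model = BERTopic(
    calculate_probabilities=False,
    language="english",
    low_memory=False,
    min_topic_size=10,
    n_gram_range=(1, 1),
    nr_topics=30,
    seed_topic_list=None,
    top_n_words=10,
    verbose=True,
    zeroshot_min_similarity=0.7,
    zeroshot_topic_list=None,
)

# docs would be the ~9000 GitHub issue texts used for training.
topics, probs = topic_model.fit_transform(docs)
```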

## Framework versions

* Numpy: 1.26.4
* HDBSCAN: 0.8.38.post1
* UMAP: 0.5.6
* Pandas: 2.1.4
* Scikit-Learn: 1.5.2
* Sentence-transformers: 3.1.1
* Transformers: 4.44.2
* Numba: 0.60.0
* Plotly: 5.24.1
* Python: 3.10.12