MARTINI_enrich_BERTopic_GeneralMCNews
This is a BERTopic model. BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("AIDA-UPM/MARTINI_enrich_BERTopic_GeneralMCNews")
topic_model.get_topic_info()
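Beyond the summary table returned by `get_topic_info()`, the loaded model can be queried topic by topic. The following is a minimal sketch rather than the authors' workflow: `largest_topics` is a hypothetical helper, and it assumes the loaded object exposes BERTopic's `topic_sizes_` mapping and `get_topic` method. It skips the outlier topic `-1` unless asked to include it.

```python
def largest_topics(topic_model, n=5, include_outliers=False):
    """Return (topic_id, size, keywords) tuples for the n largest topics.

    Hypothetical helper, not part of BERTopic. It relies on:
      - topic_model.topic_sizes_: dict mapping topic id -> document count
      - topic_model.get_topic(topic_id): list of (word, score) pairs
    """
    sizes = {
        topic_id: count
        for topic_id, count in topic_model.topic_sizes_.items()
        if include_outliers or topic_id != -1
    }
    # Sort by document count, largest first, and keep the top n.
    ranked = sorted(sizes.items(), key=lambda item: item[1], reverse=True)[:n]

    summary = []
    for topic_id, count in ranked:
        # get_topic returns False for an unknown id, so fall back to an empty list.
        words = topic_model.get_topic(topic_id) or []
        summary.append((topic_id, count, [word for word, _score in words[:5]]))
    return summary
```

With the model loaded as shown above, `largest_topics(topic_model, n=3)` would be expected to return topics 0, 1 and 2, provided the stored sizes match the frequencies listed in this card.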
Topic overview
- Number of topics: 38 (including the outlier topic -1)
- Number of training documents: 3780
Overview of all topics:
Topic ID | Topic Keywords | Topic Frequency | Label
---|---|---|---
-1 | biden - fbi - states - vaccine - ukraine | 20 | -1_biden_fbi_states_vaccine |
0 | ballots - maricopa - rigged - recount - republican | 1715 | 0_ballots_maricopa_rigged_recount |
1 | gaza - netanyahu - jerusalem - airstrikes - terrorists | 198 | 1_gaza_netanyahu_jerusalem_airstrikes |
2 | gunman - victims - active - nypd - nashville | 149 | 2_gunman_victims_active_nypd |
3 | twitter - dorsey - banned - paypal - starlink | 139 | 3_twitter_dorsey_banned_paypal |
4 | pelosi - congressman - mccarthy - republicans - impeachment | 111 | 4_pelosi_congressman_mccarthy_republicans |
5 | vaccines - paxlovid - myocarditis - mrna - transfusion | 101 | 5_vaccines_paxlovid_myocarditis_mrna |
6 | biden - fbi - whistleblowers - bribery - subpoena | 87 | 6_biden_fbi_whistleblowers_bribery |
7 | zelensky - ukrainians - volodymyr - belarus - medvedev | 75 | 7_zelensky_ukrainians_volodymyr_belarus |
8 | blackouts - electricity - shortages - california - surging | 71 | 8_blackouts_electricity_shortages_california |
9 | arrested - trafficking - rapist - indicted - investigators | 69 | 9_arrested_trafficking_rapist_indicted |
10 | bolsonaro - petrobras - paulo - santos - janeiro | 64 | 10_bolsonaro_petrobras_paulo_santos |
11 | doj - subpoenaed - declassified - bannon - dismissed | 63 | 11_doj_subpoenaed_declassified_bannon |
12 | jpmorgan - billionaire - jeffrey - ghislaine - zuckerman | 57 | 12_jpmorgan_billionaire_jeffrey_ghislaine |
13 | misgendered - lgbtq - minors - school - genitals | 55 | 13_misgendered_lgbtq_minors_school |
14 | pandemics - nipah - influenza - h5n1 - poliovirus | 49 | 14_pandemics_nipah_influenza_h5n1 |
15 | kardashian - balenciaga - megyn - skky - milo | 46 | 15_kardashian_balenciaga_megyn_skky |
16 | subliminal - satanists - pentagram - reptilian - symbolism | 46 | 16_subliminal_satanists_pentagram_reptilian |
17 | imran - islamabad - peshawar - faizabad - overthrown | 46 | 17_imran_islamabad_peshawar_faizabad |
18 | fires - explosion - haverstraw - warehouse - massive | 45 | 18_fires_explosion_haverstraw_warehouse |
19 | migrants - border - texas - yuma - smuggling | 45 | 19_migrants_border_texas_yuma |
20 | ufo - norad - airships - spyballoon - surveillance | 44 | 20_ufo_norad_airships_spyballoon |
21 | additives - tyson - chicken - mcdonald - antibiotics | 43 | 21_additives_tyson_chicken_mcdonald |
22 | taiwan - pelosi - spratly - squadrons - pingtan | 40 | 22_taiwan_pelosi_spratly_squadrons |
23 | china - lockdowns - shenzhen - pcr - robotaxi | 37 | 23_china_lockdowns_shenzhen_pcr |
24 | climate - propagandized - experts - robodogs - honeybee | 36 | 24_climate_propagandized_experts_robodogs |
25 | bancorporation - fdic - depositors - plummet - crisis | 35 | 25_bancorporation_fdic_depositors_plummet |
26 | unvaxxed - novavax - mandates - reinstated - plaintiffs | 34 | 26_unvaxxed_novavax_mandates_reinstated |
27 | derailment - hazmat - ohio - dioxin - spills | 32 | 27_derailment_hazmat_ohio_dioxin |
28 | biden - psaki - dictator - deepfaked - granddaughter | 30 | 28_biden_psaki_dictator_deepfaked |
29 | fauci - remdesivir - usaid - rfk - collusion | 29 | 29_fauci_remdesivir_usaid_rfk |
30 | indicted - trump - prosecutorial - manhattan - jurors | 29 | 30_indicted_trump_prosecutorial_manhattan |
31 | brics - rubles - rupee - currencies - yuan | 28 | 31_brics_rubles_rupee_currencies |
32 | zaporizhzhya - donetsk - mykolaiv - nuclear - kryvyi | 28 | 32_zaporizhzhya_donetsk_mykolaiv_nuclear |
33 | aircraft - crashed - landed - turbulence - runway | 22 | 33_aircraft_crashed_landed_turbulence |
34 | jfk - assassinating - snowden - julian - mossad | 21 | 34_jfk_assassinating_snowden_julian |
35 | zuckerberg - meta - instagram - lawsuit - shareholder | 21 | 35_zuckerberg_meta_instagram_lawsuit |
36 | desantis - governor - bush - gitmo - prosecute | 20 | 36_desantis_governor_bush_gitmo |
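The frequencies in the table are heavily skewed toward topic 0. As a quick check that needs no model download, the sketch below copies the "Topic Frequency" column into a dictionary and computes each topic's share of the training documents. The names `frequencies` and `share` are illustrative only.

```python
# Document counts copied from the "Topic Frequency" column, keyed by topic ID.
frequencies = {
    -1: 20, 0: 1715, 1: 198, 2: 149, 3: 139, 4: 111, 5: 101, 6: 87,
    7: 75, 8: 71, 9: 69, 10: 64, 11: 63, 12: 57, 13: 55, 14: 49,
    15: 46, 16: 46, 17: 46, 18: 45, 19: 45, 20: 44, 21: 43, 22: 40,
    23: 37, 24: 36, 25: 35, 26: 34, 27: 32, 28: 30, 29: 29, 30: 29,
    31: 28, 32: 28, 33: 22, 34: 21, 35: 21, 36: 20,
}

total = sum(frequencies.values())


def share(topic_id):
    """Fraction of all training documents assigned to the given topic."""
    return frequencies[topic_id] / total


print(f"topics: {len(frequencies)}")            # 38
print(f"total documents: {total}")              # 3780
print(f"topic 0 share: {share(0):.1%}")         # 45.4%
print(f"outlier (-1) share: {share(-1):.1%}")   # 0.5%
```

The counts sum to the 3780 training documents reported above, and topic 0 alone accounts for a little under half of them.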
Training hyperparameters
- calculate_probabilities: True
- language: None
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: None
- seed_topic_list: None
- top_n_words: 10
- verbose: False
- zeroshot_min_similarity: 0.7
- zeroshot_topic_list: None
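Each of the settings listed above corresponds to a keyword argument of the `BERTopic` constructor. As a hedged sketch, assuming a BERTopic release recent enough to accept the zero-shot parameters, they can be collected in a dictionary and passed through. `HYPERPARAMETERS` and `build_untrained_model` are hypothetical names, and this does not reproduce the published model: the embedding, UMAP and HDBSCAN configurations are not stated in this card.

```python
# Hyperparameters as listed in this card, expressed as BERTopic constructor keywords.
HYPERPARAMETERS = {
    "calculate_probabilities": True,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_untrained_model(**overrides):
    """Create a fresh, unfitted BERTopic instance with the listed settings.

    Hypothetical helper. The import is deferred so the settings can be
    inspected even where BERTopic is not installed. Keyword overrides
    replace the listed values.
    """
    from bertopic import BERTopic

    return BERTopic(**{**HYPERPARAMETERS, **overrides})
```

For example, `build_untrained_model(verbose=True)` would return an unfitted model that logs progress when `fit_transform` is later called on a corpus.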
Framework versions
- Numpy: 1.26.4
- HDBSCAN: 0.8.40
- UMAP: 0.5.7
- Pandas: 2.2.3
- Scikit-Learn: 1.5.2
- Sentence-transformers: 3.3.1
- Transformers: 4.46.3
- Numba: 0.60.0
- Plotly: 5.24.1
- Python: 3.10.12
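Serialized topic models can be sensitive to library versions, so it may help to compare the local environment with the versions listed above before loading. The sketch below uses only the standard library. The mapping from the names in this card to PyPI distribution names (for example "UMAP" to `umap-learn`) is an assumption, and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
# The distribution names are assumed, e.g. "UMAP" -> "umap-learn".
EXPECTED_VERSIONS = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.40",
    "umap-learn": "0.5.7",
    "pandas": "2.2.3",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.3.1",
    "transformers": "4.46.3",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def version_mismatches(expected=EXPECTED_VERSIONS):
    """Return {name: (expected, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed in this environment
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


for name, (wanted, installed) in version_mismatches().items():
    print(f"{name}: card lists {wanted}, installed {installed}")
```

A mismatch does not necessarily prevent the model from loading; the output is only a starting point when diagnosing load or inference errors.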