---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# BERTopic_gregoryroose

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("sdantonio/BERTopic_gregoryroose")

topic_model.get_topic_info()
```
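
Beyond inspecting the topic table, a loaded model can assign topics to unseen text. The following is a minimal sketch, not part of the original card: `assign_topics` and `main` are hypothetical helper names, and the example sentence is invented for illustration (the topic keywords suggest the training text was French). It relies on `BERTopic.transform`, which returns a pair of topic IDs and probabilities, and on `BERTopic.get_topic`, which returns the keywords of one topic.

```python
from __future__ import annotations

from typing import Iterable


def assign_topics(topic_model, docs: Iterable[str]) -> list[tuple[str, int]]:
    """Pair each document with the topic ID predicted by `topic_model`."""
    docs = list(docs)
    # transform() returns (topics, probabilities); only the topic IDs are used here.
    topics, _ = topic_model.transform(docs)
    return [(doc, int(topic)) for doc, topic in zip(docs, topics)]


def main() -> None:
    # Imported lazily so the helper above works without BERTopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic.load("sdantonio/BERTopic_gregoryroose")
    # Hypothetical input sentence, invented for this example.
    new_docs = ["Le confinement est prolongé en France."]
    for doc, topic in assign_topics(topic_model, new_docs):
        # Topic -1 is BERTopic's outlier topic.
        print(topic, topic_model.get_topic(topic), doc)
```

Calling `main()` downloads the model from the Hugging Face Hub, so it needs network access and the embedding backend stored with the model.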

## Topic overview

* Number of topics: 34 (including the outlier topic `-1`)
* Number of training documents: 5019

<details>
<summary>Click here for an overview of all topics.</summary>

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | france - confinement - paris - january - february | 11 | -1_france_confinement_paris_january |
| 0 | france - remplacement - chers - rn - chai | 726 | 0_france_remplacement_chers_rn |
| 1 | france - confinement - musulmans - migrants - january | 2122 | 1_france_confinement_musulmans_migrants |
| 2 | france - confinement - valeurs - adoxainfos - attestation | 383 | 2_france_confinement_valeurs_adoxainfos |
| 3 | france - fabriquer - inondation - pleutrerie - noirs | 215 | 3_france_fabriquer_inondation_pleutrerie |
| 4 | france - confinement - arbitre - migrants - victimes | 171 | 4_france_confinement_arbitre_migrants |
| 5 | caceuphonie - 1950s - gregoryroose - injusticepouradrien - lundiabstinence | 141 | 5_caceuphonie_1950s_gregoryroose_injusticepouradrien |
| 6 | france - confinement - morts - centrales - reconquete | 127 | 6_france_confinement_morts_centrales |
| 7 | islamofolie - confinement - lea_antiracisme - valeurs - actuelles | 126 | 7_islamofolie_confinement_lea_antiracisme_valeurs |
| 8 | france - musulmans - manifstopislamisme - migrants - gilets | 108 | 8_france_musulmans_manifstopislamisme_migrants |
| 9 | france - migrants - moire - saint - liberte | 101 | 9_france_migrants_moire_saint |
| 10 | nd675i9efw - xyc9onz4u6 - q6vpvgl3y8 - 7oft6k1w0t - fx1wrgvf62 | 83 | 10_nd675i9efw_xyc9onz4u6_q6vpvgl3y8_7oft6k1w0t |
| 11 | lf5oyn1fv1 - w6uo2fhmmv - xmsnt2c3i4 - rahvt7fxwq - coor4crsqz | 64 | 11_lf5oyn1fv1_w6uo2fhmmv_xmsnt2c3i4_rahvt7fxwq |
| 12 | confinement - tegner - saoudien - continental - dissolutionccif | 62 | 12_confinement_tegner_saoudien_continental |
| 13 | wfh0de8qtc - z7p2rmw7a0 - oi5af1xkjs - 4cgk8oudwa - xyrgxsovtb | 54 | 13_wfh0de8qtc_z7p2rmw7a0_oi5af1xkjs_4cgk8oudwa |
| 14 | france - onveutlesnoms - tvlofficiel - fabriquer - musulmans | 50 | 14_france_onveutlesnoms_tvlofficiel_fabriquer |
| 15 | france - confinement - racismes - migrants - aristocratique | 45 | 15_france_confinement_racismes_migrants |
| 16 | germains - gationnisme - subversif - flexitarien - gory | 44 | 16_germains_gationnisme_subversif_flexitarien |
| 17 | clairs - foehn - candidatures - liberte - sifflet | 39 | 17_clairs_foehn_candidatures_liberte |
| 18 | ugsbuxrvm1 - zfxy482pj7 - slpng_giants_fr - ton6pf8fjf - z9zsctlaw1 | 38 | 18_ugsbuxrvm1_zfxy482pj7_slpng_giants_fr_ton6pf8fjf |
| 19 | greenconservatism - morts - gardetonvoile - treligionopeace - bandes | 35 | 19_greenconservatism_morts_gardetonvoile_treligionopeace |
| 20 | ojim_france - boycottfrance - borisjohnson - claudechollet - confinementsaison2 | 32 | 20_ojim_france_boycottfrance_borisjohnson_claudechollet |
| 21 | paronym_france - caricatural - jesuistellementblancque - valeurs - pape | 31 | 21_paronym_france_caricatural_jesuistellementblancque_valeurs |
| 22 | france - hypocrite - occurrence - lune - militants | 29 | 22_france_hypocrite_occurrence_lune |
| 23 | france - jambanja - remplacement - sanctions - paul | 26 | 23_france_jambanja_remplacement_sanctions |
| 24 | boycottdecathlon - france - boycott - prochaines - accouchement | 25 | 24_boycottdecathlon_france_boycott_prochaines |
| 25 | sifaoui - boycottdecathlon - miyandoab - victimes - gory | 20 | 25_sifaoui_boycottdecathlon_miyandoab_victimes |
| 26 | morts - souvenirs - murmure - engueuler - thermale | 20 | 26_morts_souvenirs_murmure_engueuler |
| 27 | emmanuelmacron - vincent_vauclin - franceinter - veillez - prises | 18 | 27_emmanuelmacron_vincent_vauclin_franceinter_veillez |
| 28 | paronym_france - medinrecords - merciauxsoignants - krgkb7lfw3 - f9m1vggm6g | 18 | 28_paronym_france_medinrecords_merciauxsoignants_krgkb7lfw3 |
| 29 | azecmnrh5t0 - pxdoezdfwj - ubqgl9qtfs - bqwibkek5p - cvp5sypmkg | 15 | 29_azecmnrh5t0_pxdoezdfwj_ubqgl9qtfs_bqwibkek5p |
| 30 | france - pylo - january - cologisme - february | 14 | 30_france_pylo_january_cologisme |
| 31 | confinement - valeurs - adoxainfos - conflanssaintehonorine - iran | 13 | 31_confinement_valeurs_adoxainfos_conflanssaintehonorine |
| 32 | confinementjour6 - boycottdecathlon - exceptionnelle - marie - gilets | 13 | 32_confinementjour6_boycottdecathlon_exceptionnelle_marie |

</details>
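
The frequencies in the topic table add up to the 5019 training documents, so they can be read as shares of the corpus. The sketch below is an illustration added here, not output from the model: the `FREQUENCIES` list is copied by hand from the table, and `topic_shares` and `largest_topics` are hypothetical helper names.

```python
# Topic frequencies copied from the table, ordered from topic -1 to topic 32.
FREQUENCIES = [
    11, 726, 2122, 383, 215, 171, 141, 127, 126, 108, 101, 83,
    64, 62, 54, 50, 45, 44, 39, 38, 35, 32, 31, 29, 26, 25, 20,
    20, 18, 18, 15, 14, 13, 13,
]
COUNTS = dict(zip(range(-1, 33), FREQUENCIES))


def topic_shares(counts):
    """Return each topic's fraction of all documents."""
    total = sum(counts.values())
    return {topic: n / total for topic, n in counts.items()}


def largest_topics(counts, k=3):
    """Return the k topics with the largest share, largest first."""
    shares = topic_shares(counts)
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    for topic, share in largest_topics(COUNTS):
        print(f"topic {topic}: {share:.1%} of documents")
```

With a loaded model, the same counts should be obtainable from `topic_model.get_topic_freq()`, which returns a DataFrame with `Topic` and `Count` columns, instead of being copied by hand.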

## Training hyperparameters

* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: False
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None
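
The hyperparameters listed above correspond to keyword arguments of the `BERTopic` constructor. The following is a rough sketch of how an unfitted model with the same settings could be created. It is an assumption-laden illustration rather than the training script: the card does not name the embedding, UMAP, or HDBSCAN components or the training data, so this alone will not reproduce the published topics. `CARD_HYPERPARAMETERS` and `build_topic_model` are hypothetical names.

```python
# Values copied from the "Training hyperparameters" list on this card.
CARD_HYPERPARAMETERS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(**overrides):
    """Create an unfitted BERTopic instance with the card's settings.

    Keyword arguments in `overrides` replace or extend the card values,
    for example to supply an embedding model of your own choosing.
    """
    # Imported lazily; requires `pip install bertopic`.
    from bertopic import BERTopic

    return BERTopic(**{**CARD_HYPERPARAMETERS, **overrides})
```

Because `language` is `None`, you would probably want to pass an explicit `embedding_model` through `overrides` before fitting; which model the author used is not stated on this card.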

## Framework versions

* Numpy: 1.23.5
* HDBSCAN: 0.8.38.post1
* UMAP: 0.5.6
* Pandas: 2.2.2
* Scikit-Learn: 1.5.1
* Sentence-transformers: 3.0.1
* Transformers: 4.44.2
* Numba: 0.60.0
* Plotly: 5.24.0
* Python: 3.10.12
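
If loading the model fails or behaves unexpectedly, comparing your environment with the versions listed above is a reasonable first check. The sketch below uses only the standard library; `CARD_VERSIONS` and `version_mismatches` are hypothetical names introduced here, and the keys are the PyPI distribution names (UMAP is distributed as `umap-learn`). The Python interpreter version is not checked.

```python
from importlib import metadata

# Versions listed on this card, keyed by PyPI distribution name.
CARD_VERSIONS = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.38.post1",
    "umap-learn": "0.5.6",
    "pandas": "2.2.2",
    "scikit-learn": "1.5.1",
    "sentence-transformers": "3.0.1",
    "transformers": "4.44.2",
    "numba": "0.60.0",
    "plotly": "5.24.0",
}


def version_mismatches(expected):
    """Map each package that differs to (version on card, installed version or None)."""
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, installed) in version_mismatches(CARD_VERSIONS).items():
        print(f"{package}: card lists {wanted}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; it only narrows down where to look.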
|