---
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
---

# xsum_6789_3000_1500_test

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("KingKazma/xsum_6789_3000_1500_test")

topic_model.get_topic_info()
```

## Topic overview

* Number of topics: 27
* Number of training documents: 1500

Overview of all topics. Topic -1 is BERTopic's outlier topic, which collects documents that were not assigned to any cluster.

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | said - people - would - also - one | 10 | -1_said_people_would_also |
| 0 | police - said - court - mr - found | 508 | 0_police_said_court_mr |
| 1 | mr - us - said - president - military | 144 | 1_mr_us_said_president |
| 2 | sport - team - world - race - champion | 136 | 2_sport_team_world_race |
| 3 | wales - vote - party - said - labour | 96 | 3_wales_vote_party_said |
| 4 | foul - win - right - box - half | 84 | 4_foul_win_right_box |
| 5 | care - nhs - tax - said - health | 62 | 5_care_nhs_tax_said |
| 6 | league - club - season - appearance - football | 50 | 6_league_club_season_appearance |
| 7 | wicket - cricket - england - ball - test | 36 | 7_wicket_cricket_england_ball |
| 8 | rate - share - bank - growth - price | 35 | 8_rate_share_bank_growth |
| 9 | rugby - england - wales - player - ospreys | 31 | 9_rugby_england_wales_player |
| 10 | school - teacher - education - child - council | 29 | 10_school_teacher_education_child |
| 11 | road - crash - police - collision - barrier | 27 | 11_road_crash_police_collision |
| 12 | fire - said - rescue - plane - injured | 27 | 12_fire_said_rescue_plane |
| 13 | music - radio - band - singer - show | 27 | 13_music_radio_band_singer |
| 14 | passenger - airport - railway - said - scotrail | 24 | 14_passenger_airport_railway_said |
| 15 | museum - painting - said - collection - royal | 23 | 15_museum_painting_said_collection |
| 16 | road - flooding - weather - beach - rain | 22 | 16_road_flooding_weather_beach |
| 17 | eu - trade - european - bank - deal | 19 | 17_eu_trade_european_bank |
| 18 | cell - cancer - ebola - disease - human | 18 | 18_cell_cancer_ebola_disease |
| 19 | temperature - dr - glacier - heat - researcher | 16 | 19_temperature_dr_glacier_heat |
| 20 | bitcoin - software - android - superfish - battery | 15 | 20_bitcoin_software_android_superfish |
| 21 | club - football - league - manager - rodgers | 14 | 21_club_football_league_manager |
| 22 | zwolle - pec - ajax - zidane - real | 13 | 22_zwolle_pec_ajax_zidane |
| 23 | film - best - actress - role - gillan | 12 | 23_film_best_actress_role |
| 24 | women - mexico - denmark - footed - romania | 12 | 24_women_mexico_denmark_footed |
| 25 | dairy - comedy - uk - export - food | 10 | 25_dairy_comedy_uk_export |

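
Each label in the table appears to follow one pattern: the topic ID followed by the first four keywords, joined with underscores. The sketch below illustrates this with a few rows copied from the table, and also checks that the listed frequencies add up to the 1500 training documents. It uses only the standard library; `ROWS`, `FREQUENCIES`, and `default_label` are hypothetical names introduced for illustration and are not part of BERTopic's API.

```python
# A few rows copied from the table: (topic_id, keywords, label).
ROWS = [
    (-1, "said - people - would - also - one", "-1_said_people_would_also"),
    (0, "police - said - court - mr - found", "0_police_said_court_mr"),
    (7, "wicket - cricket - england - ball - test", "7_wicket_cricket_england_ball"),
    (25, "dairy - comedy - uk - export - food", "25_dairy_comedy_uk_export"),
]

# Topic frequencies for topics -1 through 25, in table order.
FREQUENCIES = [
    10, 508, 144, 136, 96, 84, 62, 50, 36, 35, 31, 29, 27, 27,
    27, 24, 23, 22, 19, 18, 16, 15, 14, 13, 12, 12, 10,
]


def default_label(topic_id: int, keywords: str, n_words: int = 4) -> str:
    """Hypothetical helper: rebuild a label from the ID and the first n_words keywords."""
    words = [w.strip() for w in keywords.split(" - ")]
    return "_".join([str(topic_id)] + words[:n_words])


# Every sampled row reproduces its label from its ID and keywords.
for topic_id, keywords, label in ROWS:
    assert default_label(topic_id, keywords) == label

# The table lists 27 topics whose frequencies cover all training documents.
assert len(FREQUENCIES) == 27
assert sum(FREQUENCIES) == 1500
```
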
## Training hyperparameters

* calculate_probabilities: True
* language: english
* low_memory: False
* min_topic_size: 10
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: False

## Framework versions

* Numpy: 1.22.4
* HDBSCAN: 0.8.33
* UMAP: 0.5.3
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 2.2.2
* Transformers: 4.31.0
* Numba: 0.57.1
* Plotly: 5.13.1
* Python: 3.10.12
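
The entries under "Training hyperparameters" correspond to keyword arguments of the `BERTopic` constructor. As a minimal sketch, a model with the same settings could be instantiated as shown below. It is an assumption that the original training passed them this way; the embedding model, the training documents, and any UMAP or HDBSCAN settings are not stated in this card, so they are left at the library defaults here. `HYPERPARAMS` and `build_topic_model` are illustrative names.

```python
# Hyperparameters copied from the "Training hyperparameters" list.
HYPERPARAMS = {
    "calculate_probabilities": True,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": False,
}


def build_topic_model():
    """Create an untrained BERTopic model with the settings above (illustrative)."""
    # Imported lazily so the settings can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMS)


# Usage (requires `pip install -U bertopic` and your own list of documents):
# topic_model = build_topic_model()
# topics, probs = topic_model.fit_transform(docs)
```

Because the embedding model and random seeds are unknown, a model trained this way would not be expected to reproduce the exact topics listed in this card.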