metadata
tags:
- bertopic
library_name: bertopic
pipeline_tag: text-classification
transformers_amazon_reviews_topics
This is a BERTopic model. BERTopic is a flexible, modular topic modeling framework that generates easily interpretable topics from large datasets.
Usage
To use this model, please install BERTopic:
pip install -U bertopic
You can use the model as follows:
from bertopic import BERTopic
topic_model = BERTopic.load("inesbattah/transformers_amazon_reviews_topics")
topic_model.get_topic_info()
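Beyond inspecting the topic table, a loaded model can assign topics to unseen reviews. The following is a minimal sketch, assuming the loaded model can embed raw text so that `transform` works without extra setup. `label_documents` and `demo` are hypothetical helper names, and the sample reviews are made up for illustration.

```python
def label_documents(topic_model, docs):
    """Return (document, topic_id, label) triples for each input document.

    Relies on two BERTopic members: `transform`, which returns the predicted
    topic ids together with probabilities, and `topic_labels_`, a dict that
    maps each topic id to its default label.
    """
    topics, _ = topic_model.transform(docs)
    labels = topic_model.topic_labels_
    return [
        (doc, int(topic), labels.get(int(topic), "unknown"))
        for doc, topic in zip(docs, topics)
    ]


def demo():
    # Imported lazily so that defining the helper does not require bertopic.
    from bertopic import BERTopic

    topic_model = BERTopic.load("inesbattah/transformers_amazon_reviews_topics")
    reviews = [  # made-up example reviews, not taken from the training data
        "The charger stopped working after two days.",
        "My dog chewed through this toy in an hour.",
    ]
    for doc, topic_id, label in label_documents(topic_model, reviews):
        print(topic_id, label, "|", doc)
```

Calling `demo()` downloads the model from the Hub, so it needs network access. A topic id of -1 means the document was treated as an outlier.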
Topic overview
- Number of topics: 30 (including the outlier topic -1)
- Number of training documents: 9000
Overview of all topics:
Topic ID | Topic Keywords | Topic Frequency | Label |
---|---|---|---|
-1 | amazon - quality - product - cheap - seller | 10 | -1_amazon_quality_product_cheap |
0 | refund - ordered - order - delivered - return | 3105 | 0_refund_ordered_order_delivered |
1 | charging - charger - charge - iphone - headphones | 1556 | 1_charging_charger_charge_iphone |
2 | wear - shoe - shoes - zipper - fit | 655 | 2_wear_shoe_shoes_zipper |
3 | shampoo - conditioner - scent - flavor - hair | 635 | 3_shampoo_conditioner_scent_flavor |
4 | protector - protectors - screen - case - cases | 452 | 4_protector_protectors_screen_case |
5 | color - colors - colored - blue - black | 293 | 5_color_colors_colored_blue |
6 | bottle - leak - leaking - bottles - leaks | 234 | 6_bottle_leak_leaking_bottles |
7 | lights - light - bulbs - flashlight - led | 209 | 7_lights_light_bulbs_flashlight |
8 | dog - toy - dogs - puppy - chewed | 205 | 8_dog_toy_dogs_puppy |
9 | chairs - chair - assemble - screws - assembling | 192 | 9_chairs_chair_assemble_screws |
10 | cheap - cheaply - material - quality - cost | 181 | 10_cheap_cheaply_material_quality |
11 | book - books - chapters - chapter - author | 180 | 11_book_books_chapters_chapter |
12 | hose - faucet - pump - valve - leak | 167 | 12_hose_faucet_pump_valve |
13 | pan - pans - pancakes - griddle - cook | 127 | 13_pan_pans_pancakes_griddle |
14 | dvd - dvds - disc - discs - cd | 114 | 14_dvd_dvds_disc_discs |
15 | fit - fitting - didnt - galaxy - samsung | 109 | 15_fit_fitting_didnt_galaxy |
16 | razor - shave - razors - reviews - blades | 97 | 16_razor_shave_razors_reviews |
17 | cartridges - cartridge - ink - printer - printing | 97 | 17_cartridges_cartridge_ink_printer |
18 | watches - watch - clocks - clock - battery | 88 | 18_watches_watch_clocks_clock |
19 | remote - remotes - buttons - button - programmed | 78 | 19_remote_remotes_buttons_button |
20 | seeds - seed - planted - planting - germinated | 43 | 20_seeds_seed_planted_planting |
21 | thermometer - temperature - temperatureoff - temps - temp | 36 | 21_thermometer_temperature_temperatureoff_temps |
22 | instructions - directions - how - installation - cheap | 34 | 22_instructions_directions_how_installation |
23 | pistol - holster - gun - glock19 - glock | 29 | 23_pistol_holster_gun_glock19 |
24 | tire - tires - tube - bike - wheel | 20 | 24_tire_tires_tube_bike |
25 | snoring - snorkeling - snore - snorkel - snores | 17 | 25_snoring_snorkeling_snore_snorkel |
26 | rugs - carpets - carpet - rug - floors | 13 | 26_rugs_carpets_carpet_rug |
27 | waterproof - wet - swimming - bathing - raining | 12 | 27_waterproof_wet_swimming_bathing |
28 | fan - squealing - noise - fans - quiet | 12 | 28_fan_squealing_noise_fans |
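The frequencies in the table can be checked against the stated number of training documents and turned into per-topic shares. This is a small sketch using only the counts listed above. `TOPIC_FREQUENCY` and `topic_shares` are names introduced here for illustration.

```python
# Topic frequencies copied from the table above (topic id -> document count).
TOPIC_FREQUENCY = {
    -1: 10, 0: 3105, 1: 1556, 2: 655, 3: 635, 4: 452, 5: 293, 6: 234,
    7: 209, 8: 205, 9: 192, 10: 181, 11: 180, 12: 167, 13: 127, 14: 114,
    15: 109, 16: 97, 17: 97, 18: 88, 19: 78, 20: 43, 21: 36, 22: 34,
    23: 29, 24: 20, 25: 17, 26: 13, 27: 12, 28: 12,
}


def topic_shares(freq):
    """Return each topic's fraction of all documents."""
    total = sum(freq.values())
    return {topic_id: count / total for topic_id, count in freq.items()}


shares = topic_shares(TOPIC_FREQUENCY)
total_docs = sum(TOPIC_FREQUENCY.values())
top_two = shares[0] + shares[1]

print(f"documents: {total_docs}")
print(f"topics 0 and 1 together: {top_two:.1%}")
print(f"outliers (topic -1): {shares[-1]:.2%}")
```

The counts sum to the 9000 training documents, and the two largest topics (refunds and orders, chargers and headphones) together cover slightly more than half of them.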
Training hyperparameters
- calculate_probabilities: False
- language: english
- low_memory: False
- min_topic_size: 10
- n_gram_range: (1, 1)
- nr_topics: 30
- seed_topic_list: None
- top_n_words: 10
- verbose: True
- zeroshot_min_similarity: 0.7
- zeroshot_topic_list: None
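The hyperparameters above correspond to keyword arguments of the `BERTopic` constructor. The sketch below shows how a comparable model might be configured, assuming the installed BERTopic version accepts all of these arguments. `HYPERPARAMS` and `build_model` are hypothetical names, and because the embedding, UMAP, and HDBSCAN sub-models are not recorded here, library defaults would apply and the result would not reproduce the published model exactly.

```python
# Values copied from the hyperparameter list above.
HYPERPARAMS = {
    "calculate_probabilities": False,
    "language": "english",
    "low_memory": False,
    "min_topic_size": 10,
    "n_gram_range": (1, 1),
    "nr_topics": 30,
    "seed_topic_list": None,
    "top_n_words": 10,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_model(**overrides):
    """Create an untrained BERTopic model with the listed settings.

    Keyword overrides replace individual settings, e.g. verbose=False.
    """
    # Imported lazily so the settings can be inspected without bertopic.
    from bertopic import BERTopic

    return BERTopic(**{**HYPERPARAMS, **overrides})
```

A model built this way would still need to be trained, for example with `build_model().fit_transform(docs)` on a list of review texts.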
Framework versions
- Numpy: 1.26.4
- HDBSCAN: 0.8.39
- UMAP: 0.5.7
- Pandas: 2.2.2
- Scikit-Learn: 1.5.2
- Sentence-transformers: 3.2.1
- Transformers: 4.44.2
- Numba: 0.60.0
- Plotly: 5.24.1
- Python: 3.10.12
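To compare a local environment against these versions, something like the following sketch could be used. It assumes the pip distribution names shown in `EXPECTED` (for example `umap-learn` for UMAP), which are not stated in this card. `version_mismatches` is a hypothetical helper built on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Versions copied from the list above; distribution names are assumptions.
EXPECTED = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.39",
    "umap-learn": "0.5.7",
    "pandas": "2.2.2",
    "scikit-learn": "1.5.2",
    "sentence-transformers": "3.2.1",
    "transformers": "4.44.2",
    "numba": "0.60.0",
    "plotly": "5.24.1",
}


def version_mismatches(expected, get_version=metadata.version):
    """Return {name: (wanted, found)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches
```

Calling `version_mismatches(EXPECTED)` returns an empty dict when every listed package is installed at exactly the listed version. A mismatch does not necessarily mean the model will fail to load.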