---
tags:
- bertopic
library_name: bertopic
---

# BERTopic_Multimodal

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that generates easily interpretable topics from large datasets.

This model was trained on roughly 8000 images from Flickr (8091 in total) **without** the captions. It demonstrates how BERTopic can be used for topic modeling with images as the only input.

A few examples of generated topics:

!["multimodal.png"](multimodal.png)

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic[vision]
pip install -U safetensors
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("MaartenGr/BERTopic_Multimodal")

topic_model.get_topic_info()
```

You can view all information about a topic as follows:

```python
topic_id = 0  # any Topic ID from the overview below
topic_model.get_topic(topic_id, full=True)
```

## Topic overview

* Number of topics: 29 (including the outlier topic `-1`)
* Number of training documents: 8091

Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | while - air - the - in - jumping | 34 | -1_while_air_the_in |
| 0 | bench - sitting - people - woman - street | 1132 | 0_bench_sitting_people_woman |
| 1 | grass - running - dog - grassy - field | 1693 | 1_grass_running_dog_grassy |
| 2 | boy - girl - little - young - holding | 1290 | 2_boy_girl_little_young |
| 3 | dog - frisbee - running - water - mouth | 1224 | 3_dog_frisbee_running_water |
| 4 | skateboard - ramp - doing - trick - cement | 415 | 4_skateboard_ramp_doing_trick |
| 5 | snow - dog - covered - running - through | 309 | 5_snow_dog_covered_running |
| 6 | mountain - range - slope - standing - person | 205 | 6_mountain_range_slope_standing |
| 7 | pool - blue - boy - toy - water | 189 | 7_pool_blue_boy_toy |
| 8 | trail - bike - down - riding - person | 166 | 8_trail_bike_down_riding |
| 9 | snowboarder - mid - jump - air - after | 126 | 9_snowboarder_mid_jump_air |
| 10 | rock - climbing - up - wall - tree | 124 | 10_rock_climbing_up_wall |
| 11 | wave - surfboard - top - riding - of | 112 | 11_wave_surfboard_top_riding |
| 12 | beach - surfboard - people - with - walking | 102 | 12_beach_surfboard_people_with |
| 13 | jumping - track - horse - racquet - dog | 98 | 13_jumping_track_horse_racquet |
| 14 | snowboard - snow - girl - hill - slope | 95 | 14_snowboard_snow_girl_hill |
| 15 | game - being - football - played - professional | 91 | 15_game_being_football_played |
| 16 | soccer - kicking - team - ball - player | 80 | 16_soccer_kicking_team_ball |
| 17 | dirt - bike - person - rider - going | 75 | 17_dirt_bike_person_rider |
| 18 | soccer - boys - field - ball - kicking | 69 | 18_soccer_boys_field_ball |
| 19 | baseball - player - bat - swinging - into | 63 | 19_baseball_player_bat_swinging |
| 20 | basketball - up - and - playing - jumping | 59 | 20_basketball_up_and_playing |
| 21 | bird - body - flying - over - long | 55 | 21_bird_body_flying_over |
| 22 | motorcycle - track - race - racer - racing | 55 | 22_motorcycle_track_race_racer |
| 23 | boat - sitting - water - lake - hose | 53 | 23_boat_sitting_water_lake |
| 24 | street - riding - down - bike - woman | 52 | 24_street_riding_down_bike |
| 25 | paddle - suit - paddling - water - in | 49 | 25_paddle_suit_paddling_water |
| 26 | pair - scissors - stage - white - shirt | 42 | 26_pair_scissors_stage_white |
| 27 | tennis - court - racket - racquet - swinging | 34 | 27_tennis_court_racket_racquet |

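
To make the table easier to read, here is a minimal sketch in plain Python, using a few rows copied by hand from the table rather than loaded from the model. It illustrates the pattern the labels appear to follow (the topic ID plus the first four keywords, joined by underscores) and how the share of images assigned to the outlier topic `-1` can be computed. The helper `make_label` is hypothetical and not part of the BERTopic API.

```python
# A few rows copied from the topic table: (topic_id, keywords, frequency).
rows = [
    (-1, ["while", "air", "the", "in", "jumping"], 34),
    (0, ["bench", "sitting", "people", "woman", "street"], 1132),
    (1, ["grass", "running", "dog", "grassy", "field"], 1693),
    (27, ["tennis", "court", "racket", "racquet", "swinging"], 34),
]


def make_label(topic_id, keywords, n_words=4):
    """Hypothetical helper: topic ID and the first n_words keywords, joined by '_'."""
    return "_".join([str(topic_id)] + keywords[:n_words])


labels = [make_label(topic_id, keywords) for topic_id, keywords, _ in rows]

# Share of the 8091 training images that ended up in the outlier topic (-1).
n_training_documents = 8091
outlier_frequency = rows[0][2]
outlier_share = outlier_frequency / n_training_documents

for label in labels:
    print(label)
print(f"outlier share: {outlier_share:.2%}")
```

With the loaded model, the same information should be available from the DataFrame returned by `topic_model.get_topic_info()`.
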
## Training Procedure

The data was retrieved as follows:

```python
import os
import glob
import zipfile
import numpy as np
import pandas as pd
from tqdm import tqdm

from sentence_transformers import util

# Flickr 8k images
img_folder = 'photos/'
caps_folder = 'captions/'
if not os.path.exists(img_folder) or len(os.listdir(img_folder)) == 0:
    os.makedirs(img_folder, exist_ok=True)

    # Download the dataset if it does not exist yet
    if not os.path.exists('Flickr8k_Dataset.zip'):
        util.http_get('https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_Dataset.zip', 'Flickr8k_Dataset.zip')
        util.http_get('https://github.com/jbrownlee/Datasets/releases/download/Flickr8k/Flickr8k_text.zip', 'Flickr8k_text.zip')

    # Extract the images and the captions into their own folders
    for folder, file in [(img_folder, 'Flickr8k_Dataset.zip'), (caps_folder, 'Flickr8k_text.zip')]:
        with zipfile.ZipFile(file, 'r') as zf:
            for member in tqdm(zf.infolist(), desc='Extracting'):
                zf.extract(member, folder)

# Paths to all images; only these are passed to BERTopic
images = list(glob.glob('photos/Flicker8k_Dataset/*.jpg'))
```

Then, to perform topic modeling on multimodal data with BERTopic:

```python
from bertopic import BERTopic
from bertopic.backend import MultiModalBackend
from bertopic.representation import VisualRepresentation, KeyBERTInspired

# Image embedding model
embedding_model = MultiModalBackend('clip-ViT-B-32', batch_size=32)

# Image to text representation model
representation_model = {
    "Visual_Aspect": VisualRepresentation(image_to_text_model="nlpconnect/vit-gpt2-image-captioning", image_squares=True),
    "KeyBERT": KeyBERTInspired()
}

# Train our model with images only
topic_model = BERTopic(representation_model=representation_model, verbose=True, embedding_model=embedding_model, min_topic_size=30)
topics, probs = topic_model.fit_transform(documents=None, images=images)
```

As the code above shows, the input consists of images only. These images are clustered, and from each cluster a small subset of representative images is extracted.
The representative images are captioned using `"nlpconnect/vit-gpt2-image-captioning"` to generate a small textual dataset over which we can run c-TF-IDF and the additional `KeyBERTInspired` representation model.

## Training hyperparameters

* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 30
* n_gram_range: (1, 1)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 10
* verbose: True

## Framework versions

* Numpy: 1.23.5
* HDBSCAN: 0.8.29
* UMAP: 0.5.3
* Pandas: 1.5.3
* Scikit-Learn: 1.2.2
* Sentence-transformers: 2.2.2
* Transformers: 4.29.2
* Numba: 0.56.4
* Plotly: 5.14.1
* Python: 3.10.10
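
If loading the model fails or gives unexpected results, it may help to compare your environment with the versions listed above. The following is a minimal sketch using only the standard library. The mapping from the names above to PyPI distribution names (for example `umap-learn` for UMAP) is an assumption, and the helpers `installed_version` and `compare_versions` are hypothetical, not part of BERTopic.

```python
from importlib import metadata

# Versions from the list above, keyed by the assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "numpy": "1.23.5",
    "hdbscan": "0.8.29",
    "umap-learn": "0.5.3",
    "pandas": "1.5.3",
    "scikit-learn": "1.2.2",
    "sentence-transformers": "2.2.2",
    "transformers": "4.29.2",
    "numba": "0.56.4",
    "plotly": "5.14.1",
}


def installed_version(distribution):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected=EXPECTED_VERSIONS, lookup=installed_version):
    """Map each distribution to (expected, installed, exact_match)."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, matches) in compare_versions().items():
        status = "ok" if matches else "differs"
        print(f"{name}: expected {wanted}, installed {found} ({status})")
```

A mismatch does not necessarily mean the model will not load; the list only records the environment in which the model was trained.
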