---
tags:
- bertopic
library_name: bertopic
---

# ISSR_Visual_Model

This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

## Usage

To use this model, please install BERTopic:

```
pip install -U bertopic
```

You can use the model as follows:

```python
from bertopic import BERTopic
topic_model = BERTopic.load("D0men1c0/ISSR_Visual_Model")

topic_model.get_topic_info()
```

You can make predictions as follows:

```python
import math

import matplotlib.pyplot as plt
import pandas as pd
from PIL import Image

val_labels = [...]  # list of captions
val_images = [...]  # list of image paths, aligned with val_labels

# Assign a topic to each (caption, image) pair
topic, _ = topic_model.transform(val_labels, images=val_images)

# Collect the topic info row for every prediction
all_topic_info = [topic_model.get_topic_info(t) for t in topic]
all_prediction_info = pd.concat(all_topic_info, ignore_index=True)

# Visualize predictions:
sample_images = 100
n_images = min(sample_images, len(val_images))
n_cols = 4
n_rows = math.ceil(n_images / n_cols)

fig, axes = plt.subplots(n_rows, n_cols, figsize=(15, n_rows * 3))
axes = axes.flatten()

for i, (path, (_, row)) in enumerate(zip(val_images[:n_images], all_prediction_info.iterrows())):
    ax = axes[i]
    ax.imshow(Image.open(path))
    ax.axis('off')
    ax.set_title(f"Topic {row['Topic']}: {row['KeyBERTInspired'][0]}")

# Hide unused axes
for j in range(n_images, len(axes)):
    axes[j].axis('off')

plt.tight_layout()
plt.show()
```

## Topic overview

* Number of topics: 5
* Number of training documents: 2997

Overview of all topics:

| Topic ID | Topic Keywords | Topic Frequency | Label |
|----------|----------------|-----------------|-------|
| -1 | drug - people - gun -  -  | 151 | -1_drug_people_gun_ |
| 0 | gun - people - drug -  -  | 2152 | 0_gun_people_drug_ |
| 1 | drug - gun -  -  -  | 342 | 1_drug_gun__ |
| 2 | people - gun -  -  -  | 287 | 2_people_gun__ |
| 3 | people - gun - drug -  -  | 65 | 3_people_gun_drug_ |
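
The frequencies in the table can be cross-checked against the stated number of training documents. The following is a minimal sketch that sums the counts and computes each topic's share; the variable names are illustrative, and the numbers are copied by hand from the table rather than read from the model. Topic -1 is BERTopic's outlier topic, so it is counted among the five topics.

```python
# Topic frequencies copied from the table above (topic ID -> document count).
topic_frequency = {-1: 151, 0: 2152, 1: 342, 2: 287, 3: 65}

# The counts should add up to the number of training documents.
total = sum(topic_frequency.values())

# Share of the training documents assigned to each topic, in percent.
shares = {topic: round(100 * count / total, 1)
          for topic, count in topic_frequency.items()}

print(f"Total documents: {total}")
for topic, share in shares.items():
    print(f"Topic {topic}: {share}%")
```

The counts add up to 2997, and topic 0 alone covers roughly 72% of the training documents.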

## Training hyperparameters

* calculate_probabilities: False
* language: None
* low_memory: False
* min_topic_size: 50
* n_gram_range: (1, 3)
* nr_topics: None
* seed_topic_list: None
* top_n_words: 5
* verbose: True
* zeroshot_min_similarity: 0.7
* zeroshot_topic_list: None

## Framework versions

* Numpy: 1.26.4
* HDBSCAN: 0.8.36
* UMAP: 0.5.6
* Pandas: 2.2.2
* Scikit-Learn: 1.4.1.post1
* Sentence-transformers: 3.0.1
* Transformers: 4.39.3
* Numba: 0.60.0
* Plotly: 5.22.0
* Python: 3.12.4
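
Loading a saved model may behave differently when the installed libraries differ from the versions listed above. The following is a minimal sketch for comparing a local environment against that list using only the standard library. The names `EXPECTED` and `compare_versions` are hypothetical helpers, and the dictionary keys assume the usual PyPI distribution names (for example `umap-learn` for UMAP).

```python
from importlib import metadata

# Versions copied from "Framework versions". The keys are assumed PyPI
# distribution names, not names taken from the model card itself.
EXPECTED = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.36",
    "umap-learn": "0.5.6",
    "pandas": "2.2.2",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "3.0.1",
    "transformers": "4.39.3",
    "numba": "0.60.0",
    "plotly": "5.22.0",
}


def compare_versions(expected):
    """Return {package: (listed, installed)} for every package that differs."""
    mismatches = {}
    for name, listed in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != listed:
            mismatches[name] = (listed, installed)
    return mismatches


if __name__ == "__main__":
    for name, (listed, installed) in compare_versions(EXPECTED).items():
        print(f"{name}: card lists {listed}, found {installed or 'not installed'}")
```

A mismatch does not necessarily mean the model will fail to load; the output is only a starting point when debugging environment differences.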