
	---
	tags:
	- bertopic
	library_name: bertopic
	---

	# ISSR_Visual_Model

	This is a [BERTopic](https://github.com/MaartenGr/BERTopic) model.
	BERTopic is a flexible and modular topic modeling framework that allows for the generation of easily interpretable topics from large datasets.

	## Usage

	To use this model, please install BERTopic:

	```
	pip install -U bertopic
	```

	You can use the model as follows:

	```python
	from bertopic import BERTopic
	topic_model = BERTopic.load("D0men1c0/ISSR_Visual_Model")

	topic_model.get_topic_info()
	```

	You can make predictions on caption/image pairs as follows:

	```python
	import math

	import matplotlib.pyplot as plt
	import pandas as pd
	from PIL import Image

	val_labels = [...]  # list of captions, one per image
	val_images = [...]  # list of image file paths, aligned with val_labels

	# Assign each caption/image pair to a topic
	topics, _ = topic_model.transform(val_labels, images=val_images)
	# Collect one topic-info row per prediction
	all_topic_info = [topic_model.get_topic_info(t) for t in topics]
	all_prediction_info = pd.concat(all_topic_info, ignore_index=True)

	# Visualize predictions:
	sample_images = 100
	n_images = min(sample_images, len(val_images))
	n_cols = 4
	n_rows = math.ceil(n_images / n_cols)

	fig, axes = plt.subplots(n_rows, n_cols, figsize=(15, n_rows * 3))
	axes = axes.flatten()

	for i, (path, (_, row)) in enumerate(zip(val_images[:n_images], all_prediction_info.iterrows())):
	    ax = axes[i]
	    ax.imshow(Image.open(path))
	    ax.axis('off')
	    # Title: topic ID plus the first keyword of the KeyBERTInspired representation
	    ax.set_title(f"Topic {row['Topic']}: {row['KeyBERTInspired'][0]}")

	# Hide unused axes
	for j in range(n_images, len(axes)):
	    axes[j].axis('off')

	plt.tight_layout()
	plt.show()
	```

	## Topic overview

	* Number of topics: 5
	* Number of training documents: 2997

	<details>
	<summary>Click here for an overview of all topics.</summary>

	| Topic ID | Topic Keywords | Topic Frequency | Label |
	|----------|----------------|-----------------|-------|
	| -1 | drug - people - gun - - | 151 | -1_drug_people_gun_ |
	| 0 | gun - people - drug - - | 2152 | 0_gun_people_drug_ |
	| 1 | drug - gun - - - | 342 | 1_drug_gun__ |
	| 2 | people - gun - - - | 287 | 2_people_gun__ |
	| 3 | people - gun - drug - - | 65 | 3_people_gun_drug_ |

	</details>
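
To put the frequencies in the overview in proportion, the short sketch below computes each topic's share of the 2997 training documents. The counts are copied from the table; the variable names are illustrative only. In BERTopic, topic -1 collects the documents treated as outliers.

```python
# Topic frequencies copied from the overview table (topic ID -> document count).
topic_counts = {-1: 151, 0: 2152, 1: 342, 2: 287, 3: 65}

# The counts should add up to the number of training documents.
total = sum(topic_counts.values())

# Share of documents per topic, as a percentage rounded to one decimal place.
topic_share = {tid: round(100 * n / total, 1) for tid, n in topic_counts.items()}

for tid, share in topic_share.items():
    label = "outliers" if tid == -1 else f"topic {tid}"
    print(f"{label}: {topic_counts[tid]} documents ({share}%)")
```

Topic 0 accounts for roughly 72% of the documents and about 5% are outliers, so predictions from this model will most often land in topic 0.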

	## Training hyperparameters

	* calculate_probabilities: False
	* language: None
	* low_memory: False
	* min_topic_size: 50
	* n_gram_range: (1, 3)
	* nr_topics: None
	* seed_topic_list: None
	* top_n_words: 5
	* verbose: True
	* zeroshot_min_similarity: 0.7
	* zeroshot_topic_list: None
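
The values above correspond to keyword arguments of the `BERTopic` constructor. As a minimal sketch, assuming you want an unfitted model with the same settings, they could be passed as shown below. The embedding, dimensionality-reduction, clustering, and representation models used for training are not listed in this card, so the hypothetical `build_topic_model` helper leaves them to the caller.

```python
# Hyperparameters as listed in this model card.
HYPERPARAMS = {
    "calculate_probabilities": False,
    "language": None,
    "low_memory": False,
    "min_topic_size": 50,
    "n_gram_range": (1, 3),
    "nr_topics": None,
    "seed_topic_list": None,
    "top_n_words": 5,
    "verbose": True,
    "zeroshot_min_similarity": 0.7,
    "zeroshot_topic_list": None,
}


def build_topic_model(**extra):
    """Create an unfitted BERTopic instance with the listed hyperparameters.

    `extra` is for components this card does not specify, for example
    embedding_model or representation_model (caller-supplied assumptions).
    """
    # Imported lazily so the settings can be inspected without BERTopic installed.
    from bertopic import BERTopic

    return BERTopic(**HYPERPARAMS, **extra)
```

Calling `build_topic_model()` without extras falls back to BERTopic's default components, which may differ from those used to train this model.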

	## Framework versions

	* Numpy: 1.26.4
	* HDBSCAN: 0.8.36
	* UMAP: 0.5.6
	* Pandas: 2.2.2
	* Scikit-Learn: 1.4.1.post1
	* Sentence-transformers: 3.0.1
	* Transformers: 4.39.3
	* Numba: 0.60.0
	* Plotly: 5.22.0
	* Python: 3.12.4
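
If loading the model fails or behaves unexpectedly, a version mismatch is one possible cause. The sketch below compares the installed packages against the versions listed above using the standard library's `importlib.metadata`. The PyPI distribution names (for example `umap-learn` for UMAP) are an assumed mapping, and `find_mismatches` is a hypothetical helper, not part of BERTopic.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions from this model card, keyed by assumed PyPI distribution name.
PINNED = {
    "numpy": "1.26.4",
    "hdbscan": "0.8.36",
    "umap-learn": "0.5.6",
    "pandas": "2.2.2",
    "scikit-learn": "1.4.1.post1",
    "sentence-transformers": "3.0.1",
    "transformers": "4.39.3",
    "numba": "0.60.0",
    "plotly": "5.22.0",
}


def find_mismatches(pinned):
    """Return {package: (expected, installed)} for packages whose version differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


for name, (expected, installed) in find_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

A reported mismatch is not necessarily a problem; it only shows where the local environment differs from the one recorded in this card.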