import streamlit as st

from utilities import initialization

# Configure the page before rendering any content.
st.set_page_config(page_title="Top2Vec", layout="wide")

# Project-specific setup from the local `utilities` module.
initialization()

# Visitor counter badge, rendered as a markdown image at the bottom of the page.
vb_link = 'https://visitor-badge.glitch.me/badge?page_id=demo-org.Top2Vec&left_color=gray&right_color=blue'
visitor_badge = f"![Total Visitors]({vb_link})"

st.markdown(
    f"""
# Introduction
This is a [Space](https://huggingface.co/spaces) dedicated to using [top2vec](https://github.com/ddangelov/Top2Vec) and showing which features are available for semantic search and topic modeling.
Please check out this [readme](https://github.com/ddangelov/Top2Vec#how-does-it-work) to better understand how it works.

> Top2Vec is an algorithm for **topic modeling** and **semantic search**. It automatically detects topics present in text and generates jointly embedded topic, document and word vectors.

# Setup
I used the [20 NewsGroups](https://huggingface.co/datasets/SetFit/20_newsgroups) dataset with `top2vec`.
I fit the model on the dataset and reduced the number of topics to 20.
The topics come from top2vec itself, not from the dataset labels.
No analysis comparing the 20 topics with the labels is provided.

# Usage
Check out:

- The [Topic Explorer](/Topic_Explorer) page to understand which topics were detected
- The [Document Explorer](/Document_Explorer) page to visually explore documents
- The [Semantic Search](/Semantic_Search) page to search by meaning

{visitor_badge}
"""
)