import streamlit as st
from utilities import initialization
st.set_page_config(page_title="Top2Vec", layout="wide")
initialization()
vb_link = 'https://visitor-badge.glitch.me/badge?page_id=demo-org.Top2Vec&left_color=gray&right_color=blue'
visitor_badge = f"![Total Visitors]({vb_link})"
st.markdown(
f"""
# Introduction
This is a [space](https://huggingface.co/spaces) that demonstrates [top2vec](https://github.com/ddangelov/Top2Vec) and the features it offers for semantic search and topic modeling.
Please check out this [readme](https://github.com/ddangelov/Top2Vec#how-does-it-work) to better understand how it works.
> Top2Vec is an algorithm for **topic modeling** and **semantic search**. It automatically detects topics present in text and generates jointly embedded topic, document and word vectors.
# Setup
I used the [20 NewsGroups](https://huggingface.co/datasets/SetFit/20_newsgroups) dataset with `top2vec`.
I fit the model on the dataset and reduced the number of topics to 20.
The topics come from top2vec itself, not from the dataset's labels.
No comparison between these 20 topics and the labels is provided.
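
The setup can be sketched roughly as follows. This is an illustrative sketch, not the exact code behind this space: the `fit_and_reduce` helper and its default are assumptions, and the model is fit with the library's default settings, which may differ from what was used here. Only the `Top2Vec` constructor and `hierarchical_topic_reduction` come from the top2vec library.

```python
def fit_and_reduce(documents, num_topics=20):
    # Hypothetical helper (not from this space's code): fit Top2Vec on raw
    # text, then merge the detected topics down to num_topics.
    from top2vec import Top2Vec  # lazy import; requires `pip install top2vec`

    # documents is a list of strings, e.g. the "text" column of the dataset
    # (assumed column name).
    model = Top2Vec(documents)
    # Hierarchically merge topics until num_topics remain. num_topics must be
    # smaller than the number of topics the model detected.
    model.hierarchical_topic_reduction(num_topics)
    return model
```

With a fitted model, `model.get_topics(20, reduced=True)` returns the words, word scores, and topic numbers for the reduced topics.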
# Usage
Check out:
- The [Topic Explorer](/Topic_Explorer) page to understand which topics were detected
- The [Document Explorer](/Document_Explorer) page to visually explore documents
- The [Semantic Search](/Semantic_Search) page to search by meaning
{visitor_badge}
"""
)