topic_modelling / app.py

Commit History

Upgraded to Gradio 4.16.0. Guide for converting to exe added.
0a177ca

Sonnyjim committed on

Note about LLM not working now successfully added!
e2dfc1e

Sean-Case committed on

Some text changes. Fixed a couple of TF-IDF embeddings issues
87306c7

Sean-Case committed on

Switched embeddings to low resource TF-IDF by default. Some text changes.
a7fdf3b

Sean-Case committed on

Added clean data options, improved re-representation options and visualisation. General format changes
4effac0

Sonnyjim committed on

Allowed for loading in external topic labels. A few visualisation modifications.
b27bab2

Sonnyjim committed on

Lots of general fixes. New visualisations, fixed hierarchical vis for zero shot. Added calc all probabilities.
b4510a6

Sonnyjim committed on

Changed Phi model to smaller StableLM 2 1.6. Fixed a None type detection error.
1f1a1c7

Sonnyjim committed on

Disabled console logging as it was getting in the way of file load into the app
731ed23

Sonnyjim committed on

Switched embeddings model to BGE Small 1.5 as Jina seemed unable to do zero shot topic modelling properly
be094ee

Sonnyjim committed on

Added minimum similarity slider for zero shot topic modelling
0fe5421

Sean-Case committed on

Model and hierarchy details should now save properly
6622531

Sonnyjim committed on

Split off LLM representation, visualisation, and reduce outliers from main function. Added hierarchical visualisation and logs
5d87c3c

Sonnyjim committed on

More efficient embeddings save and representations load/process. Custom visualisation hover option added, formatting improvements. Version 0.1?
ffe5eb2

Sonnyjim committed on

App should now check whether embeddings are loaded before topic modelling, and will save only once.
9eeba1e

Sonnyjim committed on

Added option to reduce outliers based on closest topic
e09dd3b

Sonnyjim committed on

Returned TruncatedSVD components to 100 - higher values don't seem to help
43ac0d8

Sean-Case committed on

Greatly increased low resource process dimensions for higher quality. Visualisations disabled by default to increase speed.
fac3624

Sean-Case committed on

Greatly improved low resource mode speed (at cost of potential quality)
aa3df37

Sean-Case committed on

Changed zero shot min similarity to 0.5
0b7839c

Sonnyjim committed on

Added controls for saving topic models and visualisation. Removed custom UMAP layer
81f1b56

Sonnyjim committed on

Should now save embeddings by default. Added random seed to representation
e0f53cc

Sean-Case committed on

Fixed llm_config, environmental variable, zero shot topic model errors with quick embeddings
ff32b4a

Sean-Case committed on

Model export changed to safetensors. Improved representational model function. Got zero shot topic modelling working
4cfed8e

Sean-Case committed on

Removed reference to environmental variable
72f2310

Sonnyjim committed on

First commit
9dbf344

Sonnyjim committed on