Removed reference to environmental variable
- .gitignore +1 -0
- Topic modeller to do.txt +13 -0
- app.py +1 -2
.gitignore
CHANGED
@@ -3,6 +3,7 @@
 *.npz
 *.csv
 *.pkl
+*.parquet
 .ipynb_checkpoints/*
 old_code/*
 model/*
Topic modeller to do.txt
ADDED
@@ -0,0 +1,13 @@
+Need to add option to anonymise - done
+
+Need to add option to deduplicate
+
+Need option to sample for X number of rows with specific seed
+
+Add plotly visualisation - done
+
+Add zero shot topic list support
+
+Add topic renaming with LLMs - done
+
+Option to predict topics on a new dataset - done (kind of - just save model to file)
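Two of the open to-do items (deduplicate, and sample X rows with a specific seed) might look roughly like the sketch below. It is not taken from this repository: the function names `deduplicate` and `sample_rows` are hypothetical, and it uses only the Python standard library on a plain list of strings, whereas the app may well hold its data in a different structure.

```python
import random


def deduplicate(rows):
    """Drop exact duplicates, keeping the first occurrence and the original order.

    Hypothetical helper: assumes each row is hashable (for example a string).
    """
    return list(dict.fromkeys(rows))


def sample_rows(rows, n, seed):
    """Return up to n rows picked with a seeded generator.

    Hypothetical helper: the same seed and input give the same sample on every run.
    """
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


docs = ["a", "b", "a", "c", "b", "d"]
unique_docs = deduplicate(docs)
subset = sample_rows(unique_docs, n=2, seed=42)
print(unique_docs, subset)
```

Capping `n` at `len(rows)` avoids the `ValueError` that `random.sample` raises when asked for more items than the input contains.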
app.py
CHANGED
@@ -3,8 +3,7 @@ import os
 #os.environ["TOKENIZERS_PARALLELISM"] = "true"
 #os.environ["HF_HOME"] = "/mnt/c/..."
 #os.environ["CUDA_PATH"] = "/mnt/c/..."
-
-print(os.environ["HF_HOME"])
+#print(os.environ["HF_HOME"])
 
 import gradio as gr
 from datetime import datetime
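Indexing `os.environ["HF_HOME"]` raises `KeyError` when the variable is not set. This may be why the debug print was commented out, though the commit does not say so. If the diagnostic is still wanted, a minimal sketch that tolerates a missing variable could look like this (the helper name `describe_env` and the `<not set>` placeholder are illustrative, not part of app.py):

```python
import os


def describe_env(name: str, default: str = "<not set>") -> str:
    """Return 'NAME=value', using a placeholder instead of raising when NAME is unset."""
    return f"{name}={os.environ.get(name, default)}"


# Safe to call whether or not HF_HOME is defined in the environment.
print(describe_env("HF_HOME"))
```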