edugp committed on
Commit 6d1a001, 1 Parent(s): ab7449f

Update README

  1. README.md +35 -30
README.md CHANGED
@@ -1,37 +1,42 @@
  ---
  title: Perplexity Lenses
- emoji: 🔥
- colorFrom: gray
- colorTo: indigo
  sdk: streamlit
  app_file: app.py
  pinned: false
  ---

- # Configuration
-
- `title`: _string_
- Display title for the Space
-
- `emoji`: _string_
- Space emoji (emoji-only character allowed)
-
- `colorFrom`: _string_
- Color for Thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray)
-
- `colorTo`: _string_
- Color for Thumbnail gradient (red, yellow, green, blue, indigo, purple, pink, gray)
-
- `sdk`: _string_
- Can be either `gradio` or `streamlit`
-
- `sdk_version` : _string_
- Only applicable for `streamlit` SDK.
- See [doc](https://hf.co/docs/hub/spaces) for more info on supported versions.
-
- `app_file`: _string_
- Path to your main application file (which contains either `gradio` or `streamlit` Python code).
- Path is relative to the root of the repository.
-
- `pinned`: _boolean_
- Whether the Space stays on top of your list.
 
 
 
 
 
 
  ---
  title: Perplexity Lenses
+ emoji: 🌸
+ colorFrom: pink
+ colorTo: blue
  sdk: streamlit
  app_file: app.py
  pinned: false
  ---

+ # Installation:
+ Requires Python >= 3.7 and < 3.10
+ ```
+ pip install -r requirements.txt
+ ```
+
+ # Web App:
+ The app is hosted [here](https://huggingface.co/spaces/edugp/perplexity-lenses). To run it locally:
+ ```
+ python -m streamlit run app.py
+ ```
+
+ # CLI
+ The CLI with no arguments defaults to running mc4 in Spanish.
+ For full usage:
+ ```
+ python cli.py --help
+ ```
+ Example: Running on 1000 sentences extracted from Spanish OSCAR docs specifying all arguments:
+ ```
+ python cli.py \
+ --dataset oscar \
+ --dataset-config unshuffled_deduplicated_es \
+ --dataset-split train \
+ --text-column text \
+ --language es \
+ --doc-type sentence \
+ --sample 1000 \
+ --dimensionality-reduction umap \
+ --model-name distiluse-base-multilingual-cased-v1 \
+ --output-file perplexity.html
+ ```
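
For readers who want to drive the CLI from a script rather than a shell, the OSCAR example in the new README could be assembled programmatically along these lines. This is only a sketch: `build_cli_command` and `OSCAR_EXAMPLE` are hypothetical names that are not part of the repository, and the flag names and values are copied from the README example above rather than checked against `cli.py`.

```python
import shlex
import sys
from typing import Dict, List


def build_cli_command(options: Dict[str, object], script: str = "cli.py") -> List[str]:
    """Turn an options mapping into an argument list for the CLI script.

    Hypothetical helper: underscores in option names become dashes, so
    "dataset_config" is emitted as "--dataset-config". Values are passed
    through unchanged (only converted to strings).
    """
    command = [sys.executable, script]
    for name, value in options.items():
        command.extend(["--" + name.replace("_", "-"), str(value)])
    return command


# Options taken verbatim from the README example in this commit.
OSCAR_EXAMPLE = {
    "dataset": "oscar",
    "dataset_config": "unshuffled_deduplicated_es",
    "dataset_split": "train",
    "text_column": "text",
    "language": "es",
    "doc_type": "sentence",
    "sample": 1000,
    "dimensionality_reduction": "umap",
    "model_name": "distiluse-base-multilingual-cased-v1",
    "output_file": "perplexity.html",
}

if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(shlex.quote(arg) for arg in build_cli_command(OSCAR_EXAMPLE)))
```

To actually launch the run, the returned list could be passed to `subprocess.run(command, check=True)` from the repository root, assuming the dependencies in `requirements.txt` are installed.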