FoodDesert
committed on
Update README.md
README.md
CHANGED
@@ -9,4 +9,41 @@ app_file: app.py
pinned: false
---

## Frequently Asked Questions (FAQs)

Technically I am writing this before anyone but me has used the tool, so no one has asked questions yet. But if they did, here are the questions I think they might ask:

### Does input order matter?

No. The query is treated as an unordered collection of tags, so any ordering of the same tags gives the same results.

### Should I use underscores in the input tags?

It doesn't matter. The application handles tags either way.
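
As an illustration of how both spellings could be treated the same, here is a minimal sketch. The function names `normalize_tag` and `parse_query` are hypothetical, and the choice of underscores as the canonical form is an assumption; the application's actual normalization may differ.

```python
def normalize_tag(tag: str) -> str:
    """Hypothetical normalizer: trim the tag and join its words with underscores."""
    # "red fox" and "red_fox" both become "red_fox" (assumed canonical form).
    return "_".join(tag.strip().split())


def parse_query(query: str) -> list[str]:
    """Split a comma-separated query into normalized tags, skipping empty entries."""
    return [normalize_tag(part) for part in query.split(",") if part.strip()]


if __name__ == "__main__":
    print(parse_query("red fox, red_fox, score:7"))
```

With this kind of normalization, a user never has to remember which spelling the underlying data uses.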

### Why are some valid tags marked as "unseen", and why don't some artists ever get returned?

Some data is excluded from consideration if it did not occur frequently enough in the sample from which the application makes its calculations.
If an artist or tag is too infrequent, there may not be enough data to make reliable predictions about it.
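
The filtering described above can be sketched roughly as follows. This is an assumption-laden illustration: the names `frequent_items` and `split_seen_unseen` and the `min_count` threshold are hypothetical, and the real cutoff used by the application is not stated here.

```python
from collections import Counter


def frequent_items(samples: list[list[str]], min_count: int) -> set[str]:
    """Return the items (tags or artists) that occur at least min_count times."""
    counts = Counter(item for sample in samples for item in sample)
    return {item for item, n in counts.items() if n >= min_count}


def split_seen_unseen(query_tags: list[str], vocabulary: set[str]) -> tuple[list[str], list[str]]:
    """Separate query tags into those kept in the vocabulary and those reported as unseen."""
    seen = [tag for tag in query_tags if tag in vocabulary]
    unseen = [tag for tag in query_tags if tag not in vocabulary]
    return seen, unseen


if __name__ == "__main__":
    # Toy sample: each inner list holds the tags of one image (made-up data).
    sample = [["red_fox", "forest"], ["red_fox"], ["red_fox", "city"]]
    vocab = frequent_items(sample, min_count=2)  # min_count is an assumed value
    print(split_seen_unseen(["red_fox", "city"], vocab))
```

A tag such as `city` in this toy sample is perfectly valid, but it would still be reported as unseen because it falls below the assumed threshold.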

### Are there any special tags?

Yes. We normalized the favorite count (favcount) of each image to a range of 0-9, with 0 being the lowest favcount and 9 being the highest.
You can include any of these special tags in your list to bias the output toward artists with higher- or lower-scoring images:
"score:0", "score:1", "score:2", "score:3", "score:4", "score:5", "score:6", "score:7", "score:8", "score:9"
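
To make the idea of a 0-9 score concrete, here is one possible way to bucket a favorite count. It assumes simple linear (min-max) scaling, which is only a guess; the application may use a different scheme, such as quantiles. The function name `score_tag` and its parameters are hypothetical.

```python
def score_tag(favcount: int, lowest: int, highest: int) -> str:
    """Map a favorite count to one of the tags score:0 .. score:9 (assumed linear scaling)."""
    if highest <= lowest:
        # Degenerate range: every image gets the lowest bucket.
        return "score:0"
    fraction = (favcount - lowest) / (highest - lowest)
    # Clamp so that out-of-range inputs still land in a valid bucket.
    bucket = max(0, min(9, int(fraction * 10)))
    return f"score:{bucket}"


if __name__ == "__main__":
    for favs in (0, 50, 100):
        print(favs, score_tag(favs, lowest=0, highest=100))
```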

### Are there any other special tricks?

Yes. To bias the artist output more strongly toward a specific tag, list it multiple times.
For example, the query "red fox, red fox, red fox, score:7" will yield a list of artists who are more strongly associated with the tag "red fox"
than the query "red fox, score:7" would.
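
The effect of repeating a tag can be seen by counting how much of the query each tag accounts for. The sketch below uses a hypothetical helper, `query_weights`, and plain relative counts rather than the application's actual weighting.

```python
from collections import Counter


def query_weights(tags: list[str]) -> dict[str, float]:
    """Return each tag's share of the query, based on how often it was listed."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


if __name__ == "__main__":
    print(query_weights(["red_fox", "score:7"]))
    print(query_weights(["red_fox", "red_fox", "red_fox", "score:7"]))
```

In the second query, `red_fox` makes up three quarters of the query instead of half, so it dominates the comparison. Because only counts are used, the order of the tags has no effect.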

### What calculation is this thing actually performing?

Each artist is represented by a "pseudo-document" composed of all the tags from their uploaded images, with tags treated like words in a text document.
When you input a set of tags, the system likewise builds a pseudo-document for your query out of those tags.
It then uses a technique called cosine similarity to compare your query against each artist's pseudo-document, finding the artists whose tags are most "similar" to yours.
This method helps identify artists whose work is closely aligned with the themes or elements you're interested in.
For those curious about the underlying mechanics, the tags are weighted with TF-IDF (Term Frequency-Inverse Document Frequency), a standard approach in information retrieval, before the comparison is made.
You can read more about TF-IDF on its [Wikipedia page](https://en.wikipedia.org/wiki/Tf%E2%80%93idf).