smhavens committed on
Commit
0114c21
1 Parent(s): fe56f10

Update README.md

  1. README.md +7 -2
README.md CHANGED
@@ -13,10 +13,15 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
13
 
14
  ## Information
15
  ### Database
16
- [GLUE](https://huggingface.co/datasets/glue)
 
 
 
 
17
 
18
  ### Pre-trained model
19
  [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
20
 
 
21
 
22
- TESTING README UPDATE
 
13
 
14
  ## Information
15
  ### Database
16
+ [ag_news](https://huggingface.co/datasets/ag_news)
17
+
18
+ This dataset uses a text-with-label format. Each label is an integer from 0 to 3 that maps to one of the four main news categories: World (0), Sports (1), Business (2), Sci/Tech (3).
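A minimal sketch of the label mapping described above. The category names and ids come from the README text; the helper `label_name` and the sample record are hypothetical, added only for illustration.

```python
# Label ids and category names as listed in the README text.
AG_NEWS_LABELS = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}


def label_name(label_id: int) -> str:
    """Return the category name for an ag_news integer label (hypothetical helper)."""
    if label_id not in AG_NEWS_LABELS:
        raise ValueError(f"unknown ag_news label: {label_id}")
    return AG_NEWS_LABELS[label_id]


# Hypothetical record in the text-with-label format; not taken from the dataset.
example = {"text": "Stocks rallied after the earnings report.", "label": 2}
category = label_name(example["label"])
```

Real records would come from the dataset itself rather than a hand-written dictionary like `example`.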
19
+
20
+ I chose this dataset because it covers a wider variety of categories than sentiment datasets, and its themes should, in theory, relate more closely to analogies. I also chose ag_news because, as a news source, it should avoid slang and other noise common in datasets built from tweets or general reviews.
21
 
22
  ### Pre-trained model
23
  [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
24
 
25
+ Because my focus is on using embeddings to evaluate analogies for AnalogyArcade, I limited my model search to the sentence-transformers category, whose models are built for producing embeddings. I chose all-MiniLM-L6-v2 because it is widely used and well reviewed: it is a well-trained model, yet smaller and more efficient than its previous version.
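One way embeddings could be used to score an analogy "a is to b as c is to d" is to compare the offset `b - a` with the offset `d - c` by cosine similarity. This is an assumption for illustration, not necessarily how AnalogyArcade evaluates analogies. The names `cosine`, `analogy_score`, and `embed` are hypothetical, and the toy vectors stand in for real model output.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def analogy_score(a, b, c, d):
    """Score 'a is to b as c is to d' by comparing the offsets b - a and d - c."""
    offset_ab = [y - x for x, y in zip(a, b)]
    offset_cd = [y - x for x, y in zip(c, d)]
    return cosine(offset_ab, offset_cd)


def embed(texts):
    """Embed texts with all-MiniLM-L6-v2 (needs sentence-transformers and a model download)."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    return model.encode(texts)


# Toy 3-d vectors standing in for real embeddings (assumed values, not model output).
man, woman = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]
king, queen = [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]
score = analogy_score(man, woman, king, queen)
```

With real embeddings, the four vectors would come from a single call such as `embed(["man", "woman", "king", "queen"])`, and scores closer to 1 would suggest a more consistent analogy under this particular scoring choice.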
26
 
27
+ TESTING README UPDATE