m. polinsky committed on
Commit
5e0d56b
1 Parent(s): aabe8a0

Update README.md

Files changed (1)
  1. README.md +10 -2
README.md CHANGED
@@ -4,9 +4,17 @@
4
 
5
  #### [![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://share.streamlit.io/mpolinsky/topicdig/main)
6
 
7
- This application was created as the culmination of a semester of independent graduate research into NLP and transformers.
8
-
9
  The app displays topics, the user chooses up to three, and the app spins up a topical digest scraped from the headlines.
10
  This project makes heavy use of HuggingFace for NLP, and Gazpacho for web scraping.
11
 
12
  Original repo for the earlier version of this app is located at https://github.com/mpolinsky/sju_final_project/
 
4
 
5
  #### [![Streamlit App](https://static.streamlit.io/badges/streamlit_badge_black_white.svg)](https://share.streamlit.io/mpolinsky/topicdig/main)
6
 
7
  The app displays topics, the user chooses up to three, and the app spins up a topical digest scraped from the headlines.
8
  This project makes heavy use of HuggingFace for NLP, and Gazpacho for web scraping.
9
 
10
+ **The pipeline:**
11
+
12
+ * Current headlines are scraped from two news sites.
13
+ * Named entity recognition (NER) is performed on each headline to extract topics; some headlines yield no topics.
14
+ * Article links are clustered according to the entities in their headlines.
15
+ * The user selects up to three clusters.
16
+ * Articles from those clusters are scraped and summarized in chunks, and the summaries are concatenated to create a digest.
17
+
18
+ This application was created as the culmination of a semester of independent graduate research into NLP and transformers.
19
+
20
  Original repo for the earlier version of this app is located at https://github.com/mpolinsky/sju_final_project/
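
The clustering and digest steps of the pipeline described in the added README lines could be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names (`cluster_links_by_entity`, `chunk_text`, `build_digest`), the word-based chunking, and the `max_words` parameter are all assumptions. The NER and summarization models are passed in as plain callables, so any HuggingFace pipeline (or a stub) could be plugged in.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def cluster_links_by_entity(
    headlines: Iterable[Tuple[str, str]],
    extract_entities: Callable[[str], List[str]],
) -> Dict[str, List[str]]:
    """Group article links under each entity found in their headline.

    `headlines` is an iterable of (headline_text, article_link) pairs.
    `extract_entities` is a hypothetical stand-in for the NER step.
    """
    clusters: Dict[str, List[str]] = defaultdict(list)
    for headline, link in headlines:
        # A headline that yields no entities contributes to no cluster.
        for entity in extract_entities(headline):
            if link not in clusters[entity]:
                clusters[entity].append(link)
    return dict(clusters)


def chunk_text(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most `max_words` words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def build_digest(
    articles: Iterable[str],
    summarize: Callable[[str], str],
    max_words: int = 200,
) -> str:
    """Summarize each article chunk by chunk and join the summaries."""
    summaries: List[str] = []
    for article in articles:
        for chunk in chunk_text(article, max_words):
            # `summarize` is a hypothetical stand-in for the summarizer.
            summaries.append(summarize(chunk))
    return " ".join(summaries)
```

Summarizing in chunks keeps each input within a model's length limit. The trade-off is that each chunk is summarized without the context of the others, so the concatenated digest may repeat itself or read unevenly.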