Spaces: andyqin18/sentiment-analysis-app


andyqin18 committed on Apr 10, 2023

Commit b6852b8 • 1 Parent(s): 228ca50

Finished milestone2


Files changed (6)

README.md +56 -67
app.py +21 -13
milestone2/HF_token.png +0 -0
milestone2/app_UI.png +0 -0
milestone2/github_token.png +0 -0
milestone2/new_HF_space.png +0 -0

README.md CHANGED

@@ -13,93 +13,82 @@ pinned: false
 Hello! This is a project for CS-UY 4613: Artificial Intelligence. I'm providing a step-by-step instruction on finetuning language models for detecting toxic tweets.
-# Milestone 1
-This milestone includes setting up docker and creating a development environment on Windows 11.
-## 1. Enable WSL2 feature
-The Windows Subsystem for Linux (WSL) lets developers install a Linux distribution on Windows.
-```
-wsl --install
-```
-Ubuntu is the default distribution installed and WSL2 is the default version.
-After creating linux username and password, Ubuntu can be seen in Windows Terminal now.
-Details can be found [here](https://learn.microsoft.com/en-us/windows/wsl/install).
-![](milestone1/wsl2.png)
-## 2. Download and install the Linux kernel update package
-The package needs to be downloaded before installing Docker Desktop.
-However, this error might occur:
-`Error: wsl_update_x64.msi unable to run because "This update only applies to machines with the Windows Subsystem for Linux"`
-Solution: Opened Windows features and enabled "Windows Subsystem for Linux".
-Successfully ran update [package](https://docs.microsoft.com/windows/wsl/wsl2-kernel).
-![](milestone1/kernal_update_sol.png)
-## 3. Download Docker Desktop
-After downloading the [Docker App](https://www.docker.com/products/docker-desktop/), WSL2 based engine is automatically enabled.
-If not, follow [this link](https://docs.docker.com/desktop/windows/wsl/) for steps to turn on WSL2 backend.
-Open the app and input `docker version` in Terminal to check server running.
-![](milestone1/docker_version.png)
-Docker is ready to go.
-## 4. Create project container and image
-First we download the Ubuntu image from Docker’s library with:
-```
-docker pull ubuntu
-```
-We can check the available images with:
 ```
-docker image ls
 ```
-We can create a container named *AI_project* based on Ubuntu image with:
-```
-docker run -it --name=AI_project ubuntu
-```
-The `-it` options instruct the container to launch in interactive mode and enable a Terminal typing interface.
-After this, a shell is generated and we are directed to Linux Terminal within the container.
-`root` represents the currently logged-in user with highest privileges, and `249cf37645b4` is the container ID.
-![](milestone1/docker_create_container.png)
-## 5. Hello World!
-Now we can mess with the container by downloading python and pip needed for the project.
-First we update and upgrade packages by: (`apt` is Advanced Packaging Tool)
-```
-apt update && apt upgrade
-```
-Then we download python and pip with:
-```
-apt install python3 pip
-```
-We can confirm successful installation by checking the current version of python and pip.
-Then create a script file of *hello_world.py* under `root` directory, and run the script.
-You will see the following in VSCode and Terminal.
-![](milestone1/vscode.png)
-![](milestone1/hello_world.png)
-## 6. Commit changes to a new image specifically for the project
-After setting up the container we can commit changes to a specific project image with a tag of *milestone1* with:
-```
-docker commit [CONTAINER] [NEW_IMAGE]:[TAG]
-```
-Now if we check the available images there should be a new image for the project. If we list all containers we should be able to identify the one we were working on through container ID.
-![](milestone1/commit_to_new_image.png)
-The Docker Desktop app should match the image list we see on Terminal.
-![](milestone1/app_image_list.png)

 Hello! This is a project for CS-UY 4613: Artificial Intelligence. I'm providing a step-by-step instruction on finetuning language models for detecting toxic tweets.
+# Milestone 2
+This milestone includes creating a Streamlit app in HuggingFace for sentiment analysis.
+Link to app: https://huggingface.co/spaces/andyqin18/sentiment-analysis-app
+## 1. Space setup
+After creating a HuggingFace account, we can create our app as a space and choose Streamlit as the space SDK.
+![](milestone2/new_HF_space.png)
+Then we can go back to our GitHub repo and create the following files.
+For the space to run properly, there must be at least three files in the root directory:
+[README.md](README.md), [app.py](app.py), and [requirements.txt](requirements.txt).
+Make sure the following metadata is at the top of **README.md** so that HuggingFace can identify the space configuration.
+```
+---
+title: Sentiment Analysis App
+emoji: 🚀
+colorFrom: green
+colorTo: purple
+sdk: streamlit
+sdk_version: 1.17.0
+app_file: app.py
+pinned: false
+---
+```
+The **app.py** file is the main code of the app, and **requirements.txt** should list all the libraries the code uses. HuggingFace installs the listed libraries before starting the app.
+## 2. Connect and sync to HuggingFace
+Next, we create an access token in HuggingFace and add it as a secret named `HF_TOKEN` in the settings of the GitHub repo, so that the repo can access the new HuggingFace space.
+![](milestone2/HF_token.png)
+![](milestone2/github_token.png)
+Next, we need to set up a workflow in GitHub Actions. Click "set up a workflow yourself" and replace all the code in `main.yaml` with the following (replace `HF_USERNAME` and `SPACE_NAME` with your own):
 ```
+name: Sync to Hugging Face hub
+on:
+  push:
+    branches: [main]
+  # to run this workflow manually from the Actions tab
+  workflow_dispatch:
+jobs:
+  sync-to-hub:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v3
+        with:
+          fetch-depth: 0
+          lfs: true
+      - name: Push to hub
+        env:
+          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+        run: git push --force https://HF_USERNAME:$HF_TOKEN@huggingface.co/spaces/HF_USERNAME/SPACE_NAME main
 ```
+The repo is now connected and synced with the HuggingFace space!
+## 3. Create the app
+Modify [app.py](app.py) so that it takes in a text and generates an analysis using one of the provided models. Details are explained in the code comments. The app should look like this:
+![](milestone2/app_UI.png)
+## Reference:
+For connecting GitHub with HuggingFace, check this [video](https://www.youtube.com/watch?v=8hOzsFETm4I).
+For creating the app, check this [video](https://www.youtube.com/watch?v=GSt00_-0ncQ).
+The HuggingFace documentation is [here](https://huggingface.co/docs), and Streamlit APIs [here](https://docs.streamlit.io/library/api-reference).
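
Note on the README metadata from step 1: the block between the two `---` lines is what the space reads its settings from, so it can be worth checking before a push. Below is a minimal sketch of such a check, not part of this commit. It only handles flat `key: value` pairs rather than full YAML, and the helper names and the choice of `EXPECTED_KEYS` are assumptions made for illustration.

```python
def parse_front_matter(readme_text: str) -> dict:
    """Parse the flat `key: value` block between the leading '---' lines.

    Simplified on purpose: this is not a full YAML parser.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no metadata block at the top of the file
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter reached
            return metadata
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    return {}  # opening delimiter without a closing one


# Hypothetical choice of keys to check; adjust to your own needs.
EXPECTED_KEYS = ("title", "sdk", "app_file")


def missing_keys(readme_text: str) -> list:
    """Return the expected keys that are absent from the metadata block."""
    metadata = parse_front_matter(readme_text)
    return [key for key in EXPECTED_KEYS if key not in metadata]


# Shortened copy of the metadata shown in the README diff above.
example = """---
title: Sentiment Analysis App
sdk: streamlit
sdk_version: 1.17.0
app_file: app.py
pinned: false
---
# Milestone 2
"""

print(parse_front_matter(example)["sdk"])
print(missing_keys(example))
```

An empty list from `missing_keys` means every key in `EXPECTED_KEYS` was found; a README without the leading `---` block reports all of them as missing.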

app.py CHANGED

@@ -1,40 +1,48 @@
 import streamlit as st
 from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
-def analyze(model_name, text):
     model = AutoModelForSequenceClassification.from_pretrained(model_name)
     tokenizer = AutoTokenizer.from_pretrained(model_name)
     classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
     return classifier(text)
-st.title("Sentiment Analysis App - beta")
-st.write("This app is to analyze the sentiments behind a text. \n Currently it uses \
-          pre-trained models without fine-tuning.")
 model_descrip = {
-    "distilbert-base-uncased-finetuned-sst-2-english": "This model is a fine-tune checkpoint of DistilBERT-base-uncased, fine-tuned on SST-2.\n \
         Labels: POSITIVE; NEGATIVE ",
-    "cardiffnlp/twitter-roberta-base-sentiment": "This is a roBERTa-base model trained on ~58M tweets and finetuned for sentiment analysis with the TweetEval benchmark.\n \
         Labels: 0 -> Negative; 1 -> Neutral; 2 -> Positive",
-    "finiteautomata/bertweet-base-sentiment-analysis": "Model trained with SemEval 2017 corpus (around ~40k tweets). Base model is BERTweet, a RoBERTa model trained on English tweets. \n \
         Labels: POS; NEU; NEG"
 }
-user_input = st.text_input("Enter your text:", value="Missing Sophie.Z...")
-user_model = st.selectbox("Please select a model:",
-                          model_descrip)
 st.write("### Model Description:")
 st.write(model_descrip[user_model])
 if st.button("Analyze"):
     if not user_input:
         st.write("Please enter a text.")
     else:
         with st.spinner("Hang on.... Analyzing..."):
             result = analyze(user_model, user_input)
-            st.write(f"Result: \nLabel: {result[0]['label']} Score: {result[0]['score']}")
 else:
     st.write("Go on! Try the app!")

 import streamlit as st
 from transformers import pipeline, AutoTokenizer, AutoModelForSequenceClassification
+# Define analyze function
+def analyze(model_name: str, text: str) -> list:
+    '''
+    Run sentiment analysis on a text with the given model and return the
+    pipeline output: a list with one {'label', 'score'} dict per input text.
+    '''
     model = AutoModelForSequenceClassification.from_pretrained(model_name)
     tokenizer = AutoTokenizer.from_pretrained(model_name)
     classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
     return classifier(text)
+# App title
+st.title("Sentiment Analysis App - Milestone2")
+st.write("This app is to analyze the sentiments behind a text.")
+st.write("Currently it uses pre-trained models without fine-tuning.")
+# Model hub
 model_descrip = {
+    "distilbert-base-uncased-finetuned-sst-2-english": "This model is a fine-tune checkpoint of DistilBERT-base-uncased, fine-tuned on SST-2. \
         Labels: POSITIVE; NEGATIVE ",
+    "cardiffnlp/twitter-roberta-base-sentiment": "This is a roBERTa-base model trained on ~58M tweets and finetuned for sentiment analysis with the TweetEval benchmark. \
         Labels: 0 -> Negative; 1 -> Neutral; 2 -> Positive",
+    "finiteautomata/bertweet-base-sentiment-analysis": "Model trained with SemEval 2017 corpus (around ~40k tweets). Base model is BERTweet, a RoBERTa model trained on English tweets.  \
         Labels: POS; NEU; NEG"
 }
+user_input = st.text_input("Enter your text:", value="NYU is better than Columbia.")
+user_model = st.selectbox("Please select a model:", model_descrip)
+# Display model information
 st.write("### Model Description:")
 st.write(model_descrip[user_model])
+# Perform analysis and print result
 if st.button("Analyze"):
     if not user_input:
         st.write("Please enter a text.")
     else:
         with st.spinner("Hang on.... Analyzing..."):
             result = analyze(user_model, user_input)
+            st.write("Result:")
+            st.write(f"Label: **{result[0]['label']}**")
+            st.write(f"Confidence Score: **{result[0]['score']}**")
 else:
     st.write("Go on! Try the app!")
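
Note on the result display: the app reads `result[0]['label']` and `result[0]['score']`, so `analyze` is returning a list of dicts. The three models name their labels differently, and the raw score is printed at full float precision. The sketch below shows one way the output could be normalized before display; it is not part of this commit. The helper `format_result` and the `LABEL_n` spelling of the raw cardiffnlp labels are assumptions, based only on the "0 -> Negative; 1 -> Neutral; 2 -> Positive" note in `model_descrip`.

```python
# Assumed raw-label spelling for cardiffnlp/twitter-roberta-base-sentiment.
# Labels from the other models are not in this table and pass through as-is.
READABLE_LABELS = {
    "LABEL_0": "Negative",
    "LABEL_1": "Neutral",
    "LABEL_2": "Positive",
}


def format_result(result: list) -> str:
    """Turn a pipeline result like [{'label': ..., 'score': ...}] into text."""
    top = result[0]  # the app analyzes one text, so only one entry is used
    label = READABLE_LABELS.get(top["label"], top["label"])
    return f"Label: {label} | Confidence Score: {top['score']:.3f}"


# Hand-written sample in the shape the app indexes into (not real model output).
sample = [{"label": "LABEL_2", "score": 0.98765}]
print(format_result(sample))  # Label: Positive | Confidence Score: 0.988
```

In the app, the returned string could be passed to `st.write` in place of the separate label and score lines.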

milestone2/HF_token.png ADDED

milestone2/app_UI.png ADDED

milestone2/github_token.png ADDED

milestone2/new_HF_space.png ADDED