abdullahmubeen10 committed
Commit
f7b8b91
1 Parent(s): 50a4ed9

Upload 6 files

.streamlit/config.toml ADDED
@@ -0,0 +1,3 @@
+ [theme]
+ base="light"
+ primaryColor="#29B4E8"
Demo.py ADDED
@@ -0,0 +1,121 @@
+ import streamlit as st
+ import sparknlp
+ import os
+ import pandas as pd
+
+ from sparknlp.base import *
+ from sparknlp.annotator import *
+ from pyspark.ml import Pipeline
+ from sparknlp.pretrained import PretrainedPipeline
+
+ # Page configuration
+ st.set_page_config(
+     layout="wide",
+     page_title="Spark NLP Demos App",
+     initial_sidebar_state="auto"
+ )
+
+ # CSS for styling
+ st.markdown("""
+     <style>
+         .main-title {
+             font-size: 36px;
+             color: #4A90E2;
+             font-weight: bold;
+             text-align: center;
+         }
+         .section p, .section ul {
+             color: #666666;
+         }
+     </style>
+ """, unsafe_allow_html=True)
+
+ @st.cache_resource
+ def init_spark():
+     return sparknlp.start()
+
+ @st.cache_resource
+ def create_pipeline(model):
+     document_assembler = DocumentAssembler()\
+         .setInputCol("text")\
+         .setOutputCol("document")
+
+     tokenizer = RecursiveTokenizer()\
+         .setInputCols(["document"])\
+         .setOutputCol("token")\
+         .setPrefixes(["\"", "(", "[", "\n"])\
+         .setSuffixes([".", ",", "?", ")", "!", "‘s"])
+
+     # Load the pretrained spell checker selected in the sidebar
+     spell_model = ContextSpellCheckerModel\
+         .pretrained(model)\
+         .setInputCols("token")\
+         .setOutputCol("corrected")
+
+     light_pipeline = Pipeline(stages=[document_assembler, tokenizer, spell_model])
+
+     return light_pipeline
+
+ def fit_data(pipeline, data):
+     empty_df = spark.createDataFrame([['']]).toDF('text')
+     pipeline_model = pipeline.fit(empty_df)
+     model = LightPipeline(pipeline_model)
+     result = model.annotate(data)
+
+     return result
+
+ # Set up the page layout
+ st.markdown('<div class="main-title">Spell Check Your Text Documents With Spark NLP</div>', unsafe_allow_html=True)
+
+ # Sidebar content
+ model = st.sidebar.selectbox(
+     "Choose the pretrained model",
+     ["spellcheck_dl"],
+     help="For more info about the models visit: https://sparknlp.org/models"
+ )
+
+ # Reference notebook link in sidebar
+ link = """
+ <a href="https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/SPELL_CHECKER_EN.ipynb">
+     <img src="https://colab.research.google.com/assets/colab-badge.svg" style="zoom: 1.3" alt="Open In Colab"/>
+ </a>
+ """
+ st.sidebar.markdown('Reference notebook:')
+ st.sidebar.markdown(link, unsafe_allow_html=True)
+
+ # Load examples (misspellings are intentional demo inputs)
+ examples = [
+     "Apollo 11 was the space fleght that landed the frrst humans, Americans Neil Armstrong and Buzz Aldrin, on the Mon on july 20, 1969, at 20:18 UTC. Armstrong beceme the first to stp onto the luner surface 6 hours later on July 21 at 02:56 UTC. Armstrong spent abut three and a half two and a hakf hours outside the spacecraft, Aldrin slghtely less; and tgether they colected 47.5 pounds (21.5 kg) of lunar material for returne to Earth. A third member of the mission, Michael Collins, pilloted the comand spacecraft alone in lunar orbit untl Armstrong and Aldrin returned to it for the trep back to Earth.",
+     "Set theory is a branch of mathematical logic that studyes sets, which informally are colections of objects. Although any type of object can be collected into a set, set theory is applyed most often to objects that are relevant to mathematics. The language of set theory can be used to define nearly all mathematical objects. Set theory is commonly employed as a foundational system for mathematics. Beyond its foundational role, set theory is a branch of mathematics in its own right, with an active resurch community. Contemporary resurch into set theory includes a divers colection of topics, ranging from the structer of the real number line to the study of the consistency of large cardinals.",
+     "In mathematics and transporation enginering, traffic flow is the study of interctions between travellers (including podestrians, ciclists, drivers, and their vehicles) and inferstructure (including highways, signage, and traffic control devices), with the aim of understanding and developing an optimal transport network with eficiant movement of traffic and minimal traffic congestion problems. Current traffic models use a mixture of emperical and theoretical techniques. These models are then developed into traffic forecasts, and take account of proposed local or major changes, such as incrased vehicle use, changes in land use or changes in mode of transport (with people moving from bus to train or car, for example), and to identify areas of congestion where the network needs to be ajusted.",
+     "Critical theory is a social pholosiphy pertaning to the reflective asessment and critique of society and culture in order to reveal and challange power structures. With origins in socology, as well as in literary criticism, it argues that social problems are influenced and created more by societal structures and cultural assumptions than by individual and psychological factors. Mantaining that ideology is the principal obsticle to human liberation, critical theory was establiched as a school of thought primarily by the Frankfurt School theoreticians. One sociologist described a theory as critical insofar as it seeks 'to liberate human beings from the circomstances that enslave them.",
+     "In computitional languistics, lemmatisation is the algorhythmic process of determining the lemma of a word based on its intended meaning. Unlike stemming, lemmatisation depends on corectly identifing the intended part of speech and meaning of a word in a sentence, as well as within the larger context surounding that sentence, such as neigboring sentences or even an entire document. As a result, devleoping efficient lemmatisation algorithums is an open area of research. In many langauges, words appear in several inflected forms. For example, in English, the verb 'to walk' may appear as 'walk', 'walked', 'walks' or 'walking'. The base form, 'walk', that one might look up in a dictionery, is called the lemma for the word. The asociation of the base form with a part of speech is often called a lexeme of the word."
+ ]
+
+ selected_text = st.selectbox("Select an example", examples)
+ custom_input = st.text_input("Try it for yourself!")
+
+ # A custom input overrides the selected example
+ if custom_input:
+     selected_text = custom_input
+
+ st.subheader("Selected Text:")
+ st.write(selected_text)
+
+ # Initialize Spark and create pipeline
+ spark = init_spark()
+ pipeline = create_pipeline(model)
+ output = fit_data(pipeline, selected_text)
+
+ # Map each token to its correction, keeping only tokens the model changed
+ correction_dict_filtered = {token: corrected for token, corrected in zip(output['token'], output['corrected']) if token != corrected}
+
+ def generate_html_with_corrections(sentence, corrections):
+     # Note: str.replace substitutes every occurrence of a token, including matches inside longer words
+     corrected_html = sentence
+     for incorrect, correct in corrections.items():
+         corrected_html = corrected_html.replace(incorrect, f'<span style="text-decoration: line-through; color: red;">{incorrect}</span> <span style="color: green;">{correct}</span>')
+     return f'<div style="font-family: Arial, sans-serif; font-size: 16px;">{corrected_html}</div>'
+
+ corrected_html_snippet = generate_html_with_corrections(selected_text, correction_dict_filtered)
+ st.subheader("Misspellings Detected:")
+ st.markdown(corrected_html_snippet, unsafe_allow_html=True)
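
A minimal sketch for trying the demo outside the container, assuming the dependencies from requirements.txt (at the end of this diff) and a Java runtime for Spark (the Dockerfile uses OpenJDK 8) are available:

```bash
# Install the Python dependencies listed in requirements.txt
pip install -r requirements.txt
# Launch the app on the same port the Dockerfile's ENTRYPOINT uses
streamlit run Demo.py --server.port=7860
```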
Dockerfile ADDED
@@ -0,0 +1,70 @@
+ # Download base image ubuntu 18.04
+ FROM ubuntu:18.04
+
+ # Set environment variables
+ ENV NB_USER jovyan
+ ENV NB_UID 1000
+ ENV HOME /home/${NB_USER}
+
+ # Install required packages
+ RUN apt-get update && apt-get install -y \
+     tar \
+     wget \
+     bash \
+     rsync \
+     gcc \
+     libfreetype6-dev \
+     libhdf5-serial-dev \
+     libpng-dev \
+     libzmq3-dev \
+     python3 \
+     python3-dev \
+     python3-pip \
+     unzip \
+     pkg-config \
+     software-properties-common \
+     graphviz \
+     openjdk-8-jdk \
+     ant \
+     ca-certificates-java \
+     && apt-get clean \
+     && update-ca-certificates -f;
+
+ # Install Python 3.8 and pip
+ RUN add-apt-repository ppa:deadsnakes/ppa \
+     && apt-get update \
+     && apt-get install -y python3.8 python3-pip \
+     && apt-get clean;
+
+ # Set up JAVA_HOME
+ ENV JAVA_HOME /usr/lib/jvm/java-8-openjdk-amd64/
+ RUN mkdir -p ${HOME} \
+     && echo "export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/" >> ${HOME}/.bashrc \
+     && chown -R ${NB_UID}:${NB_UID} ${HOME}
+
+ # Create a new user named "jovyan" with user ID 1000
+ RUN useradd -m -u ${NB_UID} ${NB_USER}
+
+ # Switch to the "jovyan" user
+ USER ${NB_USER}
+
+ # Set home and path variables for the user
+ ENV HOME=/home/${NB_USER} \
+     PATH=/home/${NB_USER}/.local/bin:$PATH
+
+ # Set the working directory to the user's home directory
+ WORKDIR ${HOME}
+
+ # Upgrade pip and install Python dependencies
+ RUN python3.8 -m pip install --upgrade pip
+ COPY requirements.txt /tmp/requirements.txt
+ RUN python3.8 -m pip install -r /tmp/requirements.txt
+
+ # Copy the application code into the container at /home/jovyan
+ COPY --chown=${NB_USER}:${NB_USER} . ${HOME}
+
+ # Expose port for Streamlit
+ EXPOSE 7860
+
+ # Define the entry point for the container
+ ENTRYPOINT ["streamlit", "run", "Demo.py", "--server.port=7860", "--server.address=0.0.0.0"]
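
A build-and-run sketch for this image; the image tag is an illustrative choice, not part of the commit:

```bash
# Build the image from the repository root (the tag name is an arbitrary example)
docker build -t spellcheck-demo .
# Publish the exposed Streamlit port on the host
docker run -p 7860:7860 spellcheck-demo
```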
images/johnsnowlabs-output.png ADDED
pages/Workflow & Model Overview.py ADDED
@@ -0,0 +1,234 @@
+ import streamlit as st
+
+ # Custom CSS for better styling
+ st.markdown("""
+     <style>
+         .main-title {
+             font-size: 36px;
+             color: #4A90E2;
+             font-weight: bold;
+             text-align: center;
+         }
+         .sub-title {
+             font-size: 24px;
+             color: #4A90E2;
+             margin-top: 20px;
+         }
+         .section {
+             background-color: #f9f9f9;
+             padding: 15px;
+             border-radius: 10px;
+             margin-top: 20px;
+         }
+         .section h2 {
+             font-size: 22px;
+             color: #4A90E2;
+         }
+         .section p, .section ul {
+             color: #666666;
+         }
+         .link {
+             color: #4A90E2;
+             text-decoration: none;
+         }
+         .sidebar-content {
+             font-size: 16px;
+         }
+     </style>
+ """, unsafe_allow_html=True)
+
+ # Introduction
+ st.markdown('<div class="main-title">Correcting Typos and Spelling Errors with Spark NLP and Python</div>', unsafe_allow_html=True)
+
+ st.markdown("""
+ <div class="section">
+     <p>Correcting typos and spelling errors is an essential task in NLP pipelines. Ensuring data correctness can significantly improve the performance of machine learning models. In this article, we will explore how to perform spell checking using rule-based and machine learning-based models in Spark NLP with Python.</p>
+ </div>
+ """, unsafe_allow_html=True)
+
+ # Background
+ st.markdown('<div class="sub-title">Introduction</div>', unsafe_allow_html=True)
+ st.markdown("""
+ <div class="section">
+     <p>Spell checking identifies words in texts that have spelling errors or are misspelled. Text data from social media or extracted using Optical Character Recognition (OCR) often contains typos, misspellings, or spurious symbols that can impact machine learning models.</p>
+     <p>Having spelling errors in data can reduce model performance. For example, if "John" appears as "J0hn", the model treats them as two separate words, complicating the model and reducing its effectiveness. Spell checking and correction can preprocess data to improve model training.</p>
+ </div>
+ """, unsafe_allow_html=True)
+
+ # Spell Checking in Spark NLP
+ st.markdown('<div class="sub-title">Spell Checking in Spark NLP</div>', unsafe_allow_html=True)
+ st.markdown("""
+ <div class="section">
+     <p>Spark NLP provides three approaches for spell checking and correction:</p>
+     <ul>
+         <li><strong>NorvigSweetingAnnotator:</strong> Based on Peter Norvig’s algorithm with modifications like limiting vowel swapping and using Hamming distance.</li>
+         <li><strong>SymmetricDeleteAnnotator:</strong> Based on the SymSpell algorithm.</li>
+         <li><strong>ContextSpellCheckerAnnotator:</strong> A deep learning model using contextual information for error detection and correction.</li>
+     </ul>
+ </div>
+ """, unsafe_allow_html=True)
+
+ # Example Code
+ st.markdown('<div class="sub-title">Example Code</div>', unsafe_allow_html=True)
+ st.markdown('<p>Here is an example of how to use these models in Spark NLP:</p>', unsafe_allow_html=True)
+
+ # Step-by-step code
+ st.markdown('<div class="sub-title">Setup</div>', unsafe_allow_html=True)
+ st.markdown('<p>To install Spark NLP in Python, use your favorite package manager (conda, pip, etc.). For example:</p>', unsafe_allow_html=True)
+ st.code("""
+ pip install spark-nlp
+ pip install pyspark
+ """, language="bash")
+
+ st.markdown('<p>Then, import Spark NLP and start a Spark session:</p>', unsafe_allow_html=True)
+ st.code("""
+ import sparknlp
+
+ # Start Spark Session
+ spark = sparknlp.start()
+ """, language='python')
+
+ # Step 1: Document Assembler
+ st.markdown('<div class="sub-title">Step 1: Document Assembler</div>', unsafe_allow_html=True)
+ st.markdown('<p>Transform raw texts to document annotation:</p>', unsafe_allow_html=True)
+ st.code("""
+ from sparknlp.base import DocumentAssembler
+
+ documentAssembler = DocumentAssembler()\\
+     .setInputCol("text")\\
+     .setOutputCol("document")
+ """, language='python')
+
+ # Step 2: Tokenization
+ st.markdown('<div class="sub-title">Step 2: Tokenization</div>', unsafe_allow_html=True)
+ st.markdown('<p>Split text into individual tokens:</p>', unsafe_allow_html=True)
+ st.code("""
+ from sparknlp.annotator import Tokenizer
+
+ tokenizer = Tokenizer()\\
+     .setInputCols(["document"])\\
+     .setOutputCol("token")
+ """, language='python')
+
+ # Step 3: Spell Checker Models
+ st.markdown('<div class="sub-title">Step 3: Spell Checker Models</div>', unsafe_allow_html=True)
+ st.markdown('<p>Choose and load one of the spell checker models:</p>', unsafe_allow_html=True)
+
+ st.code("""
+ from sparknlp.annotator import ContextSpellCheckerModel, NorvigSweetingModel, SymmetricDeleteModel
+
+ # One of the spell checker annotators
+ symspell = SymmetricDeleteModel.pretrained("spellcheck_sd")\\
+     .setInputCols(["token"])\\
+     .setOutputCol("symspell")
+
+ norvig = NorvigSweetingModel.pretrained("spellcheck_norvig")\\
+     .setInputCols(["token"])\\
+     .setOutputCol("norvig")
+
+ context = ContextSpellCheckerModel.pretrained("spellcheck_dl")\\
+     .setInputCols(["token"])\\
+     .setOutputCol("context")
+ """, language='python')
+
+ # Step 4: Pipeline Definition
+ st.markdown('<div class="sub-title">Step 4: Pipeline Definition</div>', unsafe_allow_html=True)
+ st.markdown('<p>Define the pipeline stages:</p>', unsafe_allow_html=True)
+ st.code("""
+ from pyspark.ml import Pipeline
+
+ # Define the pipeline stages
+ pipeline = Pipeline().setStages([documentAssembler, tokenizer, symspell, norvig, context])
+ """, language='python')
+
+ # Step 5: Fitting and Transforming
+ st.markdown('<div class="sub-title">Step 5: Fitting and Transforming</div>', unsafe_allow_html=True)
+ st.markdown('<p>Fit the pipeline and transform the data:</p>', unsafe_allow_html=True)
+ st.code("""
+ # Create an empty DataFrame to fit the pipeline
+ empty_df = spark.createDataFrame([[""]]).toDF("text")
+ pipelineModel = pipeline.fit(empty_df)
+
+ # Example text for correction
+ example_df = spark.createDataFrame([["Plaese alliow me tao introdduce myhelf, I am a man of wealth und tiaste"]]).toDF("text")
+ result = pipelineModel.transform(example_df)
+ """, language='python')
+
+ # Step 6: Displaying Results
+ st.markdown('<div class="sub-title">Step 6: Displaying Results</div>', unsafe_allow_html=True)
+ st.markdown('<p>Show the results from the different spell checker models:</p>', unsafe_allow_html=True)
+ st.code("""
+ # Show results
+ result.selectExpr("norvig.result as norvig", "symspell.result as symspell", "context.result as context").show(truncate=False)
+ """, language='python')
+
+ st.markdown("""
+ <p>The output from the example code will show the corrected text using three different models:</p>
+ <table>
+     <tr>
+         <th>norvig</th>
+         <th>symspell</th>
+         <th>context</th>
+     </tr>
+     <tr>
+         <td>[Please, allow, me, tao, introduce, myself, ,, I, am, a, man, of, wealth, und, taste]</td>
+         <td>[Place, allow, me, to, introduce, myself, ,, I, am, a, man, of, wealth, und, taste]</td>
+         <td>[Please, allow, me, to, introduce, myself, ,, I, am, a, man, of, wealth, and, taste]</td>
+     </tr>
+ </table>
+ """, unsafe_allow_html=True)
+
+ # One-liner Alternative
+ st.markdown('<div class="sub-title">One-liner Alternative</div>', unsafe_allow_html=True)
+ st.markdown("""
+ <div class="section">
+     <p>Introducing the <code>johnsnowlabs</code> library: In October 2022, John Snow Labs released a unified open-source library containing all their products under one roof. This includes Spark NLP, Spark NLP Display, and NLU. Simplify your workflow with:</p>
+     <p><code>pip install johnsnowlabs</code></p>
+     <p>For spell checking, use one line of code:</p>
+     <pre>
+     <code class="language-python">
+ # Import the NLP module which contains Spark NLP and NLU libraries
+ from johnsnowlabs import nlp
+ # Use Norvig model
+ nlp.load("en.spell.norvig").predict("Plaese alliow me tao introdduce myhelf, I am a man of wealth und tiaste", output_level='token')
+     </code>
+     </pre>
+ </div>
+ """, unsafe_allow_html=True)
+
+ st.image('images/johnsnowlabs-output.png', use_column_width='auto')
+
+ # Conclusion
+ st.markdown("""
+ <div class="section">
+     <h2>Conclusion</h2>
+     <p>We introduced three models for spell checking and correction in Spark NLP: NorvigSweeting, SymmetricDelete, and ContextSpellChecker. These models can be integrated into Spark NLP pipelines for efficient processing of large datasets.</p>
+ </div>
+ """, unsafe_allow_html=True)
+
+ # References
+ st.markdown('<div class="sub-title">References</div>', unsafe_allow_html=True)
+ st.markdown("""
+ <div class="section">
+     <ul>
+         <li><a class="link" href="https://nlp.johnsnowlabs.com/docs/en/annotators#norvigsweeting-spellchecker" target="_blank" rel="noopener">NorvigSweeting</a> documentation page</li>
+         <li><a class="link" href="https://nlp.johnsnowlabs.com/docs/en/annotators#symmetricdelete-spellchecker" target="_blank" rel="noopener">SymmetricDelete</a> documentation page</li>
+         <li><a class="link" href="https://nlp.johnsnowlabs.com/docs/en/annotators#contextspellchecker" target="_blank" rel="noopener">ContextSpellChecker</a> documentation page</li>
+         <li><a class="link" href="https://medium.com/spark-nlp/applying-context-aware-spell-checking-in-spark-nlp-3c29c46963bc" target="_blank" rel="noopener nofollow">Applying Context Aware Spell Checking in Spark NLP</a></li>
+         <li><a class="link" href="https://towardsdatascience.com/training-a-contextual-spell-checker-for-italian-language-66dda528e4bf" target="_blank" rel="noopener nofollow">Training a Contextual Spell Checker for Italian Language</a></li>
+     </ul>
+ </div>
+ """, unsafe_allow_html=True)
+
+ st.markdown('<div class="sub-title">Community & Support</div>', unsafe_allow_html=True)
+ st.markdown("""
+ <div class="section">
+     <ul>
+         <li><a class="link" href="https://sparknlp.org/" target="_blank">Official Website</a>: Documentation and examples</li>
+         <li><a class="link" href="https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q" target="_blank">Slack</a>: Live discussion with the community and team</li>
+         <li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp" target="_blank">GitHub</a>: Bug reports, feature requests, and contributions</li>
+         <li><a class="link" href="https://medium.com/spark-nlp" target="_blank">Medium</a>: Spark NLP articles</li>
+         <li><a class="link" href="https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos" target="_blank">YouTube</a>: Video tutorials</li>
+     </ul>
+ </div>
+ """, unsafe_allow_html=True)
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ streamlit
+ pandas
+ numpy
+ spark-nlp
+ pyspark