#!/usr/bin/env python
# coding: utf-8

# In[1]:


import warnings

import pandas as pd
import panel as pn
from textblob import TextBlob

# load the Tabulator extension used by the n-gram table
pn.extension('tabulator')
warnings.filterwarnings('ignore')


# In[2]:


sample_text = """
Happiness is a very complicated thing. Happiness can be used both in emotional or mental state context and can vary largely from a feeling from contentment to very intense feeling of joy. It can also mean a life of satisfaction, good well-being and so many more. Happiness is a very difficult phenomenon to use words to describe as it is something that can be felt only. Happiness is very important if we want to lead a very good life. Sadly, happiness is absent from the lives of a lot of people nowadays. We all have our own very different concept of happiness. Some of us are of the opinion that we can get happiness through money, others believe they can only get true happiness in relationships, some even feel that happiness can only be gotten when they are excelling in their profession.
As we might probably know, happiness is nothing more than the state of one being content and happy. A lot of people in the past, present and some (even in the future will) have tried to define and explain what they think happiness really is. So far, the most reasonable one is the one that sees happiness as something that can only come from within a person and should not be sought for outside in the world.
Some very important points about happiness are discussed below:
1. Happiness can’t be bought with Money:
A lot of us try to find happiness where it is not. We associate and equate money with happiness. If at all there is happiness in money then all of the rich people we have around us would never feel sad. What we have come to see is that even the rich amongst us are the ones that suffer depression, relationship problems, stress, fear and even anxiousness. A lot of celebrities and successful people have committed suicide, this goes a long way to show that money or fame does not guarantee happiness. This does not mean that it is a bad thing to be rich and go after money. When you have money, you can afford many things that can make you and those around you very happy.
2. Happiness can only come from within:
There is a saying that explains that one can only get true happiness when one comes to the realisation that only one can make himself/herself happy. We can only find true happiness within ourselves and we can’t find it in other people. This saying and its meaning is always hammered on in different places but we still refuse to fully understand it and put it into good use. It is very important that we understand that happiness is nothing more than the state of a person’s mind. Happiness cannot come from all the physical things we see around us. Only we through our positive emotions that we can get through good thoughts have the ability to create true happiness.
Our emotions are created by our thoughts. Therefore, it is very important that we work on having only positive thoughts and this can be achieved when we see life in a positive light."""


# In[3]:


# from nltk.corpus import stopwords
# stoplist = stopwords.words('english') + ['though']
stoplist = ['i',
 'me',
 'my',
 'myself',
 'we',
 'our',
 'ours',
 'ourselves',
 'you',
 "you're",
 "you've",
 "you'll",
 "you'd",
 'your',
 'yours',
 'yourself',
 'yourselves',
 'he',
 'him',
 'his',
 'himself',
 'she',
 "she's",
 'her',
 'hers',
 'herself',
 'it',
 "it's",
 'its',
 'itself',
 'they',
 'them',
 'their',
 'theirs',
 'themselves',
 'what',
 'which',
 'who',
 'whom',
 'this',
 'that',
 "that'll",
 'these',
 'those',
 'am',
 'is',
 'are',
 'was',
 'were',
 'be',
 'been',
 'being',
 'have',
 'has',
 'had',
 'having',
 'do',
 'does',
 'did',
 'doing',
 'a',
 'an',
 'the',
 'and',
 'but',
 'if',
 'or',
 'because',
 'as',
 'until',
 'while',
 'of',
 'at',
 'by',
 'for',
 'with',
 'about',
 'against',
 'between',
 'into',
 'through',
 'during',
 'before',
 'after',
 'above',
 'below',
 'to',
 'from',
 'up',
 'down',
 'in',
 'out',
 'on',
 'off',
 'over',
 'under',
 'again',
 'further',
 'then',
 'once',
 'here',
 'there',
 'when',
 'where',
 'why',
 'how',
 'all',
 'any',
 'both',
 'each',
 'few',
 'more',
 'most',
 'other',
 'some',
 'such',
 'no',
 'nor',
 'not',
 'only',
 'own',
 'same',
 'so',
 'than',
 'too',
 'very',
 's',
 't',
 'can',
 'will',
 'just',
 'don',
 "don't",
 'should',
 "should've",
 'now',
 'd',
 'll',
 'm',
 'o',
 're',
 've',
 'y',
 'ain',
 'aren',
 "aren't",
 'couldn',
 "couldn't",
 'didn',
 "didn't",
 'doesn',
 "doesn't",
 'hadn',
 "hadn't",
 'hasn',
 "hasn't",
 'haven',
 "haven't",
 'isn',
 "isn't",
 'ma',
 'mightn',
 "mightn't",
 'mustn',
 "mustn't",
 'needn',
 "needn't",
 'shan',
 "shan't",
 'shouldn',
 "shouldn't",
 'wasn',
 "wasn't",
 'weren',
 "weren't",
 'won',
 "won't",
 'wouldn',
 "wouldn't",
 'though']


# In[4]:


def get_sentiment(text):
    # build the TextBlob once and read both scores from it
    blob = TextBlob(text)
    return pn.pane.Markdown(f"""
    Polarity (range from -1 negative to 1 positive): {blob.polarity} \n
    Subjectivity (range from 0 objective to 1 subjective): {blob.subjectivity}
    """)


# In[5]:


def get_ngram(text):
    from sklearn.feature_extraction.text import CountVectorizer
    # count bigrams and trigrams, ignoring stop words
    c_vec = CountVectorizer(stop_words=stoplist, ngram_range=(2, 3))
    try:
        ngrams = c_vec.fit_transform([text])
    except ValueError:  # empty vocabulary (e.g. fewer than 2 words): return an empty table
        return pn.widgets.Tabulator(width=600)
    # total frequency of each n-gram (one column per n-gram)
    count_values = ngrams.toarray().sum(axis=0)
    # mapping of n-gram -> column index
    vocab = c_vec.vocabulary_
    # one row per n-gram, most frequent first
    df_ngram = pd.DataFrame(
        sorted([(count_values[i], k) for k, i in vocab.items()], reverse=True),
        columns=['frequency', 'bigram/trigram'],
    )
    df_ngram['polarity'] = df_ngram['bigram/trigram'].apply(lambda x: TextBlob(x).polarity)
    df_ngram['subjective'] = df_ngram['bigram/trigram'].apply(lambda x: TextBlob(x).subjectivity)
    return pn.widgets.Tabulator(df_ngram, width=600, height=300)
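

# Illustration only: a minimal standard-library sketch of the kind of counting
# that get_ngram() delegates to CountVectorizer. It is not scikit-learn's
# implementation; it assumes lowercasing, tokens of two or more word
# characters, and stop-word removal before n-grams are formed.
# The name sketch_count_ngrams is hypothetical and is not used by the app.
import re
from collections import Counter


def sketch_count_ngrams(text, stop_words=(), n_values=(2, 3)):
    """Count word n-grams after lowercasing and dropping stop words."""
    stop = set(stop_words)
    # keep tokens of two or more word characters that are not stop words
    tokens = [t for t in re.findall(r"\b\w\w+\b", text.lower()) if t not in stop]
    counts = Counter()
    for n in n_values:
        # slide a window of n tokens over the filtered token list
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts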


# In[6]:


def get_ntopics(text, ntopics):
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import NMF
    from sklearn.pipeline import make_pipeline
    tfidf_vectorizer = TfidfVectorizer(stop_words=stoplist, ngram_range=(2, 3))
    nmf = NMF(n_components=ntopics)
    pipe = make_pipeline(tfidf_vectorizer, nmf)
    try:
        pipe.fit([text])
    except ValueError:  # empty vocabulary (e.g. fewer than 2 words): return an empty result
        return pn.pane.Markdown("")
    # get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
    feature_names = tfidf_vectorizer.get_feature_names_out()
    message = ""
    for topic_idx, topic in enumerate(nmf.components_):
        message += "#### Topic #%d: " % topic_idx
        # the three highest-weighted n-grams for this topic
        message += ", ".join([feature_names[i]
                              for i in topic.argsort()[:-3 - 1:-1]])
        message += "\n"
    return pn.pane.Markdown(message)


# In[7]:


explanation = pn.pane.Markdown("""
This app runs a simple text analysis on input text or an uploaded text file. \n
- Sentiment analysis uses [TextBlob](https://textblob.readthedocs.io/).
- N-gram analysis uses [scikit-learn](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html) to see which words show up together.
- Topic modeling uses the [scikit-learn](https://scikit-learn.org/stable/auto_examples/applications/plot_topics_extraction_with_nmf_lda.html) NMF model; use the slider to change the number of topics shown in the result.
""")

def get_text_results(_):
    # replace newlines with spaces once (so words on adjacent lines are not
    # fused together) and reuse the cleaned text for every analysis
    text = text_widget.value.replace("\n", " ")
    return pn.Column(
        explanation,
        pn.pane.Markdown("## Sentiment analysis:"),
        get_sentiment(text),
        pn.pane.Markdown("## N-gram analysis (bigram/trigram):"),
        get_ngram(text),
        pn.pane.Markdown("## Topic modeling:"),
        get_ntopics(text, ntopics_widget.value)
    )


# In[8]:


button = pn.widgets.Button(name="Click me to run!")


# In[9]:


file_input_widget = pn.widgets.FileInput()


def update_text_widget(event):
    # event.new holds the uploaded file's raw bytes; skip empty uploads
    if event.new:
        text_widget.value = event.new.decode("utf-8")


# when the value of file_input_widget changes,
# run this function to update the text of the text widget
file_input_widget.param.watch(update_text_widget, "value")
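

# Illustration only: update_text_widget() assumes the upload is valid UTF-8.
# A more tolerant decode might look like the sketch below. The name
# sketch_decode_upload is hypothetical and is not used by the app; the choice
# to replace undecodable bytes rather than raise is an assumption.
def sketch_decode_upload(raw, encoding="utf-8"):
    """Return uploaded bytes as text, or an empty string for an empty upload."""
    if not raw:
        return ""
    # errors="replace" substitutes U+FFFD for undecodable bytes instead of raising
    return raw.decode(encoding, errors="replace")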


# In[10]:


text_widget = pn.widgets.TextAreaInput(value=sample_text, height=300, name='Add text')


# In[11]:


ntopics_widget = pn.widgets.IntSlider(name='Number of topics', start=2, end=10, step=1, value=3)


# In[12]:


interactive = pn.bind(get_text_results, button)


# Layout using Template
template = pn.template.FastListTemplate(
    title='Simple Text Analysis', 
    sidebar=[
        button,
        ntopics_widget, 
        text_widget, 
        "Upload a text file",
        file_input_widget
    ],
    main=[pn.panel(interactive, loading_indicator=True)],
    accent_base_color="#88d8b0",
    header_background="#88d8b0",
)
template.servable()


# In[ ]: