import streamlit as st
from transformers import pipeline
import torch

# Page title
st.title('Transformers and Pretrained Models in NLP')

# Transformer Architecture: Attention is All You Need
st.header('1. Transformer Architecture')
st.subheader('Definition:')
st.write("""
The **Transformer architecture** revolutionized NLP by using a mechanism called **self-attention** to handle sequences of data in parallel.
This architecture eliminates the need for recurrent structures like RNNs and LSTMs, allowing for faster training and better handling of long-range dependencies.
- **Self-attention** allows each word in the sequence to focus on every other word and assign weights based on their importance (a toy numeric demo follows below).
- The **encoder-decoder** structure is used in tasks like translation, where the encoder processes input sequences and the decoder generates the output.
The Transformer model was introduced in the paper "**Attention is All You Need**" (Vaswani et al., 2017), which was a groundbreaking contribution to NLP.
""")

st.subheader('Key Components of the Transformer:')
st.write("""
- **Encoder**: The encoder processes input tokens and generates an internal representation of the sequence.
- **Decoder**: The decoder uses the encoder's representation to generate the output sequence, word by word.
- **Multi-head Attention**: Allows the model to focus on different parts of the sequence at the same time, improving the model's understanding.
- **Positional Encoding**: Since transformers process tokens in parallel, positional encoding is used to give the model information about the order of words in a sequence (a toy sketch follows below).
""")

# Pretrained Models: BERT, GPT, RoBERTa, ALBERT, T5, XLNet, etc.
st.header('2. Pretrained Models')
st.subheader('Definition:')
st.write("""
**Pretrained models** are models that have been trained on large corpora and can be fine-tuned for specific NLP tasks. These models have learned general language patterns and can be adapted for specific applications (a small tokenizer demo follows below).
- **BERT (Bidirectional Encoder Representations from Transformers)**: BERT uses a bidirectional approach to learn from both the left and right context of a word. It is commonly fine-tuned for tasks like question answering, sentiment analysis, and named entity recognition (NER).
- **GPT (Generative Pre-trained Transformer)**: GPT is a unidirectional model trained to predict the next word in a sentence. It excels in text generation tasks.
- **RoBERTa**: A robustly optimized version of BERT, which improves upon BERT by training on more data and for longer periods.
- **ALBERT**: A smaller and more efficient version of BERT with fewer parameters, while maintaining similar performance.
- **T5**: A text-to-text framework where all tasks are framed as converting input text into target text (e.g., translation, summarization).
- **XLNet**: An autoregressive model that uses permutation language modeling to capture dependencies across positions in a sequence from both directions.
- **DistilBERT**: A smaller, faster, and cheaper distilled version of BERT that retains much of its performance.
- **BioBERT**: A variant of BERT further pretrained on biomedical text (e.g., PubMed abstracts) for tasks like named entity recognition and relation extraction.
""")

# Example: Using a Pretrained Model for Sentiment Analysis
st.subheader('Pretrained Model Example: Sentiment Analysis with a Fine-tuned BERT Variant')
# Use the Hugging Face pipeline API. Note: "bert-base-uncased" has no fine-tuned
# classification head, so we load a checkpoint fine-tuned for sentiment
# (a distilled BERT model trained on SST-2).
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
nlp = pipeline("sentiment-analysis", model=model_name)
text = st.text_area("Enter a text for sentiment analysis", "Transformers are amazing!")
if st.button('Analyze Sentiment'):
    result = nlp(text)[0]  # the pipeline returns a list with one dict per input
    st.write(f"Sentiment: {result['label']} (confidence: {result['score']:.2f})")

# Fine-tuning Pretrained Models: Sentiment Analysis, Named Entity Recognition, Question Answering
st.header('3. Fine-tuning Pretrained Models')
st.subheader('Definition:')
st.write("""
**Fine-tuning** is the process of taking a pretrained model and continuing its training on a specific task. This allows the model to adapt to the specifics of the task without needing to be trained from scratch.
- **Sentiment Analysis**: Determining whether a text expresses a positive, negative, or neutral sentiment.
- **Named Entity Recognition (NER)**: Identifying entities such as names, locations, organizations, dates, etc., in a text.
- **Question Answering**: Given a question and a context, the model finds the answer in the context.
Fine-tuning trains the model on a smaller, task-specific dataset, starting from the pretrained weights rather than from random initialization (an outline follows below).
""")

# Fine-tuning Example 1: Sentiment Analysis (already handled above)
# Fine-tuning Example 2: Named Entity Recognition (NER)
st.subheader('NER Example (Named Entity Recognition with BERT)')
# Use a pretrained NER pipeline
nlp_ner = pipeline("ner", model="dbmdz/bert-large-cased-finetuned-conll03-english")
text_ner = st.text_area("Enter a sentence for Named Entity Recognition", "Barack Obama was born in Hawaii.")
if st.button('Perform NER'):
    ner_results = nlp_ner(text_ner)
    st.write("Named Entity Recognition Results:")
    for entity in ner_results:
        st.write(f"{entity['word']} - {entity['entity']} - Confidence: {entity['score']:.2f}")

# Fine-tuning Example 3: Question Answering
st.subheader('Question Answering Example (BERT)')
# Use a pretrained Question Answering pipeline
nlp_qa = pipeline("question-answering", model="bert-large-uncased-whole-word-masking-finetuned-squad")
context = st.text_area("Enter a context paragraph",
                       "Transformers have revolutionized the field of NLP by providing more efficient models for text classification, generation, and other tasks.")
question = st.text_input("Enter a question related to the context", "What have transformers revolutionized?")
if st.button('Get Answer'):
    answer = nlp_qa(question=question, context=context)
    st.write(f"Answer: {answer['answer']}")