---
title: GPT Transformer Text Generator
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
---
# GPT Transformer Model
This repository contains a GPT-like transformer model built with PyTorch for natural language generation. The architecture follows GPT-2 and has been trained on a custom dataset for text generation.
## Model Overview
The model is a multi-layer, transformer-based neural network consisting of the following components (a minimal code sketch follows the list):

- **Causal Self-Attention**: The core transformer component; performs masked self-attention so that each position attends only to earlier positions in the input sequence.
- **MLP (Feedforward Layer)**: Applied in each transformer block to help the model learn complex relationships.
- **Layer Normalization**: Applied before each attention and feedforward layer to stabilize training.
- **Embedding Layers**: Token embeddings for words and positional embeddings for the sequence.
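To make these pieces concrete, here is a minimal PyTorch sketch of one such transformer block. The class and variable names (`CausalSelfAttention`, `Block`, `c_attn`, `c_proj`) are illustrative assumptions, not necessarily the identifiers used in this repository:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Multi-head self-attention with a causal mask."""
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.c_attn = nn.Linear(n_embd, 3 * n_embd)  # fused query/key/value projection
        self.c_proj = nn.Linear(n_embd, n_embd)      # output projection

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.c_attn(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim) for multi-head attention
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # causal attention: each position attends only to earlier positions
        y = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.c_proj(y)

class Block(nn.Module):
    """One transformer block: pre-norm attention followed by a pre-norm MLP."""
    def __init__(self, n_embd: int, n_head: int):
        super().__init__()
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head)
        self.ln_2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        x = x + self.attn(self.ln_1(x))  # residual connection around attention
        x = x + self.mlp(self.ln_2(x))   # residual connection around feedforward
        return x
```

Each block wraps both sub-layers in residual connections, with layer normalization applied before (rather than after) each sub-layer, matching the GPT-2 pre-norm design.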
## Architecture
- Embedding Dimension (`n_embd`): 768
- Number of Attention Heads (`n_head`): 12
- Number of Layers (`n_layer`): 12
- Vocabulary Size (`vocab_size`): 50,257
- Max Sequence Length (`block_size`): 1024
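Gathered into a single configuration object, these hyperparameters might look like the following sketch (the `GPTConfig` dataclass itself is an assumption; the field names and values come from the list above):

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    n_embd: int = 768        # embedding dimension
    n_head: int = 12         # attention heads per layer
    n_layer: int = 12        # number of transformer blocks
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    block_size: int = 1024   # maximum sequence length
```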
The model is trained for text generation and can be fine-tuned on custom data.
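At inference time, generation is an autoregressive loop: encode the prompt, predict a distribution over the next token, sample from it, append, and repeat. Below is a hedged sketch assuming a `model` that maps a batch of token IDs to logits of shape `(batch, seq_len, vocab_size)` and the GPT-2 tokenizer from `tiktoken`; the `generate` helper is illustrative, not this repository's exact API:

```python
import torch
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # GPT-2 BPE tokenizer

@torch.no_grad()
def generate(model, prompt: str, max_new_tokens: int = 50, block_size: int = 1024) -> str:
    model.eval()
    ids = torch.tensor([enc.encode(prompt)], dtype=torch.long)
    for _ in range(max_new_tokens):
        # crop the context so it never exceeds the model's max sequence length
        logits = model(ids[:, -block_size:])               # (1, T, vocab_size)
        probs = torch.softmax(logits[:, -1, :], dim=-1)    # next-token distribution
        next_id = torch.multinomial(probs, num_samples=1)  # sample one token ID
        ids = torch.cat([ids, next_id], dim=1)
    return enc.decode(ids[0].tolist())
```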
## Requirements
To run the model and perform inference, you will need the following dependencies:
- Python 3.7+
- PyTorch
- Gradio
- Transformers
- tiktoken (GPT-2 BPE tokenizer)
You can install the required libraries with:

```bash
pip install torch gradio transformers tiktoken
```
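Since the Space metadata points at `app_file: app.py` with the Gradio SDK, the entry point is presumably a small Gradio interface along these lines. This is a sketch built on the `generate` helper above; the actual `app.py` may differ:

```python
import gradio as gr

def complete(prompt: str, max_new_tokens: float) -> str:
    # `model` and `generate` are the assumed names from the sketches above
    return generate(model, prompt, max_new_tokens=int(max_new_tokens))

demo = gr.Interface(
    fn=complete,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(1, 200, value=50, step=1, label="Max new tokens"),
    ],
    outputs=gr.Textbox(label="Generated text"),
    title="GPT Transformer Text Generator",
)

if __name__ == "__main__":
    demo.launch()
```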