---
title: GPT Transformer Text Generator
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
---

# GPT Transformer Model

This repository contains a GPT-style transformer language model built with PyTorch for natural language generation. The architecture follows GPT-2, and the model has been trained on a custom dataset for text generation.

## Model Overview

The model is a multi-layer, transformer-based neural network made up of the following components (a minimal sketch of how they fit together follows the list):

- **Causal Self-Attention**: masked self-attention in which each token attends only to itself and earlier positions, keeping generation autoregressive.
- **MLP (Feedforward Layer)**: a position-wise feedforward network in each transformer block that lets the model learn complex, non-linear relationships.
- **Layer Normalization**: applied before the attention and feedforward sub-layers (pre-norm, as in GPT-2) to stabilize training.
- **Embedding Layers**: learned token embeddings for the vocabulary plus positional embeddings for each position in the sequence.
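
As a rough illustration, these pieces typically combine into a GPT-2-style block like the sketch below. Class names such as `CausalSelfAttention` and `Block` are illustrative and may not match the names used in this repository:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd: int, n_head: int, block_size: int):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        # combined query, key, value projection
        self.c_attn = nn.Linear(n_embd, 3 * n_embd)
        self.c_proj = nn.Linear(n_embd, n_embd)
        # causal mask: token i may only attend to tokens <= i
        mask = torch.tril(torch.ones(block_size, block_size))
        self.register_buffer("mask", mask.view(1, 1, block_size, block_size))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.c_attn(x).split(C, dim=2)
        # reshape into (B, n_head, T, head_dim)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        y = F.softmax(att, dim=-1) @ v
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.c_proj(y)

class Block(nn.Module):
    """One transformer block: pre-norm attention, then pre-norm MLP."""
    def __init__(self, n_embd: int, n_head: int, block_size: int):
        super().__init__()
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head, block_size)
        self.ln_2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        x = x + self.attn(self.ln_1(x))  # residual around attention
        x = x + self.mlp(self.ln_2(x))   # residual around MLP
        return x
```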

## Architecture

- **Embedding Dimension** (`n_embd`): 768
- **Number of Attention Heads** (`n_head`): 12
- **Number of Layers** (`n_layer`): 12
- **Vocabulary Size** (`vocab_size`): 50,257
- **Max Sequence Length** (`block_size`): 1024
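
These values match the GPT-2 (small) configuration; each of the 12 heads therefore works in a 768 / 12 = 64-dimensional subspace. As a sketch, the hyperparameters could be bundled into a config object like the following (the `GPTConfig` name is illustrative, not necessarily the one used in this repository):

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    n_embd: int = 768        # embedding dimension
    n_head: int = 12         # attention heads per layer
    n_layer: int = 12        # transformer blocks
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    block_size: int = 1024   # maximum sequence length
```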

The model is trained for text generation and can be fine-tuned on custom data.
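
For illustration, a typical autoregressive sampling loop with the GPT-2 tokenizer from `tiktoken` might look like the sketch below. It assumes the model's forward pass returns logits of shape `(batch, seq_len, vocab_size)`, which may differ from the actual interface in `app.py`:

```python
import torch
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # GPT-2 BPE tokenizer, 50,257 tokens

@torch.no_grad()
def generate(model, prompt, max_new_tokens=50, temperature=0.8, top_k=50):
    model.eval()
    idx = torch.tensor([enc.encode(prompt)], dtype=torch.long)
    for _ in range(max_new_tokens):
        # crop the context to the model's block_size (1024)
        idx_cond = idx[:, -1024:]
        # assumed interface: forward pass returns (batch, seq_len, vocab_size) logits
        logits = model(idx_cond)
        logits = logits[:, -1, :] / temperature  # logits for the last position
        # restrict sampling to the top_k most likely tokens
        v, _ = torch.topk(logits, top_k)
        logits[logits < v[:, [-1]]] = float("-inf")
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)
    return enc.decode(idx[0].tolist())
```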

## Requirements

To run the model and perform inference, you will need the following dependencies:

- Python 3.7+
- PyTorch
- Gradio
- Transformers
- tiktoken (for the GPT-2 BPE tokenizer)

You can install the required libraries using:

```bash
pip install torch gradio transformers tiktoken
```
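
The Space's `app.py` wires the model into a Gradio interface. The actual file may differ; a minimal sketch, assuming the `generate` helper from the sampling example above, could look like:

```python
import gradio as gr

def generate_text(prompt, max_new_tokens):
    # hypothetical glue: `model` and `generate` come from the sampling sketch above
    return generate(model, prompt, max_new_tokens=int(max_new_tokens))

demo = gr.Interface(
    fn=generate_text,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(1, 200, value=50, step=1, label="Max new tokens"),
    ],
    outputs=gr.Textbox(label="Generated text"),
    title="GPT Transformer Text Generator",
)

if __name__ == "__main__":
    demo.launch()
```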