---
title: GPT Transformer Text Generator
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.12.0
app_file: app.py
pinned: false
---

# GPT Transformer Model

This repository contains a GPT-style transformer language model built with PyTorch for natural language generation. The architecture follows GPT-2, and the model has been trained on a custom dataset for text generation.

## Model Overview

The model is a multi-layer, transformer-based neural network made up of the following components (a minimal sketch of how they fit together follows the list):

- **Causal Self-Attention**: masked self-attention in which each token attends only to itself and earlier positions, keeping generation autoregressive.
- **MLP (Feedforward Layer)**: a position-wise feedforward network in each transformer block that lets the model learn complex, non-linear relationships.
- **Layer Normalization**: applied before the attention and feedforward sub-layers (pre-norm, as in GPT-2) to stabilize training.
- **Embedding Layers**: learned token embeddings for the vocabulary plus positional embeddings for each position in the sequence.
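
As a rough illustration, these pieces typically combine into a GPT-2-style block like the sketch below. Class names such as `CausalSelfAttention` and `Block` are illustrative and may not match the names used in this repository:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, n_embd: int, n_head: int, block_size: int):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        # combined query, key, value projection
        self.c_attn = nn.Linear(n_embd, 3 * n_embd)
        self.c_proj = nn.Linear(n_embd, n_embd)
        # causal mask: token i may only attend to tokens <= i
        mask = torch.tril(torch.ones(block_size, block_size))
        self.register_buffer("mask", mask.view(1, 1, block_size, block_size))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.c_attn(x).split(C, dim=2)
        # reshape into (B, n_head, T, head_dim)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(self.mask[:, :, :T, :T] == 0, float("-inf"))
        y = F.softmax(att, dim=-1) @ v
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.c_proj(y)

class Block(nn.Module):
    """One transformer block: pre-norm attention, then pre-norm MLP."""
    def __init__(self, n_embd: int, n_head: int, block_size: int):
        super().__init__()
        self.ln_1 = nn.LayerNorm(n_embd)
        self.attn = CausalSelfAttention(n_embd, n_head, block_size)
        self.ln_2 = nn.LayerNorm(n_embd)
        self.mlp = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd),
            nn.GELU(),
            nn.Linear(4 * n_embd, n_embd),
        )

    def forward(self, x):
        x = x + self.attn(self.ln_1(x))  # residual around attention
        x = x + self.mlp(self.ln_2(x))   # residual around MLP
        return x
```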

## Architecture

- **Embedding Dimension** (`n_embd`): 768
- **Number of Attention Heads** (`n_head`): 12
- **Number of Layers** (`n_layer`): 12
- **Vocabulary Size** (`vocab_size`): 50,257
- **Max Sequence Length** (`block_size`): 1024
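
These values match the GPT-2 (small) configuration; each of the 12 heads therefore works in a 768 / 12 = 64-dimensional subspace. As a sketch, the hyperparameters could be bundled into a config object like the following (the `GPTConfig` name is illustrative, not necessarily the one used in this repository):

```python
from dataclasses import dataclass

@dataclass
class GPTConfig:
    n_embd: int = 768        # embedding dimension
    n_head: int = 12         # attention heads per layer
    n_layer: int = 12        # transformer blocks
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    block_size: int = 1024   # maximum sequence length
```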

The model is trained for text generation and can be fine-tuned on custom data.
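
For illustration, a typical autoregressive sampling loop with the GPT-2 tokenizer from `tiktoken` might look like the sketch below. It assumes the model's forward pass returns logits of shape `(batch, seq_len, vocab_size)`, which may differ from the actual interface in `app.py`:

```python
import torch
import tiktoken

enc = tiktoken.get_encoding("gpt2")  # GPT-2 BPE tokenizer, 50,257 tokens

@torch.no_grad()
def generate(model, prompt, max_new_tokens=50, temperature=0.8, top_k=50):
    model.eval()
    idx = torch.tensor([enc.encode(prompt)], dtype=torch.long)
    for _ in range(max_new_tokens):
        # crop the context to the model's block_size (1024)
        idx_cond = idx[:, -1024:]
        # assumed interface: forward pass returns (batch, seq_len, vocab_size) logits
        logits = model(idx_cond)
        logits = logits[:, -1, :] / temperature  # logits for the last position
        # restrict sampling to the top_k most likely tokens
        v, _ = torch.topk(logits, top_k)
        logits[logits < v[:, [-1]]] = float("-inf")
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)
        idx = torch.cat([idx, next_id], dim=1)
    return enc.decode(idx[0].tolist())
```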

## Requirements

To run the model and perform inference, you will need the following dependencies:

- Python 3.7+
- PyTorch
- Gradio
- Transformers
- tiktoken (for the GPT-2 BPE tokenizer)

You can install the required libraries using:

```bash
pip install torch gradio transformers tiktoken
```
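
The Space's `app.py` wires the model into a Gradio interface. The actual file may differ; a minimal sketch, assuming the `generate` helper from the sampling example above, could look like:

```python
import gradio as gr

def generate_text(prompt, max_new_tokens):
    # hypothetical glue: `model` and `generate` come from the sampling sketch above
    return generate(model, prompt, max_new_tokens=int(max_new_tokens))

demo = gr.Interface(
    fn=generate_text,
    inputs=[
        gr.Textbox(label="Prompt"),
        gr.Slider(1, 200, value=50, step=1, label="Max new tokens"),
    ],
    outputs=gr.Textbox(label="Generated text"),
    title="GPT Transformer Text Generator",
)

if __name__ == "__main__":
    demo.launch()
```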