

NGen3: Next-Generation Foundational Model
NGen3 is a production-level foundational language model inspired by state-of-the-art architectures such as GPT-4, Claude-3, and Llama 2. It is designed to be highly modular, efficient, and accessible via a flexible command-line interface (CLI). NGen3 supports multiple model variants—from 7M parameters to 1B parameters—and offers a comprehensive suite of tools for:
- Tokenization: Process text from local files, URLs, or Hugging Face datasets (a brief sketch follows this overview).
- Training: Train the model on tokenized data.
- Sampling: Generate text from trained models.
- Exporting: Save models and minimal tokenizer configurations in formats compatible with Hugging Face.
- Knowledge Distillation: Train a smaller student model using a larger teacher model.
- Fine-Tuning: Adapt a distilled model on conversational data (from local sources or directly from Hugging Face).
This repository provides a complete implementation of the NGen3 model along with detailed CLI commands to facilitate experimentation and research.
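As a rough sketch of what the tokenization step produces, the snippet below converts raw text (from a local file or a Hugging Face dataset) into a flat binary file of token IDs. The tokenizer (gpt2), file paths, and binary layout are illustrative assumptions; the NGen3 CLI defines its own tokenizer configuration and output format.

```python
# Illustrative sketch only: the NGen3 CLI implements this step itself;
# the tokenizer, paths, and binary layout here are assumptions.
import numpy as np
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder tokenizer

# Read raw text from a local file (path is a placeholder).
with open("corpus.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Or pull text from a Hugging Face dataset instead:
# from datasets import load_dataset
# text = "\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="train")["text"])

ids = tokenizer(text)["input_ids"]                  # encode to token IDs
np.array(ids, dtype=np.uint16).tofile("train.bin")  # flat binary file of token IDs
print(f"wrote {len(ids):,} tokens to train.bin")
```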
Table of Contents
- Model Overview
- Architecture
- Installation
- Usage
- Hyperparameters
- Contributing
- License
- Acknowledgements
Model Overview
NGen3 is designed for rapid development and deployment of foundational language models. Its flexible CLI allows users to:
- Tokenize Text: Convert raw text or datasets into tokenized binary format.
- Train Models: Use various hyperparameter configurations based on the desired model size.
- Generate Samples: Evaluate model performance and generate text samples.
- Export Models: Easily export models in safetensors and JSON configurations for integration with Hugging Face tools.
- Distill Models: Leverage knowledge distillation to compress larger models into efficient student variants (a generic sketch of the distillation objective follows this list).
- Fine-Tune on Conversations: Adapt models to conversational data using both local and Hugging Face datasets.
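Knowledge distillation typically optimizes a blend of a soft-target loss against the teacher's temperature-scaled logits and the standard cross-entropy on ground-truth tokens. The following is a generic sketch of that objective; the temperature, weighting, and exact formulation used by NGen3 may differ.

```python
# Generic distillation-loss sketch; T and alpha are illustrative, not NGen3's exact values.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=2.0, alpha=0.5):
    """Blend soft-target KL loss (teacher -> student) with hard-label cross-entropy."""
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard next-token cross-entropy.
    hard = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), targets.view(-1)
    )
    return alpha * soft + (1 - alpha) * hard
```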
Architecture
NGen3’s architecture is built upon the transformer decoder design. Key components include:
- Token and Positional Embeddings: Learnable embeddings that encode input tokens and their positions.
- Stack of Transformer Blocks: Each block contains:
  - Causal Self-Attention: With multi-head attention and masking to prevent information leakage.
  - MLP (Feed-Forward Network): Utilizes GELU activation for non-linearity.
  - Residual Connections and Layer Normalization: Stabilize training and improve convergence.
- Final Projection Layer: Maps embeddings to logits over the vocabulary.
The model supports variants with parameter counts ranging from 7M to 1B, making it adaptable for various research and production needs.
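The minimal PyTorch sketch below shows how these components fit together: token and positional embeddings, a stack of pre-norm blocks with masked multi-head attention and a GELU MLP, residual connections, and a final projection to vocabulary logits. Module names and default sizes are illustrative assumptions, not NGen3's actual implementation.

```python
# Simplified decoder-only sketch of the components described above (not NGen3's actual code).
import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self, d_model, n_heads):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        T = x.size(1)
        # Boolean causal mask: True above the diagonal blocks attention to future tokens.
        mask = torch.triu(torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1)
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=mask, need_weights=False)
        x = x + attn_out               # residual connection around attention
        x = x + self.mlp(self.ln2(x))  # residual connection around the MLP
        return x

class DecoderLM(nn.Module):
    def __init__(self, vocab_size, d_model=256, n_heads=8, n_layers=4, max_len=1024):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)   # token embeddings
        self.pos_emb = nn.Embedding(max_len, d_model)      # learned positional embeddings
        self.blocks = nn.ModuleList(Block(d_model, n_heads) for _ in range(n_layers))
        self.ln_f = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab_size, bias=False)  # final projection to logits

    def forward(self, idx):
        pos = torch.arange(idx.size(1), device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        for block in self.blocks:
            x = block(x)
        return self.head(self.ln_f(x))  # logits over the vocabulary
```

Scaling between the smaller and larger variants comes down mainly to the choices of d_model, n_heads, and n_layers; the defaults above are arbitrary placeholders.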
Installation
Ensure you have Python 3.8+ installed along with the following packages:
- PyTorch
- transformers
- datasets
- tqdm
- safetensors (for export functionality)
Install the required packages using pip:

```bash
pip install torch transformers datasets tqdm safetensors
```
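After installing, a quick check that the packages import and that a GPU (if any) is visible can save debugging time later. This is an optional sanity check, not part of the NGen3 CLI:

```python
# Optional environment check after installation (not part of the NGen3 CLI).
import torch
import transformers
import datasets
import safetensors  # just confirm it imports

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("transformers:", transformers.__version__, "| datasets:", datasets.__version__)
```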