Instructions to use TNSA/NGen2-30M with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use TNSA/NGen2-30M with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="TNSA/NGen2-30M")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("TNSA/NGen2-30M", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use TNSA/NGen2-30M with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "TNSA/NGen2-30M" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "TNSA/NGen2-30M", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/TNSA/NGen2-30M
- SGLang
How to use TNSA/NGen2-30M with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "TNSA/NGen2-30M" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "TNSA/NGen2-30M", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "TNSA/NGen2-30M" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "TNSA/NGen2-30M", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use TNSA/NGen2-30M with Docker Model Runner:
docker model run hf.co/TNSA/NGen2-30M
license: other
license_name: ngen2-community-license
license_link: https://tnsaai-builds.framer.website/community/licenses/ngen2
language:
- en
- hi
- te
metrics:
- bleu
- perplexity
- accuracy
base_model:
- TNSA/NGen2-15M
pipeline_tag: text-generation
library_name: transformers
model_type: safetensors
new_version: TNSA/NGen3-15M
NGen 2
While using with transformers you can only use the 15M variant for now.
NGen 2 is an advanced Transformer model training pipeline that supports multiple model variants. It ranges from a nano variant (approximately 120M parameters) to a foundational variant (approximately 1B parameters). The pipeline incorporates modern architectural improvements such as rotary positional embeddings, RMSNorm, and GEGLU activations to boost performance and training efficiency.
Note: Although NGen 3 is designed to train a 1B-parameter model, its advanced architecture pushes its performance closer to that of much larger models.
Model Variants
NGen 2 supports the following variants via the --variant flag:
- nano: ~120M parameters
- small: ~300M parameters
- medium: ~500M parameters
- large: ~700M parameters
- foundational: ~1B parameters
Each variant adjusts key hyperparameters such as the number of layers, model dimension (d_model), number of attention heads (n_heads), and the feed-forward dimension (d_ff).
Requirements
- Python 3.8+
- PyTorch
- Transformers
- Datasets
- DeepSpeed (optional, for efficient training)
- Azure ML SDK (for distributed training on Azure)
Install dependencies using pip (adjust as needed):
pip install torch transformers datasets deepspeed azureml-core
Usage
1. Data Preparation
First, download and preprocess the OpenWebText dataset:
python prepare.py --output_dir ./_data_ --max_length 4096
This script downloads, tokenizes, and saves the dataset in Arrow format to the ./data directory.
2. Local Training
The main training script is train.py. It loads the processed dataset (by default from ./data), instantiates the desired model variant, and starts training.
Example CLI Commands
- Train the nano (120M) variant:
python train.py --dataset_dir ./_data_ --output_dir ./checkpoints_nano --batch_size 4 --epochs 3 --variant nano
- Train the small (300M) variant:
python train.py --dataset_dir ./_data_ --output_dir ./checkpoints_small --batch_size 4 --epochs 3 --variant small
- Train the medium (500M) variant:
python train.py --dataset_dir ./_data_ --output_dir ./checkpoints_medium --batch_size 4 --epochs 3 --variant medium
- Train the large (700M) variant:
python train.py --dataset_dir ./_data_ --output_dir ./checkpoints_large --batch_size 4 --epochs 3 --variant large
- Train the foundational (1B) variant with rotary embeddings enabled:
python train.py --dataset_dir ./_data_ --output_dir ./checkpoints_foundational --batch_size 4 --epochs 3 --variant foundational --use_rotary
3. Training on Azure ML
- Step 1: Set Up Azure ML Resources
Use azure_setup.py to create or connect to your Azure ML workspace and set up a compute cluster:
python azure_setup.py \
--workspace_name MyWorkspace \
--resource_group MyResourceGroup \
--subscription_id YOUR_SUBSCRIPTION_ID \
--location eastus \
--compute_name gpu-cluster \
--vm_size Standard_NC6 \
--max_nodes 4 \
--min_nodes 0
- Step 2: Submit a Training Job to Azure ML
Use
submit_train.pyto submit your training script to Azure ML:
python submit_train.py \
--experiment_name ngen3-experiment \
--compute_target gpu-cluster \
--script train.py \
--dataset_dir ./_data_ \
--output_dir ./checkpoints_foundational \
--batch_size 4 \
--epochs 3 \
--variant foundational \
--use_rotary
4. DeepSpeed Integration
The deepspeed.json file configures mixed-precision training and ZeRO optimizations. To leverage DeepSpeed, ensure it is installed and adjust your training script or submission command to enable DeepSpeed support.
License
License The NGen 2 project is developed and maintained by TNSA AI. The licensing model is dual:
- The nano and small variants are open source and released under the MIT License.
- The medium, large, and foundational variants are proprietary and are not open source. Use of these proprietary components is subject to TNSA AI's proprietary licensing terms.
Copyright
© 2023 TNSA AI. All rights reserved. for Use read LICENCE in the LICENSE file