tiny-gpt-1-1m

This repository contains a pretrained TinyGPT checkpoint, published for educational and experimental use.

Artifacts

  • tiny_gpt_latest.pt: training checkpoint with model and optimizer state
  • tokenizer.model: SentencePiece tokenizer used for training and generation
  • config.json: model configuration serialized from the checkpoint
  • training_config.yaml: training and MLflow settings used for the run
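
To see what tiny_gpt_latest.pt actually contains before relying on it, a small inspection helper can be useful. The sketch below is illustrative only: it assumes the checkpoint is a dictionary whose values are (possibly nested) state dicts, which this README does not confirm, and the helper names count_parameters, summarize_checkpoint, and load_checkpoint are hypothetical and not part of tiny_gpt_pretrain.

```python
from collections.abc import Mapping


def count_parameters(state) -> int:
    """Recursively sum numel() over every tensor-like leaf in a nested mapping."""
    if hasattr(state, "numel"):
        return int(state.numel())
    if isinstance(state, Mapping):
        return sum(count_parameters(value) for value in state.values())
    return 0  # ints, strings, lists, and other non-tensor entries are ignored


def summarize_checkpoint(ckpt: Mapping) -> dict:
    """Map each top-level checkpoint key to the tensor element count under it."""
    return {key: count_parameters(value) for key, value in ckpt.items()}


def load_checkpoint(path: str = "tiny_gpt_latest.pt"):
    """Load the checkpoint on CPU (assumes a local copy of the file)."""
    import torch  # imported lazily so the helpers above work without PyTorch

    # weights_only=False unpickles arbitrary Python objects, which checkpoints
    # with optimizer state often need. Only use it on files you trust.
    return torch.load(path, map_location="cpu", weights_only=False)


# Example usage, assuming the file has been downloaded next to this script:
#   ckpt = load_checkpoint()
#   for key, n in summarize_checkpoint(ckpt).items():
#       print(f"{key}: {n:,} tensor elements")
```

The summary only counts tensor elements, so scalar entries such as step counters show up as 0.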

How to use

Use with Transformers.

With transformers >= 4.43.0, you can run text generation either through the pipeline abstraction or through the Auto classes with generate().

Make sure to update your Transformers installation via pip install --upgrade transformers.

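import torch
import transformers

model_id = "vjkhambe/tiny-gpt-1-1m"
# Pipeline device index: first GPU if available, otherwise CPU (-1).
device = 0 if torch.cuda.is_available() else -1

# trust_remote_code=True allows custom model code from the repo to run locally.
# Note: `dtype` is the keyword in recent Transformers releases; older ones use `torch_dtype`.
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    dtype=torch.bfloat16,
)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
# Limit generation by the number of new tokens rather than by total length.
model.generation_config.max_length = None
model.generation_config.max_new_tokens = 64

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
)

print(pipeline("Hey how are you doing today?"))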
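
The text above also mentions the Auto classes with generate(). A minimal sketch of that route follows, assuming the repo's custom tokenizer returns input_ids (and optionally attention_mask) like a standard Transformers tokenizer. The helper names load_model_and_tokenizer and generate_text are hypothetical, introduced only for illustration.

```python
def load_model_and_tokenizer(model_id: str = "vjkhambe/tiny-gpt-1-1m"):
    """Load the model and tokenizer with the Auto classes (downloads from the Hub)."""
    # Imported lazily so generate_text below can be used with any compatible objects.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    return model, tokenizer


def generate_text(model, tokenizer, prompt: str, max_new_tokens: int = 64) -> str:
    """Tokenize the prompt, call generate(), and decode the first returned sequence."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs.get("attention_mask"),  # None if the tokenizer omits it
        max_new_tokens=max_new_tokens,
    )
    # The returned sequence contains the prompt tokens followed by the new tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (requires network access to the Hub):
#   model, tokenizer = load_model_and_tokenizer()
#   print(generate_text(model, tokenizer, "Hey how are you doing today?"))
```

Calling generate() directly gives finer control over decoding arguments than the pipeline, at the cost of handling tokenization and decoding yourself.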

Training details

  • Base package: tiny_gpt_pretrain
  • Model and training configuration are stored in the checkpoint and training_config.yaml
  • The exported checkpoint includes optimizer state, so training can be resumed or continued as fine-tuning; evaluation needs only the model weights
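
Because the checkpoint carries optimizer state, resuming training amounts to restoring both state dicts. The sketch below shows one way this might look. The key names model_state_dict and optimizer_state_dict are assumptions (this README does not document the checkpoint layout), and restore_training_state is a hypothetical helper, not a tiny_gpt_pretrain function.

```python
def restore_training_state(
    model,
    optimizer,
    ckpt,
    model_key: str = "model_state_dict",          # assumed key name
    optimizer_key: str = "optimizer_state_dict",  # assumed key name
):
    """Load model and optimizer state from a checkpoint dictionary.

    Raises KeyError listing the available keys if an assumed key is missing,
    so the real layout is easy to discover.
    """
    for key in (model_key, optimizer_key):
        if key not in ckpt:
            raise KeyError(f"{key!r} not in checkpoint; available keys: {list(ckpt)}")
    model.load_state_dict(ckpt[model_key])
    optimizer.load_state_dict(ckpt[optimizer_key])
    return model, optimizer


# Example usage with PyTorch objects (model must match the checkpoint's architecture):
#   ckpt = torch.load("tiny_gpt_latest.pt", map_location="cpu", weights_only=False)
#   restore_training_state(model, optimizer, ckpt)
```

If the KeyError fires, pass the actual key names through model_key and optimizer_key.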

License

Released under the Apache-2.0 license.

Target repo: vjkhambe/tiny-gpt-1-1m

Model size: 1.61M params (Safetensors, tensor type F32)