tiny-gpt-1-1m

This repository contains a pretrained TinyGPT checkpoint, published for public use and intended for education and experimentation.

Artifacts

tiny_gpt_latest.pt: training checkpoint with model and optimizer state
tokenizer.model: SentencePiece tokenizer used for training and generation
config.json: model configuration serialized from the checkpoint
training_config.yaml: training and MLflow settings used for the run
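The key layout inside tiny_gpt_latest.pt is not documented here, so a cautious first step is to load the file on the CPU and list its top-level entries. The sketch below assumes the checkpoint has been downloaded to the working directory and that it is a dictionary. `load_checkpoint` and `summarize` are illustrative helper names, not part of tiny_gpt_pretrain.

```python
from collections.abc import Mapping


def load_checkpoint(path="tiny_gpt_latest.pt"):
    """Load the checkpoint file onto the CPU (assumes a local copy at `path`)."""
    import torch  # imported lazily so summarize() can be used without PyTorch

    # weights_only=True restricts unpickling to tensors and plain containers.
    return torch.load(path, map_location="cpu", weights_only=True)


def summarize(checkpoint):
    """Describe each top-level entry without assuming any key names."""
    summary = {}
    for key, value in checkpoint.items():
        if isinstance(value, Mapping):
            summary[key] = f"{type(value).__name__} with {len(value)} entries"
        else:
            summary[key] = type(value).__name__
    return summary
```

Running `print(summarize(load_checkpoint()))` shows what the file actually contains. If the checkpoint stores objects other than tensors and plain containers, `weights_only=True` will raise an error instead of loading them.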

How to use

Use with Transformers.

With transformers >= 4.43.0, you can run text generation using the pipeline abstraction or the Auto classes with generate().

Make sure to update your Transformers installation via pip install --upgrade transformers.

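import torch
import transformers

model_id = "vjkhambe/tiny-gpt-1-1m"
# Pipelines take a device index: 0 is the first GPU, -1 is the CPU.
device = 0 if torch.cuda.is_available() else -1

# trust_remote_code=True lets Transformers run custom model code from the repository.
# Older Transformers releases spell the dtype argument `torch_dtype`.
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    dtype=torch.bfloat16,
)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
# Bound generation by the number of new tokens rather than by total length.
model.generation_config.max_length = None
model.generation_config.max_new_tokens = 64

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    device=device,
)

print(pipeline("Hey how are you doing today?"))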
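The text above also mentions using the Auto classes with generate() directly. A minimal sketch of that route follows, reusing the `model` and `tokenizer` objects loaded in the snippet above. `generate_text` is an illustrative helper name, and the sketch assumes the repository's tokenizer returns only inputs that generate() accepts.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Continue `prompt` by calling model.generate() directly."""
    # Tokenize to PyTorch tensors and move them to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # generate() returns a batch of token-id sequences (prompt plus new tokens).
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence in the batch back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text(model, tokenizer, "Hey how are you doing today?")` returns the decoded string, whereas the pipeline returns a list of dictionaries.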

Training details

Base package: tiny_gpt_pretrain
Model and training configuration are stored in the checkpoint and training_config.yaml
The exported checkpoint includes optimizer state, so training can be resumed from it; evaluation needs only the model weights
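Resuming training from a checkpoint like this usually means restoring both the model weights and the optimizer state. The sketch below shows the general PyTorch pattern. The key names "model" and "optimizer" are hypothetical defaults, since the actual layout of tiny_gpt_latest.pt is not documented here, and `restore_training_state` is an illustrative helper, not a tiny_gpt_pretrain function.

```python
def restore_training_state(checkpoint, model, optimizer,
                           model_key="model", optimizer_key="optimizer"):
    """Restore model and optimizer state from a loaded checkpoint dict.

    The default key names are assumptions; pass the real ones after
    inspecting the checkpoint.
    """
    missing = [k for k in (model_key, optimizer_key) if k not in checkpoint]
    if missing:
        raise KeyError(
            f"checkpoint has no {missing}; available keys: {list(checkpoint)}"
        )
    # torch.nn.Module and torch.optim.Optimizer both provide load_state_dict().
    model.load_state_dict(checkpoint[model_key])
    optimizer.load_state_dict(checkpoint[optimizer_key])
```

Here `model` must be built with the same architecture as the checkpointed one, and `optimizer` must be the same optimizer type constructed over that model's parameters. Otherwise the state dictionaries will not match.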

License

Released under the Apache-2.0 license.

Target repo: vjkhambe/tiny-gpt-1-1m

Safetensors: model size 1.61M params, tensor type F32
