roneneldan/TinyStories
TinyTale-15M is a 15-million-parameter model with a custom GPT-2 architecture, trained entirely from scratch on the TinyStories dataset. It serves as a baseline pipeline experiment covering custom architecture initialization, offline dataset tokenization, and hardware-accelerated processing on an NVIDIA A10 GPU workstation.
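To give a feel for where a parameter budget of this size goes, here is a rough sketch that counts the weights of a standard GPT-2 style decoder with tied input/output embeddings. The configuration values in the usage lines (vocabulary size, context length, width, depth) are hypothetical and chosen only for illustration; they are not the published TinyTale-15M hyperparameters.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_embd: int, n_layer: int) -> int:
    """Count the parameters of a standard GPT-2 style decoder.

    Assumes the usual GPT-2 layout: learned position embeddings, a 4x MLP
    expansion, biases on every linear layer, and an output head tied to
    the token embedding (so it adds no extra weights).
    """
    d = n_embd
    token_embedding = vocab_size * d
    position_embedding = n_positions * d

    # Per block: fused QKV projection plus attention output projection.
    attention = (3 * d * d + 3 * d) + (d * d + d)
    # Per block: MLP up-projection (d -> 4d) and down-projection (4d -> d).
    mlp = (4 * d * d + 4 * d) + (4 * d * d + d)
    # Per block: two LayerNorms, each with a weight and a bias vector.
    layer_norms = 2 * (2 * d)
    per_block = attention + mlp + layer_norms

    final_layer_norm = 2 * d
    return token_embedding + position_embedding + n_layer * per_block + final_layer_norm


# Hypothetical small configuration (illustrative only, not the real model config).
total = gpt2_param_count(vocab_size=50257, n_positions=512, n_embd=256, n_layer=3)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

With a GPT-2 sized vocabulary, most of the budget in a configuration like this sits in the token embedding table rather than in the transformer blocks, which is typical for models at this scale.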
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("agentbyumer/TinyTale-15M")
model = AutoModelForCausalLM.from_pretrained("agentbyumer/TinyTale-15M")

# Tokenize the prompt into PyTorch tensors
prompt = "Once upon a time, a small puppy found a shiny key"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a continuation; max_length counts the prompt tokens as well
output = model.generate(**inputs, max_length=100, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0], skip_special_tokens=True))