
TinyStories-Korean-800K

A tiny autoregressive language model trained from scratch.

  • Architecture: Llama
  • Vocab size: 4096
  • Hidden size: 64
  • Layers: 5
  • Heads: 8 (MHA)
  • Context length: up to 512 tokens
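Given these hyperparameters, the parameter count can be roughly reconstructed. The sketch below assumes untied input/output embeddings and an intermediate (MLP) size of 192, which is not stated in the card but makes the total land at the reported ~791K parameters:

```python
# Rough parameter count for a small Llama-style model.
# NOTE: inter (intermediate_size) = 192 and untied embeddings are
# assumptions, not stated in the model card.
vocab, hidden, layers, inter = 4096, 64, 5, 192

embed = vocab * hidden       # token embedding table
lm_head = vocab * hidden     # output projection (if untied)
attn = 4 * hidden * hidden   # q, k, v, o projections per layer
mlp = 3 * hidden * inter     # gate, up, down (SwiGLU) per layer
norms = 2 * hidden           # two RMSNorm weights per layer

total = embed + lm_head + layers * (attn + mlp + norms) + hidden  # + final norm
print(total)  # 791232, i.e. ~791K
```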

Note

This model was trained for experimental purposes, and its generated output may not make sense at all.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('northwind33/TinyStories-Korean-800K')
tokenizer = AutoTokenizer.from_pretrained('northwind33/TinyStories-Korean-800K')

input_text = ''  # empty prompt: generation starts from the BOS token
input_ids = tokenizer(input_text, return_tensors='pt').input_ids

output = model.generate(input_ids, max_length=512, do_sample=True, temperature=0.5)
print(tokenizer.decode(output[0], skip_special_tokens=True))
Model size: 791k params · Tensor type: F32 (Safetensors)
