Alcyone v0
A ~4M parameter GPT-2 style language model, pretrained from scratch on TinyStories as an educational exercise.
Named after the brightest star in the Pleiades cluster — part of a star-named model family.
Why this exists
Most "trained a model" portfolio entries are fine-tunes of large pretrained checkpoints. This is different: every weight in this model started as a random number. The goal was to walk through the full pretraining loop end-to-end — custom BPE tokenizer, randomly initialized GPT-2 architecture, causal LM objective, training loop, evaluation — on a free Google Colab T4 GPU.
The output isn't competitive with anything. The understanding is the deliverable.
Model details
| Property | Value |
| --- | --- |
| Architecture | GPT-2 (decoder-only Transformer) |
| Parameters | ~4.22 M |
| Layers | 4 |
| Embedding dim | 256 |
| Attention heads | 4 |
| Context length | 128 tokens |
| Vocab size | 4,000 |
| Tokenizer | Byte-level BPE, trained from scratch on TinyStories |
| Initialization | Random (no pretrained weights) |
| Training objective | Causal language modeling (next-token prediction) |
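The ~4.22 M figure can be reproduced from the table with a little arithmetic. The sketch below assumes the standard GPT-2 layout: learned position embeddings, a 4x MLP expansion, biases on every linear layer, and an output head tied to the token embedding. These are assumptions rather than details confirmed by this card, and `gpt2_param_count` is a hypothetical helper, not a library function.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer, mlp_ratio=4):
    """Count parameters of a GPT-2 style decoder with a tied output head (assumed)."""
    d = n_embd
    token_emb = vocab_size * d                   # token embedding table (shared with the LM head)
    pos_emb = n_positions * d                    # learned position embeddings
    attn = (d * 3 * d + 3 * d) + (d * d + d)     # fused QKV projection + output projection
    hidden = mlp_ratio * d
    mlp = (d * hidden + hidden) + (hidden * d + d)  # up-projection + down-projection
    layer_norms = 2 * (2 * d)                    # two LayerNorms per block (weight + bias)
    per_layer = attn + mlp + layer_norms
    final_ln = 2 * d                             # LayerNorm after the last block
    return token_emb + pos_emb + n_layer * per_layer + final_ln


# Values taken from the table above. The head count does not change the total.
total = gpt2_param_count(vocab_size=4000, n_positions=128, n_embd=256, n_layer=4)
print(f"{total:,} parameters (~{total / 1e6:.2f} M)")
```

Under these assumptions the count comes to 4,216,320, with roughly a quarter of the parameters sitting in the token embedding table.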
Training data
roneneldan/TinyStories — short children's stories generated by GPT-3.5/GPT-4, designed specifically so small language models can learn coherent English. A 5,000-story subset was used for this v0 run.
Training setup
| Setting | Value |
| --- | --- |
| Hardware | Google Colab Free Tier (NVIDIA Tesla T4, 15.6 GB VRAM) |
| Precision | fp16 |
| Optimizer | AdamW (Hugging Face Trainer default) |
| Learning rate | 3e-4, cosine schedule, 30 warmup steps |
| Weight decay | 0.01 |
| Batch size | 32 |
| Steps | 300 (capped via max_steps) |
| Wall-clock time | ~14 seconds |
| Final train loss | 5.08 (down from an initial ~9; vocab size 4,000) |
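To make the learning-rate row concrete, here is a minimal sketch of a linear-warmup, cosine-decay schedule using the numbers from the table. It assumes the rate climbs linearly from 0 to the peak over the 30 warmup steps and then follows half a cosine down to 0 at step 300, which is how a "cosine" schedule is commonly implemented; `lr_at` is a hypothetical helper written for illustration.

```python
import math


def lr_at(step, peak_lr=3e-4, warmup_steps=30, total_steps=300):
    """Learning rate at a given optimizer step (linear warmup, then cosine decay to 0)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for step in (0, 15, 30, 165, 300):
    print(f"step {step:3d}: lr = {lr_at(step):.2e}")
```

With only 300 steps in total, a tenth of the run is spent warming up and the rate is already back to half its peak by step 165.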
Intended use & limitations
This is a base language model, not an instruction-tuned chat model. Given a short English prompt, it will continue the text in TinyStories style (children's stories with characters like "Lily", "Tom", "Ben", simple plots).
It will NOT:
- follow instructions
- answer questions reliably
- produce text outside the children's-story domain
- maintain long-range coherence (context window is only 128 tokens)
This is v0: explicitly an early, undercooked checkpoint. Output is occasionally repetitive and loses the thread across sentences. That's expected at this scale.
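For a rough sense of what "undercooked" means numerically, the final train loss from the training setup table can be compared with the loss of a model that guesses uniformly over the vocabulary. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats (natural log), which is the usual convention but is not stated in this card.

```python
import math

vocab_size = 4000        # from the model details table
final_train_loss = 5.08  # from the training setup table, assumed to be nats per token

# Cross-entropy of a model that assigns equal probability to every token.
uniform_loss = math.log(vocab_size)

# Perplexity: the effective number of equally likely next-token choices.
perplexity = math.exp(final_train_loss)

print(f"uniform-guess loss: {uniform_loss:.2f}")
print(f"perplexity at loss {final_train_loss}: {perplexity:.1f}")
```

Under that assumption, uniform guessing corresponds to a loss of about 8.29, and a loss of 5.08 corresponds to a perplexity of roughly 160. The model has narrowed its choices considerably compared with guessing, but it is still far from confident next-token prediction.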
Quick start
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the checkpoint and its custom BPE tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("laskar-ks/alcyone-v0")
tokenizer = AutoTokenizer.from_pretrained("laskar-ks/alcyone-v0")
gen = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Sample a continuation; keep it short because the context window is 128 tokens
print(gen("Once upon a time, there was a little",
          max_new_tokens=80,
          do_sample=True,
          temperature=0.8)[0]["generated_text"])
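The `temperature=0.8` argument in the call above divides the next-token logits by 0.8 before the softmax, which makes the sampling distribution slightly sharper than at 1.0. A small, library-free sketch of that effect follows; the logits are made up for illustration and `softmax_with_temperature` is a hypothetical helper, not part of Transformers.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)                       # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three candidate tokens

probs_default = softmax_with_temperature(logits, temperature=1.0)
probs_cooler = softmax_with_temperature(logits, temperature=0.8)

print("T=1.0:", [round(p, 3) for p in probs_default])
print("T=0.8:", [round(p, 3) for p in probs_cooler])
```

Lower temperatures push more probability onto the top-scoring token, which tends to reduce random word choices at the cost of more repetition.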
Roadmap
- v0: this checkpoint. Proof that the full pretraining loop works end-to-end.
- v1: same architecture, longer training (≥5,000 steps on the full TinyStories train split). Expected: noticeably more coherent stories.
- v0-id: Bahasa Indonesia variant. Custom BPE tokenizer on an Indonesian corpus, same architecture.
About the name
Alcyone (η Tauri) is the brightest star in the Pleiades open star cluster. Part of a star-named model family alongside other projects (Parallax, Altair, Pleiades agents, etc.).
Author
Trained by Laskar as part of an AI engineering portfolio exploring agentic systems, multi-agent architectures, and foundational ML.