Serve an LLM (vLLM) 🚧 not trained yet

Deploy a (fine-tuned) LLM for fast, batched, OpenAI-compatible serving.

Status — documented recipe (placeholder). A production-grade pipeline from Ropedia Academy for an advanced, GPU-heavy task. Everything below — base model, objective, dataset, config, the exact evaluation — is specified; the weights / metrics / figures land here automatically when you run the notebook on a GPU (one click below). Try the trained models live in the Ropedia demos Space.

At a glance


Base model	Any (fine-tuned) LLM you serve
Task	fast LLM serving / deployment
Training objective	High-throughput batched inference (PagedAttention) — no training.
Track	LM · Language & multimodal
Built on	vllm-project/vllm
Notebook
Compute / storage / time	GPU required — see the Compute · storage · time table in the notebook

Dataset

Source: n/a (serving).

Training config

GPU-scale — the notebook ships a demo profile (free Colab T4) and a full profile, with an exact Compute · storage · time table. Hyperparameters (optimizer, steps, batch, LoRA rank, …) are in the training cell.

Evaluation results

⏳ Pending — run the notebook on a GPU to fill this in. This lab reports throughput (tok/s) · latency on a held-out split (see its Evaluate cell).

Inference example

No weights are published yet. After a GPU run, load the checkpoint/adapter the notebook saves (it also has a ready inference cell). Base model: Any (fine-tuned) LLM you serve.

How to fill this repo

Open the notebook in Colab → Runtime → GPU → Run all (runs the real pipeline).
Run its Publish to the Hugging Face Hub step (or HfApi().upload_folder(...)) — the checkpoint + metrics.json + figures replace this placeholder.

Train / run on a GPU · [ ] upload weights · [ ] add metrics.json · [ ] add figures · [ ] swap in the real results card

Limitations

Not yet trained — no numbers to report. The pipeline is GPU-heavy (see the compute table); on free Colab use the demo-scale settings. This is an educational, reproducible recipe, not a tuned production release.

License

Code: MIT (this repository). The base model (vllm-project/vllm) and dataset are each under their own licenses — check the upstream source before redistribution.

Citation

@misc{ropedia_academy,
  title  = {Ropedia Academy: an interactive course on embodied & spatial AI},
  author = {Ropedia Academy},
  year   = {2026},
  howpublished = {\url{https://chaoyue0307.github.io/ropedia-academy/}}
}

Method / original work: Kwon et al., vLLM / PagedAttention, SOSP 2023.

Related assets

🚀 Live demos: https://huggingface.co/spaces/cy0307/ropedia-demos
🤗 All models + collection: https://huggingface.co/cy0307
📚 Course & all labs: https://chaoyue0307.github.io/ropedia-academy/ · Labs tab
💻 Source / notebooks: github.com/ChaoYue0307/ropedia-academy
🔗 Relates to tracks: A · B · C · D

Documented placeholder in the Ropedia Academy collection — train it on a GPU to publish the real model. Contributions welcome on GitHub.

Downloads last month: -; Downloads are not tracked for this model. How to track

Collection including cy0307/lm-vllm-serving

Ropedia Academy — trained models

Collection

45 models · embodied & spatial AI: human motion, 3D rendering, egocentric vision, world models, LLMs & agents. Trained in Ropedia Academy. • 45 items • Updated about 14 hours ago • 1