Model Card for antoninbrthn/skill-neologisms-qwen2.5-digitseq-base
This is a saved checkpoint from fine-tuning Qwen/Qwen2.5-0.5B with LoRA on the digit-sequence transformation task from the paper "Skill Neologisms: Towards Skill-based Continual Learning" (ICML 2026).
Model Details
Model Description
This checkpoint was fine-tuned on a synthetic digit-sequence transformation task (see paper for full details).
- Finetuned from model: Qwen/Qwen2.5-0.5B
Model Sources
- Repository: Official code for the paper "Skill Neologisms: Towards Skill-based Continual Learning"
- Paper: Skill Neologisms: Towards Skill-based Continual Learning
Uses
Please refer to the official repo for example use.
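Until you have the official repo at hand, a LoRA checkpoint like this one can usually be attached to its base model with PEFT. The following is a minimal sketch, not the authors' usage code: the helper name `load_adapter_model` is hypothetical, and loading the tokenizer from the base model is an assumption (if the adapter repository ships its own tokenizer, for example with added operation tokens, load it from there instead). The prompt format for the digit-sequence task is not described on this card, so none is shown.

```python
# Repository IDs as given on this model card.
BASE_MODEL_ID = "Qwen/Qwen2.5-0.5B"
ADAPTER_ID = "antoninbrthn/skill-neologisms-qwen2.5-digitseq-base"


def load_adapter_model(base_model_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (hypothetical helper).

    Imports are done lazily so the module can be imported without
    transformers/peft installed; calling the function requires both
    packages and network access to the Hugging Face Hub.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the base tokenizer is sufficient for this adapter.
    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)

    # Wrap the base model with the LoRA weights stored in the adapter repo.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `model, tokenizer = load_adapter_model()` downloads both repositories on first use. Input formatting and evaluation should follow the official repo.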
Training Details
The LoRA adapter is trained over two phases with the configuration below (please refer to the paper for more details).
| Parameter | Phase 1 | Phase 2 |
|---|---|---|
| Base Model | Qwen/Qwen2.5-0.5B | |
| PEFT Method | LoRA (r=32, α=32) | |
| Target Modules | q, k, v, o, gate, up, down | |
| Training Samples | 100,000 | 500,000 |
| Test Samples | 500 | 500 |
| Operations per Sample | 1 | 1–3 |
| Epochs | 3 | 3 |
| Batch Size | 64 | 64 |
| Learning Rate | 2e-4 | 2e-4 |
| Warmup Steps | 500 | 500 |
| Operations | [ASC], [DESC], [ADD], [SUB], [POL], [REV], [ID] | |
| Sequence Lengths | 2, 3, 4, 6, 8 (held-out: 5, 7, 9) | |
| Held-out 3-op combinations | -- | 25% |
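To make the table concrete, the sketch below restates the LoRA settings as keyword arguments for PEFT's `LoraConfig` and works out the number of optimizer steps per phase. It rests on several assumptions that the card does not confirm: the short module names map to Qwen2.5's `*_proj` layers, the batch size of 64 is the effective batch size with no gradient accumulation, and the last partial batch of each epoch is kept. The names `LORA_KWARGS`, `PHASES`, `optimizer_steps`, and `build_lora_config` are illustrative only.

```python
import math

# LoRA settings from the table. Assumption: "q, k, v, o, gate, up, down"
# refer to the *_proj linear layers of the Qwen2.5 architecture.
LORA_KWARGS = {
    "r": 32,
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",
}

# Per-phase values copied from the table.
PHASES = {
    "phase1": {"train_samples": 100_000, "epochs": 3, "batch_size": 64},
    "phase2": {"train_samples": 500_000, "epochs": 3, "batch_size": 64},
}


def optimizer_steps(num_samples, batch_size, epochs):
    """Total optimizer steps, assuming the last partial batch is kept."""
    return math.ceil(num_samples / batch_size) * epochs


def build_lora_config():
    """Build a peft.LoraConfig from the table (requires peft installed)."""
    from peft import LoraConfig
    return LoraConfig(**LORA_KWARGS)


for name, cfg in PHASES.items():
    steps = optimizer_steps(cfg["train_samples"], cfg["batch_size"], cfg["epochs"])
    print(f"{name}: {steps} optimizer steps")
```

Under these assumptions Phase 1 runs for 4,689 steps and Phase 2 for 23,439, so the 500 warmup steps cover roughly 11% of Phase 1 and about 2% of Phase 2.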
Technical Specifications
Hardware
This model was trained on an NVIDIA RTX 6000 GPU (48 GB VRAM).
Citation
BibTeX:
@article{berthon2026skill,
title={Skill Neologisms: Towards Skill-based Continual Learning},
author={Berthon, Antonin and Astorga, Nicolas and van der Schaar, Mihaela},
journal={arXiv preprint arXiv:2605.04970},
year={2026}
}
Model Card Contact
Antonin Berthon (berthon/dot\antonin/at\gmail/dot\com)
Framework versions
- transformers 5.0.0rc1
- peft 0.17.1
- trl 0.26.1
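Because the listed transformers version is a release candidate, it can be useful to compare a local environment against these versions before loading the checkpoint. The snippet below is a small sketch using only the standard library; the helper name `version_report` is hypothetical, and a version mismatch does not necessarily mean the adapter will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed on this model card.
EXPECTED = {"transformers": "5.0.0rc1", "peft": "0.17.1", "trl": "0.26.1"}


def version_report(expected=EXPECTED):
    """Map each package to (expected, installed or None, exact match?)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, ok) in version_report().items():
    status = "ok" if ok else "differs"
    print(f"{package}: expected {wanted}, installed {installed} ({status})")
```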