How to use North-ML1/aurora-one with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="North-ML1/aurora-one")
```

```python
# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("North-ML1/aurora-one", dtype="auto")
```
How to use North-ML1/aurora-one with vLLM:

Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "North-ML1/aurora-one"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "North-ML1/aurora-one",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker:

```shell
docker model run hf.co/North-ML1/aurora-one
```
How to use North-ML1/aurora-one with SGLang:

Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "North-ML1/aurora-one" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "North-ML1/aurora-one",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```

Use Docker images:

```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "North-ML1/aurora-one" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "North-ML1/aurora-one",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
How to use North-ML1/aurora-one with Docker Model Runner:

```shell
docker model run hf.co/North-ML1/aurora-one
```
Aurora One
Aurora One is a compact experimental chat model designed for general conversation, lightweight assistance, and research.
Capabilities
Aurora One is designed to:
- hold basic conversations
- answer straightforward questions
- explain simple concepts
- help with brainstorming and writing
- follow short instructions
- generate short stories and responses
- assist with basic coding and technical questions
Aurora One is still experimental. It may make factual mistakes, misunderstand complex instructions, repeat itself, or produce incorrect reasoning.
Intended Uses
Aurora One is intended for:
- general-purpose chat
- lightweight personal assistants
- educational experiments
- writing and brainstorming
- simple question answering
- basic coding assistance
- chatbot prototypes
- small-model research
- local and low-resource inference
It should not be relied on for medical, legal, financial, or other high-stakes decisions.
Model Parameters
| Property | Value |
|---|---|
| Parameters | 119,953,152 |
| Architecture | Llama-style causal transformer |
| Transformer layers | 14 |
| Hidden size | 768 |
| Attention heads | 12 |
| Key-value heads | 12 |
| MLP intermediate size | 2,304 |
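
For local and low-resource inference, the values in the table give a rough estimate of key-value cache memory. The sketch below assumes the head dimension equals hidden size divided by attention heads (768 / 12 = 64) and that cached values are stored in 2 bytes each (fp16 or bf16); neither is stated in this card, and `kv_cache_bytes` is a hypothetical helper, not part of any library.

```python
def kv_cache_bytes(
    tokens=2048,          # number of cached positions
    layers=14,            # transformer layers (from the table)
    kv_heads=12,          # key-value heads (from the table)
    hidden_size=768,      # hidden size (from the table)
    attention_heads=12,   # attention heads (from the table)
    bytes_per_value=2,    # ASSUMPTION: fp16/bf16 cache
):
    """Estimate KV-cache size in bytes for one sequence."""
    # ASSUMPTION: head_dim = hidden_size / attention_heads.
    head_dim = hidden_size // attention_heads
    # One key vector and one value vector per layer, per KV head, per token.
    per_token = 2 * layers * kv_heads * head_dim * bytes_per_value
    return per_token * tokens


if __name__ == "__main__":
    size = kv_cache_bytes()
    print(f"{size} bytes = {size / 2**20:.1f} MiB")  # 88080384 bytes = 84.0 MiB
```

Because the number of key-value heads equals the number of attention heads, the model uses standard multi-head attention rather than grouped-query attention, so the cache is not reduced by head sharing.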
Vocabulary
Aurora One uses a custom tokenizer with a vocabulary of 16,384 tokens.
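
The vocabulary size and the architecture table together are enough to reproduce the stated parameter count. The following is a minimal sketch under assumptions that this card does not confirm: bias-free linear layers, a gated MLP with three projections, one RMSNorm weight vector per norm, and input and output embeddings that share weights. `llama_style_param_count` is a hypothetical helper written for illustration.

```python
def llama_style_param_count(
    vocab_size=16_384,
    hidden_size=768,
    layers=14,
    intermediate_size=2_304,
    tie_embeddings=True,  # ASSUMPTION: output head shares the embedding matrix
):
    """Count parameters of a Llama-style decoder under the stated assumptions."""
    embeddings = vocab_size * hidden_size
    # Q, K, V, and O projections; K and V are full size because
    # key-value heads == attention heads in this model.
    attention = 4 * hidden_size * hidden_size
    # Gate, up, and down projections of the gated MLP (ASSUMPTION: no biases).
    mlp = 3 * hidden_size * intermediate_size
    # Two norm weight vectors per layer (pre-attention and pre-MLP).
    norms = 2 * hidden_size
    total = embeddings + layers * (attention + mlp + norms)
    total += hidden_size  # final norm
    if not tie_embeddings:
        total += vocab_size * hidden_size  # separate output head
    return total


if __name__ == "__main__":
    print(f"{llama_style_param_count():,}")  # 119,953,152
```

Under these assumptions the result equals the 119,953,152 parameters listed in the table, which is consistent with tied embeddings; an untied output head would add another 12,582,912 parameters.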
Context Length
Aurora One supports a maximum context length of 2,048 tokens.
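
Since the prompt and the generated tokens share the 2,048-token window, a long prompt has to be trimmed before generation. The sketch below shows one possible policy, keeping the most recent tokens; `fit_prompt` is a hypothetical helper that operates on a plain list of token IDs and is not part of the model's tooling.

```python
CONTEXT_LENGTH = 2048  # maximum context length stated in this card


def fit_prompt(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Trim a prompt so prompt + generated tokens fit in the context window.

    Keeps the most recent tokens, which is one reasonable choice for chat;
    other policies (for example, preserving a system prompt) are possible.
    """
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens  # tokens left for the prompt
    return token_ids[-budget:]


if __name__ == "__main__":
    long_prompt = list(range(3000))            # stand-in for real token IDs
    trimmed = fit_prompt(long_prompt, max_new_tokens=512)
    print(len(trimmed))                        # 1536
```

With `max_new_tokens=512`, as in the server examples on this page, the prompt budget is 1,536 tokens.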
