Instructions for using ruslanmv/Matrix-BIOS-Italo-0.1 with libraries and local serving apps.

Libraries

How to use ruslanmv/Matrix-BIOS-Italo-0.1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ruslanmv/Matrix-BIOS-Italo-0.1", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("ruslanmv/Matrix-BIOS-Italo-0.1", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use ruslanmv/Matrix-BIOS-Italo-0.1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ruslanmv/Matrix-BIOS-Italo-0.1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ruslanmv/Matrix-BIOS-Italo-0.1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/ruslanmv/Matrix-BIOS-Italo-0.1

SGLang

How to use ruslanmv/Matrix-BIOS-Italo-0.1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ruslanmv/Matrix-BIOS-Italo-0.1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ruslanmv/Matrix-BIOS-Italo-0.1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ruslanmv/Matrix-BIOS-Italo-0.1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ruslanmv/Matrix-BIOS-Italo-0.1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
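
Both servers above expose the same OpenAI-compatible `/v1/completions` route, so the curl calls can be mirrored from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the defaults simply copy the values from the curl examples.

```python
import json
import urllib.request

MODEL_ID = "ruslanmv/Matrix-BIOS-Italo-0.1"


def build_completion_request(base_url, prompt, max_tokens=512, temperature=0.5):
    """Build (without sending) a POST request for the /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, prompt, timeout=60, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(base_url, prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

With a server running, `complete("http://localhost:8000", "C'era una volta")` would target vLLM and `complete("http://localhost:30000", "C'era una volta")` would target SGLang, matching the ports used above.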

Docker Model Runner
How to use ruslanmv/Matrix-BIOS-Italo-0.1 with Docker Model Runner:
```
docker model run hf.co/ruslanmv/Matrix-BIOS-Italo-0.1
```

MATRIX BIOS · Italo

The governed cognitive substrate for enterprise AI.

Matrix-BIOS-Italo-0.1

Developer: Agent-Matrix · Version: 0.1 · Language: Italian · License: Apache-2.0

Italo is the Italian language component of the Matrix BIOS family — a line of compact, governed, on-premise-ready cognitive models. It is built for organisations that need an Italian-native generator they can run inside their own perimeter: no data egress, predictable cost, full control.

Model overview

Architecture: compact causal Transformer (custom code; loading requires trust_remote_code=True).
Optimised for: low-latency, CPU/edge-friendly, on-premise and air-gapped deployment.
Position in the stack: the language organ of the Matrix BIOS cognitive substrate, orchestrated and governed by Matrix OS.

Intended use

Primary use cases

Italian text generation and drafting assistance within a controlled application.
A sovereign, self-hosted building block for enterprise AI workflows.
Research and integration into the Agent-Matrix ecosystem.

Out of scope

Unsupervised, high-stakes, or safety-critical text generation.
Use as a general-purpose assistant or a source of factual ground truth.

How to use

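from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ruslanmv/Matrix-BIOS-Italo-0.1"
# The architecture is custom, so both loaders need trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)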

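Once the model loads, generating Italian text might look like the sketch below. It assumes the repository works with the standard `text-generation` pipeline, as the Transformers instructions earlier suggest. The helper names and the sampling defaults (`max_new_tokens=64`, `temperature=0.7`) are illustrative choices, not values recommended by the authors.

```python
MODEL_ID = "ruslanmv/Matrix-BIOS-Italo-0.1"


def sampling_options(max_new_tokens=64, temperature=0.7):
    """Collect generation keyword arguments; the defaults are illustrative only."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
    }


def generate_italian(prompt, **options):
    """Run the prompt through a text-generation pipeline and return the text."""
    # Imported lazily so sampling_options() works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID, trust_remote_code=True)
    outputs = pipe(prompt, **sampling_options(**options))
    return outputs[0]["generated_text"]
```

A call such as `generate_italian("La città di Roma è", max_new_tokens=40)` downloads the weights on first use. Because this sketch rebuilds the pipeline on every call, an application that generates repeatedly would more sensibly create the pipeline once and reuse it.
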
Limitations & responsible use

This is a v0.1 early-access release: a compact model for integration and evaluation, not a turnkey production assistant. Generated text should be reviewed before downstream use and must not be relied upon for legal, medical, financial, or other consequential decisions without qualified human oversight.

Governance

Italo is designed to run under Matrix OS governance: actions that consume its output are gated by policy, budgeted, and auditable, with human authority retained over high-risk operations.
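
To make the three properties concrete (policy gating, budgeting, auditability), here is a purely hypothetical sketch of a gate that an application could place between Italo's output and any action that consumes it. It is not the Matrix OS API: the class `GovernanceGate`, its fields, and the example action names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GovernanceGate:
    """Hypothetical gate, for illustration only; not the Matrix OS API."""

    budget: int  # number of gated actions that may still run
    high_risk: frozenset = frozenset({"send_email", "execute_payment"})
    audit_log: list = field(default_factory=list)

    def authorize(self, action, human_approved=False):
        """Return True if the action may proceed; record every decision."""
        if self.budget <= 0:
            allowed, reason = False, "budget exhausted"
        elif action in self.high_risk and not human_approved:
            allowed, reason = False, "human approval required"
        else:
            allowed, reason = True, "allowed by policy"
            self.budget -= 1  # only actions that run consume budget
        self.audit_log.append(
            {"action": action, "allowed": allowed, "reason": reason}
        )
        return allowed
```

In this sketch a denied action does not consume budget, and every decision, allowed or not, is appended to `audit_log` so it can be reviewed later. A real deployment would persist the log and load the policy from configuration rather than hard-coding it.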

Citing this work

Matrix BIOS models implement the governed-memory architecture described in our paper. If you use them in research or production, please cite:

Magaña Vsevolodovna, R. I. (2026). Governed Memory: A Bio-Inspired, Governance-First Memory Architecture for Continual AI Systems (1.0). Zenodo. https://doi.org/10.5281/zenodo.20615572

@misc{magana2026governedmemory,
  title     = {Governed Memory: A Bio-Inspired, Governance-First Memory
               Architecture for Continual AI Systems},
  author    = {Maga{\~n}a Vsevolodovna, Ruslan Idelfonso},
  year      = {2026},
  publisher = {Zenodo},
  version   = {1.0},
  doi       = {10.5281/zenodo.20615572},
  url       = {https://doi.org/10.5281/zenodo.20615572}
}

The concept DOI 10.5281/zenodo.20615571 always resolves to the latest version.

License & contact

Safetensors · Model size: 41.5M params · Tensor type: F32
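
As a rough check on deployment footprint, the memory needed for the weights alone follows from the parameter count and tensor type listed above. The sketch below assumes the rounded figure of 41.5M parameters and 4 bytes per F32 value. It ignores activations, any key-value cache, and runtime overhead, so actual usage will be higher.

```python
PARAMS = 41.5e6        # rounded parameter count from the model page
BYTES_PER_PARAM = 4    # an F32 value occupies 4 bytes


def weight_bytes(params=PARAMS, bytes_per_param=BYTES_PER_PARAM):
    """Estimate the size of the weights alone, in bytes."""
    return int(params * bytes_per_param)


total = weight_bytes()
print(f"{total / 1e6:.0f} MB ({total / 2**20:.1f} MiB)")
# prints "166 MB (158.3 MiB)"
```

Calling `weight_bytes(bytes_per_param=2)` gives the corresponding estimate for a 16-bit copy of the weights, should one be produced.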