# VectraYX-Nano
A 42M-parameter Spanish cybersecurity language model trained from scratch with curriculum learning and native MCP tool use.
## Key Results (VectraYX-Bench)
| Model | Params | B1 KW | B2 F1 | B3 TM | B4 Tool | B5 |
|---|---|---|---|---|---|---|
| VectraYX-Nano v2 (N=4 seeds) | 42M | 0.228 ± 0.079 | 0.196 ± 0.005 | 0.029 ± 0.040 | 0.000 | 0.775 ± 0.050 |
| Nano + LoRA mini (N=4 seeds) | 42M | 0.011 ± 0.004 | 0.201 ± 0.002 | 0.021 ± 0.012 | 0.145 ± 0.046 | 0.575 ± 0.043 |
| VectraYX-Base 260M | 260M | 0.325 | 0.220 | 0.114 | 0.000 | 0.800 |
| Base + LoRA mini | 260M | 0.025 | 0.200 | 0.000 | 0.580 | 0.600 |
| VectraYX-Pro 3B | 3.2B | 0.341 | 0.695 | 0.686 | 0.600 | 0.800 |
| VectraYX-Pro 7B | 7B | 0.335 | 0.815 | 0.686 | 0.880 | 0.800 |
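The ± values in the multi-seed rows read as mean and spread over the N=4 seed runs. A minimal sketch of that aggregation, assuming sample standard deviation (the card does not say whether ± is std or standard error, and the per-seed scores below are illustrative, not actual run results):

```python
import statistics

def aggregate_seeds(scores):
    """Aggregate per-seed benchmark scores as (mean, sample std)."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample std, N-1 denominator (assumption)
    return mean, std

# Hypothetical per-seed B1 keyword scores for 4 seeds
b1_scores = [0.30, 0.15, 0.25, 0.21]
mean, std = aggregate_seeds(b1_scores)
print(f"B1 KW: {mean:.3f} ± {std:.3f}")
```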
## Key Finding
The B4=0.000 floor in mixed SFT is a corpus-density artifact, not a capacity gate. At ratio 1:21 (2,801 tool-use examples), Nano 42M achieves B4=0.145 ± 0.046 and Base 260M achieves B4=0.580.
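As a quick sanity check on the corpus-density framing (the interpretation below is an assumption; the card only gives the 1:21 ratio and the 2,801-example count):

```python
# Rough corpus-density arithmetic, assuming "ratio 1:21" means one
# tool-use example per 21 non-tool SFT examples (assumption).
tool_examples = 2801
general_per_tool = 21
general_examples = tool_examples * general_per_tool
tool_fraction = tool_examples / (tool_examples + general_examples)  # 1/22
print(f"{general_examples} general examples, tool density = {tool_fraction:.1%}")
```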
## Usage

```python
# Load with custom inference script
# See: https://huggingface.co/vectrayx/vectrayx-paper-code
from huggingface_hub import hf_hub_download
import torch

# Download checkpoint, tokenizer, and config
ckpt_path = hf_hub_download("vectrayx/vectrayx-nano", "nano_sft_v5.pt")
tokenizer_path = hf_hub_download("vectrayx/vectrayx-nano", "tokenizer/vectrayx_bpe.model")
config_path = hf_hub_download("vectrayx/vectrayx-nano", "configs/nano.json")
```
## Citation

```bibtex
@inproceedings{santillana2026vectrayx,
  title     = {VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model
               with Curriculum Learning and Native Tool Use},
  author    = {Santillana, Juan S.},
  booktitle = {Preprint},
  year      = {2026}
}
```