Instructions to use fabsssss/qwen3-coder-30b-a3b-ies4 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use fabsssss/qwen3-coder-30b-a3b-ies4 with MLX:
# Download the model from the Hub pip install huggingface_hub[hf_xet] huggingface-cli download --local-dir qwen3-coder-30b-a3b-ies4 fabsssss/qwen3-coder-30b-a3b-ies4
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
Qwen3-Coder-30B-A3B — IES4 Turtle Generation (research prototype)
A LoRA fine-tune of mlx-community/Qwen3-Coder-30B-A3B-Instruct-8bit that generates IES4 (UK Government Information Exchange Standard) RDF/Turtle from natural-language scenarios, and follows a supplied target ontology for general knowledge-graph extraction. To our knowledge this is the first openly published LLM fine-tune targeting IES4 (checked against the Hugging Face API, GitHub and arXiv at release time). It is a research prototype: validate all output before production use.
IES4 is a 4D ontology specified as an RDF Schema, developed by UK Government (Dstl, MOD, Home Office, Metropolitan Police, HMRC, DBT) with technical support from Telicent and Aurora Consulting. Repo: dstl/IES4.
Why fine-tune at all?
The untuned base model cannot produce real IES4: 93.7% of the ies: terms it emits do not exist in the ontology (0% term conformance). After LoRA:
| Metric (held-out, in-distribution) | Base model | This model |
|---|---|---|
| Syntactic validity | 93.2% | 95.5% |
| IES4 term conformance | 0.0% | 88.6% |
| Hallucinated-term rate | 0.937 | 0.010 |
| Structural conformance (domain/range) | 0.932* | 0.955 |
| Namespace fidelity (when instructed) | — | 100% |
| Out-of-distribution (real dstl sample-data scenarios) | Base | This model |
|---|---|---|
| Syntactic validity | 90.0% | 70.0% |
| IES4 term conformance | 0.0% | 30.0% |
| Structural conformance | 0.900* | 0.640 |
| Ontology-conditioned extraction (Text2KGBench slice) | Base | This model |
|---|---|---|
| Syntactic validity | 50.0% | 91.7% |
| Relation conformance | 75.0% | 91.7% |
| IES-vocabulary bleed | 0% | 0% |
* Baseline structural numbers are inflated: with mostly hallucinated vocabulary there are few checkable property usages. Metrics follow the spirit of Text2KGBench: validity, conformance, hallucination. Eval code ships with the dataset repo; the OOD row is deliberately reported although it is the model's weakest surface.
Training data (correct-by-construction)
- 1,589 IES pairs: graphs built programmatically with telicent-ies-tool across 14 scenario patterns (employment, birth/death, events, identifiers, ownership, posts, location-states, access, possession, communication, composites), human-plausible instance IRIs, 35% namespace-varied with explicit namespace instructions. Every graph passed BOTH the telicent validation AND an independent term-membership validator built from the published ontology (510 classes, 204 properties). Descriptions are deterministic plus fact-checked local-LLM paraphrases (paraphrases dropping any name or year were discarded).
- 210 vocabulary/boundary pairs: class and property definitions verbatim from the ontology, plus refusal examples teaching what IES4 cannot express (opinions, speculation, causal claims).
- 448 ontology-conditioned extraction pairs from Text2KGBench (Wikidata-TekGen), predicates restricted to each domain ontology.
Split by target graph (no paraphrase leakage); OOD test set = descriptions of the real dstl sample-data files, never trained on.
Method
LoRA (16 layers) via mlx-lm 0.31.3 on Apple Silicon (M3 Max), QLoRA on the 8-bit MoE
base, 1,000 iterations, batch 2, seq 2048, final val loss 0.15. The repo contains the
fused 8-bit MLX model; the raw adapter is in adapters/ for applying to the bf16
base with other toolchains.
Usage (MLX)
pip install mlx-lm
python -m mlx_lm generate --model fabsssss/qwen3-coder-30b-a3b-ies4 --max-tokens 600 --prompt \
"Encode the following scenario as IES4 RDF/Turtle. Use only real IES4 terms and the
4D state/period pattern where relevant. Output only Turtle.
Scenario: Priya Patel has worked for Meridian Bank since 2019-03-01 and attended a
security briefing at Heathrow Terminal 4 on 2024-05-02 from 09:00 to 11:00."
Limitations
- Out-of-distribution performance (rich, idiomatic IES exchanges) is markedly lower than in-distribution; treat complex outputs as drafts for expert review.
- Coverage: measures, representation/document patterns and intelligence-assessment structures are under-represented.
- MLX 8-bit format; GGUF conversion not yet provided. Use the adapter on the bf16 base if you need other runtimes.
- Always validate output (e.g. with telicent-ies-tool or the shipped validator) before exchange.
Training data licensing & attribution
- IES4 ontology: MIT, © Crown copyright, Defence Science and Technology Laboratory (Dstl). This model card retains that notice.
- telicent-ies-tool: used only to generate training graphs (library not redistributed).
- ~448 training pairs derive from Text2KGBench (data licence CC BY-SA 4.0; sources Wikidata-TekGen / DBpedia-WebNLG). Attribution: "Data derived from Text2KGBench (Mihindukulasooriya, Tiwari, Enguix, Lata; ISWC 2023), licensed CC BY-SA 4.0." The published dataset repo marks that slice separately under CC BY-SA 4.0.
- Model weights: MIT. Weight releases are not, on current consensus, derivative works of training data; attribution obligations above are honoured regardless.
Provenance
Built and adversarially red-teamed (dataset design, eval integrity, licensing, tooling) before release; the eval harness and dataset are published for reproduction. By The Tesseract Academy.
- Downloads last month
- -
8-bit
Model tree for fabsssss/qwen3-coder-30b-a3b-ies4
Base model
Qwen/Qwen3-Coder-30B-A3B-Instruct