Paterikon-SFT-v2

Three-pass iterative SFT — the full instruction-tuned Patristic AI.


Overview

Paterikon-SFT-v2 is the culmination of the Orthodox Patristic AI fine-tuning pipeline. Built on Paterikon-3B, it underwent three iterative rounds of supervised fine-tuning with active learning: the model generates responses, a judge evaluates them, and the best pairs are fed back into the next training round.

This active-loop approach produced 3,287 high-quality Q&A pairs across three iterations, progressively refining the model's theological accuracy, pastoral tone, and resistance to confabulation.

Base model: jayfurzy/paterikon-3b (Qwen2.5-3B CPT)
Training: 3-pass iterative SFT with active learning (TRL)
Parameters: 3.09 billion
Architecture: Qwen2ForCausalLM, 36 layers, 2048 hidden
Languages: English (primary), Russian, Greek
Training pairs: 3,287 (1,156 iter-1 + 1,089 iter-2 + 1,042 iter-3)
Training probes: 75M tokens across 3 iterations (25M per iter)
Frameworks: TRL 0.28.0, Transformers 4.57.6, PyTorch 2.9.1
License: Apache 2.0

Quick Start

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "jayfurzy/paterikon-sft-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are an Orthodox Christian theologian, answering in the spirit of the Holy Fathers."},
    {"role": "user", "content": "Explain the Orthodox understanding of theosis."},
]
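# Render the conversation with the model's chat template and append the assistant prompt
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens so the prompt is not echoed back
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))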
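The messages list in the Quick Start can be wrapped in a small helper. The sketch below is illustrative only: `build_messages` and `to_sft_record` are hypothetical names, and reusing the Quick Start system prompt for training records is an assumption, not something this card states. The record layout (a `messages` list of role/content dicts) is the conversational dataset format that TRL's SFTTrainer accepts.

```python
import json

# System prompt copied from the Quick Start; using it for training data is an assumption.
SYSTEM_PROMPT = (
    "You are an Orthodox Christian theologian, "
    "answering in the spirit of the Holy Fathers."
)


def build_messages(question: str, system_prompt: str = SYSTEM_PROMPT) -> list:
    """Return the system + user turns for a single question."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def to_sft_record(question: str, answer: str) -> dict:
    """Append the assistant turn, giving one conversational training record."""
    messages = build_messages(question)
    messages.append({"role": "assistant", "content": answer})
    return {"messages": messages}


# One JSON object per line, as in a .jsonl file (the answer text is a placeholder).
record = to_sft_record("Explain the Orthodox understanding of theosis.", "<model answer>")
print(json.dumps(record, ensure_ascii=False))
```

At inference time, `build_messages(question)` can be passed straight to `tokenizer.apply_chat_template`.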

Training Methodology

Active Learning Loop

Each iteration follows the same cycle:

  1. Probe: Generate responses to a diverse set of Orthodox theological questions (25M tokens)
  2. Judge: Score responses on theological accuracy, pastoral tone, and attribution fidelity
  3. Select: Filter for high-scoring Q&A pairs
  4. Train: SFT fine-tune on the selected pairs — producing the next iteration's model

This is repeated 3 times, with each iteration building on the improved model from the previous round.
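The cycle above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the `generate`, `judge`, and `train` callables are hypothetical placeholders supplied by the caller, the judge is assumed to return a single score in [0, 1], and the `min_score` cutoff of 0.8 is invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ScoredPair:
    question: str
    answer: str
    score: float  # judge score, assumed to lie in [0, 1]


def run_active_loop(
    model,
    questions: List[str],
    generate: Callable[[object, str], str],
    judge: Callable[[str, str], float],
    train: Callable[[object, List[ScoredPair]], object],
    iterations: int = 3,
    min_score: float = 0.8,  # hypothetical cutoff, not taken from the model card
) -> Tuple[object, List[List[ScoredPair]]]:
    """Run probe -> judge -> select -> train for `iterations` rounds."""
    history = []
    for _ in range(iterations):
        # 1. Probe: the current model answers every question.
        answers = [generate(model, q) for q in questions]
        # 2. Judge: score each question/answer pair.
        scored = [ScoredPair(q, a, judge(q, a)) for q, a in zip(questions, answers)]
        # 3. Select: keep only the high-scoring pairs.
        selected = [p for p in scored if p.score >= min_score]
        # 4. Train: fine-tune on the selection; the result seeds the next round.
        model = train(model, selected)
        history.append(selected)
    return model, history
```

Passing the three steps in as callables keeps the loop independent of any particular judge or trainer. In a real run, `train` would wrap an SFT job and `generate` would call the current checkpoint.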

Iteration Breakdown

Iteration   Generated Pairs   Training Data
Iter 1      1,156             iter1_probe.jsonl (25M tokens)
Iter 2      1,089             iter2_probe.jsonl (25M tokens)
Iter 3      1,042             iter3_probe.jsonl (25M tokens)
Final       3,287 total       SFT on iter-3 checkpoint
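As a quick sanity check, the totals in this breakdown can be recomputed from the per-iteration figures. The snippet uses only numbers stated in this card; the variable names are illustrative.

```python
# Per-iteration figures as reported in the breakdown.
pairs_per_iteration = {"iter-1": 1156, "iter-2": 1089, "iter-3": 1042}
probe_tokens_per_iteration = 25_000_000

total_pairs = sum(pairs_per_iteration.values())
total_probe_tokens = probe_tokens_per_iteration * len(pairs_per_iteration)

# Share of the final training set contributed by each iteration.
shares = {name: count / total_pairs for name, count in pairs_per_iteration.items()}

print(total_pairs)         # 3287
print(total_probe_tokens)  # 75000000
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

The number of selected pairs falls slightly from one iteration to the next (1,156, then 1,089, then 1,042), while the probe budget stays fixed at 25M tokens.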

Intended Use

  • Orthodox theological Q&A with pastoral voice
  • Spiritual guidance and counseling
  • Patristic text generation and completion
  • Educational tool for Orthodox Christian studies
  • Church Father quote recall and attribution
  • Base for further alignment (DPO/RLHF)

Limitations

  • May still confabulate on specific historical dates, canons, or exact citations
  • Not a substitute for a spiritual father, priest, or bishop
  • Reflects synthetic data biases — primarily Russian Orthodox sources
  • Temperature around 0.7 recommended for pastoral responses; values above 0.8 may drift theologically
  • Model is 3B parameters — limited compared to larger models for complex theological reasoning
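One way to keep sampling settings in line with the temperature guidance is a small guard around the generation arguments. This is a sketch only: `sampling_kwargs` is a hypothetical helper, the default of 0.7 is taken from the Quick Start, and the 0.8 warning threshold from the limitation above.

```python
import warnings

RECOMMENDED_TEMPERATURE = 0.7  # value used in the Quick Start
DRIFT_THRESHOLD = 0.8          # above this, the card warns of theological drift


def sampling_kwargs(temperature: float = RECOMMENDED_TEMPERATURE,
                    max_new_tokens: int = 512) -> dict:
    """Build keyword arguments for `model.generate`, warning on risky temperatures."""
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    if temperature > DRIFT_THRESHOLD:
        warnings.warn(
            f"temperature {temperature} exceeds {DRIFT_THRESHOLD}; "
            "responses may drift theologically"
        )
    return {
        "do_sample": True,
        "temperature": temperature,
        "max_new_tokens": max_new_tokens,
    }
```

With the Quick Start objects loaded, usage would look like `model.generate(**inputs, **sampling_kwargs())`.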

Relationship to Other Models

Qwen2.5-3B
  └─ CPT → jayfurzy/paterikon-3b (patristic domain fluency)
       ├─ SFT → jayfurzy/paterikon-sft-v1 (single-pass instruction tuning)
       └─ 3-pass active-loop → jayfurzy/paterikon-sft-v2 (this model)
            └─ DPO planned

Citation

@misc{paterikon-sft-v2,
  author = {Justin Fursov},
  title = {Paterikon-SFT-v2: Three-Pass Iterative Instruction-Tuned Orthodox Patristic Language Model},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/jayfurzy/paterikon-sft-v2},
}

This model was developed as part of the Orthodox Constitution Project — extracting ethical principles from the Church Fathers for AI alignment. The name "Paterikon" (πατερικόν) refers to the traditional Orthodox collections of the sayings and lives of the Desert Fathers.
