MariChatmen 2B Experimental

MariChatmen 2B Experimental is a LoRA/PEFT adapter for Qwen/Qwen3.5-2B-Base. It was trained locally on 2026-05-13 as a recovery run after the original 2B experiment failed its behavioural gate, leaving no usable 2B artifact.

This is an experimental checkpoint. Where hardware allows, the current demo should prefer the 4B adapter (MariChatmen/MariChatmen-4B-Experimental).

Intended Use

The adapter is intended for Spanish/Andaluh chat experiments around the fictional MariChatmen assistant persona. It is not a general production assistant and should not be used for high-stakes decisions.

Loading

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-2B-Base"
adapter_id = "MariChatmen/MariChatmen-2B-Experimental"

# The adapter repo carries the MariChatmen tokenizer that the embeddings
# were resized to during training, so load the tokenizer from there.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Resize the base embeddings to the adapter tokenizer before attaching the
# adapter; the card notes embeddings were resized and trained, so loading
# onto an unresized base may fail (assumption, not verified here).
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
base.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(base, adapter_id)
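
Once loaded, a quick generation probe looks like the following. This is a minimal sketch: it assumes the adapter tokenizer ships a chat template, which this card does not confirm, and the prompt is a hypothetical example.

# Hypothetical smoke-test prompt; any short Spanish/Andaluh instruction works.
messages = [{"role": "user", "content": "Preséntate en una frase."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))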

Training Data

The local recovery mix contained 22,858 SFT training rows and 1,134 validation rows. It combined:

  • a broad local Andaluh SFT mix derived from Spanish SFT data transformed with andalugeeks/andaluh-py (see the transformation sketch after this list);
  • oversampled MariChatmen project-authored repair anchors covering identity, style, safety, and instruction-following regressions.
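
The Andaluh transformation step can be reproduced with andaluh-py's EPA transliteration. A minimal sketch, assuming the package's top-level epa() helper; the exact options used for this run are not recorded in this card:

from andaluh import epa

# Transliterate a Spanish SFT row into Andaluh (EPA orthography proposal).
spanish = "El veloz murciélago hindú comía feliz cardillo y kiwi."
print(epa(spanish))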

The mixed training dataset is not uploaded with this model. The broad SFT portion includes downloaded rows transformed with andaluh-py, so it should not be republished as MariChatmen proprietary/project data. Uploadable project data is tracked separately in MariChatmen/MariChatmen-Project-Data.

Credits and Copyright

  • Base model: Qwen/Qwen3.5-2B-Base.
  • Fine-tuning framework: Hugging Face Transformers, TRL, PEFT, and PyTorch.
  • Transliteration / Andaluh transformation tooling: andalugeeks/andaluh-py.
  • Broad Spanish SFT sources recorded in the local row metadata include VillanovaAI/villanova-sft-2603 and upstream sources such as CohereLabs/aya_collection; original dataset licenses and attribution requirements remain with those sources.
  • MariChatmen repair anchors are project-authored/curated material for this project and are documented in the project data repository.

Training Procedure

  • Stage: supervised fine-tuning.
  • Base model: Qwen/Qwen3.5-2B-Base.
  • Tokenizer source: recovered MariChatmen 4B checkpoint tokenizer.
  • Sequence length: 384.
  • Prompt token cap: 256.
  • Max steps: 600.
  • LoRA rank: 16 (the LoRA and trainer settings are sketched in code after this list).
  • LoRA alpha: 32.
  • LoRA dropout: 0.05.
  • Learning rate: 5e-5.
  • Gradient accumulation: 16.
  • Embeddings resized and trained to match the MariChatmen tokenizer.
  • Hardware: local NVIDIA RTX 5060 Laptop GPU, 8 GB VRAM.
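
Taken together, the hyperparameters above correspond roughly to the PEFT/TRL configuration below. This is a reconstruction, not the original training script: the LoRA target modules are not recorded in this card, the output path is hypothetical, and SFTConfig field names (e.g. max_length vs. max_seq_length) vary across TRL versions.

from peft import LoraConfig
from trl import SFTConfig

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    # target_modules are not recorded in this card.
)

training_args = SFTConfig(
    output_dir="marichatmen-2b-recovery",  # hypothetical path
    max_steps=600,
    learning_rate=5e-5,
    gradient_accumulation_steps=16,
    max_length=384,  # sequence length; field name varies by TRL version
)

The 256-token prompt cap has no standard SFTConfig field and was presumably enforced during dataset preprocessing.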

Evaluation Snapshot

The selected checkpoint is step 600, which was also the best checkpoint by validation loss.

  • Final validation loss: 2.2430 (see the perplexity note after this list).
  • Final validation mean token accuracy: 0.5877.
  • Training runtime: approximately 7,633 seconds (about 2.1 hours).
  • Generation probes showed usable instruction following and safety refusals, with remaining roughness on some style and technical prompts.
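
For intuition, cross-entropy validation loss maps to perplexity via exp(loss):

import math

# Validation perplexity implied by the final validation loss.
print(math.exp(2.2430))  # ≈ 9.42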

Limitations

The model is a LoRA adapter, not a merged full model (a merge sketch follows below). Quality is expected to be below that of the recovered 4B adapter, and the Andaluh style can be uneven. Outputs may contain linguistic artifacts from the automatic transformation and should be reviewed before publication.
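
If a standalone checkpoint is needed, the adapter can be folded into the base weights with PEFT's merge_and_unload. A minimal sketch reusing the objects from the Loading section; the output path is hypothetical:

# Fold the LoRA deltas into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("marichatmen-2b-merged")
tokenizer.save_pretrained("marichatmen-2b-merged")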

Framework Versions

  • PEFT 0.19.1
  • TRL 1.3.0
  • Transformers 5.8.0.dev0
  • PyTorch 2.11.0+cu130
  • Datasets 4.8.5
  • Tokenizers 0.22.2