MariChatmen 2B Experimental

MariChatmen 2B Experimental is a LoRA/PEFT adapter for Qwen/Qwen3.5-2B-Base. It was trained locally on 2026-05-13 as a recovery run after the original 2B experiment failed its behavioural gate, leaving no usable 2B artifact.

This is an experimental checkpoint. Where hardware allows, the current demo should prefer the 4B adapter (MariChatmen/MariChatmen-4B-Experimental).

Intended Use

The adapter is intended for Spanish/Andaluh chat experiments around the fictional MariChatmen assistant persona. It is not a general production assistant and should not be used for high-stakes decisions.

Loading

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Qwen/Qwen3.5-2B-Base"
adapter_id = "MariChatmen/MariChatmen-2B-Experimental"

# The adapter repo carries the MariChatmen tokenizer that the embeddings
# were resized to during training, so load the tokenizer from there.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Resize the base embeddings to the adapter tokenizer before attaching the
# adapter; the card notes embeddings were resized and trained, so loading
# onto an unresized base may fail (assumption, not verified here).
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
base.resize_token_embeddings(len(tokenizer))
model = PeftModel.from_pretrained(base, adapter_id)
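
Once loaded, a quick generation probe looks like the following. This is a minimal sketch: it assumes the adapter tokenizer ships a chat template, which this card does not confirm, and the prompt is a hypothetical example.

# Hypothetical smoke-test prompt; any short Spanish/Andaluh instruction works.
messages = [{"role": "user", "content": "Preséntate en una frase."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))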

Training Data

The local recovery mix contained 22,858 SFT training rows and 1,134 validation rows. It combined:

  • a broad local Andaluh SFT mix derived from Spanish SFT data transformed with andalugeeks/andaluh-py (see the transformation sketch after this list);
  • oversampled MariChatmen project-authored repair anchors covering identity, style, safety, and instruction-following regressions.
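
The Andaluh transformation step can be reproduced with andaluh-py's EPA transliteration. A minimal sketch, assuming the package's top-level epa() helper; the exact options used for this run are not recorded in this card:

from andaluh import epa

# Transliterate a Spanish SFT row into Andaluh (EPA orthography proposal).
spanish = "El veloz murciélago hindú comía feliz cardillo y kiwi."
print(epa(spanish))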

The mixed training dataset is not uploaded with this model. The broad SFT portion includes downloaded rows transformed with andaluh-py, so it should not be republished as MariChatmen proprietary/project data. Uploadable project data is tracked separately in MariChatmen/MariChatmen-Project-Data.

Credits and Copyright

  • Base model: Qwen/Qwen3.5-2B-Base.
  • Fine-tuning framework: Hugging Face Transformers, TRL, PEFT, and PyTorch.
  • Transliteration / Andaluh transformation tooling: andalugeeks/andaluh-py.
  • Broad Spanish SFT sources recorded in the local row metadata include VillanovaAI/villanova-sft-2603 and upstream sources such as CohereLabs/aya_collection; original dataset licenses and attribution requirements remain with those sources.
  • MariChatmen repair anchors are project-authored/curated material for this project and are documented in the project data repository.

Training Procedure

  • Stage: supervised fine-tuning.
  • Base model: Qwen/Qwen3.5-2B-Base.
  • Tokenizer source: recovered MariChatmen 4B checkpoint tokenizer.
  • Sequence length: 384.
  • Prompt token cap: 256.
  • Max steps: 600.
  • LoRA rank: 16 (the LoRA and trainer settings are sketched in code after this list).
  • LoRA alpha: 32.
  • LoRA dropout: 0.05.
  • Learning rate: 5e-5.
  • Gradient accumulation: 16.
  • Embeddings resized and trained to match the MariChatmen tokenizer.
  • Hardware: local NVIDIA RTX 5060 Laptop GPU, 8 GB VRAM.
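
Taken together, the hyperparameters above correspond roughly to the PEFT/TRL configuration below. This is a reconstruction, not the original training script: the LoRA target modules are not recorded in this card, the output path is hypothetical, and SFTConfig field names (e.g. max_length vs. max_seq_length) vary across TRL versions.

from peft import LoraConfig
from trl import SFTConfig

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    # target_modules are not recorded in this card.
)

training_args = SFTConfig(
    output_dir="marichatmen-2b-recovery",  # hypothetical path
    max_steps=600,
    learning_rate=5e-5,
    gradient_accumulation_steps=16,
    max_length=384,  # sequence length; field name varies by TRL version
)

The 256-token prompt cap has no standard SFTConfig field and was presumably enforced during dataset preprocessing.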

Evaluation Snapshot

The selected checkpoint is step 600, which was also the best checkpoint by validation loss.

  • Final validation loss: 2.2430 (see the perplexity note after this list).
  • Final validation mean token accuracy: 0.5877.
  • Training runtime: approximately 7,633 seconds (about 2.1 hours).
  • Generation probes showed usable instruction following and safety refusals, with remaining roughness on some style and technical prompts.
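
For intuition, cross-entropy validation loss maps to perplexity via exp(loss):

import math

# Validation perplexity implied by the final validation loss.
print(math.exp(2.2430))  # ≈ 9.42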

Limitations

The model is a LoRA adapter, not a merged full model (a merge sketch follows below). Quality is expected to be below that of the recovered 4B adapter, and the Andaluh style can be uneven. Outputs may contain linguistic artifacts from the automatic transformation and should be reviewed before publication.
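
If a standalone checkpoint is needed, the adapter can be folded into the base weights with PEFT's merge_and_unload. A minimal sketch reusing the objects from the Loading section; the output path is hypothetical:

# Fold the LoRA deltas into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("marichatmen-2b-merged")
tokenizer.save_pretrained("marichatmen-2b-merged")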

Framework Versions

  • PEFT 0.19.1
  • TRL 1.3.0
  • Transformers 5.8.0.dev0
  • PyTorch 2.11.0+cu130
  • Datasets 4.8.5
  • Tokenizers 0.22.2