Instructions to use virtuous7373/Gemma-4-Harmonia-31B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use virtuous7373/Gemma-4-Harmonia-31B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="virtuous7373/Gemma-4-Harmonia-31B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("virtuous7373/Gemma-4-Harmonia-31B")
model = AutoModelForImageTextToText.from_pretrained("virtuous7373/Gemma-4-Harmonia-31B")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use virtuous7373/Gemma-4-Harmonia-31B with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "virtuous7373/Gemma-4-Harmonia-31B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "virtuous7373/Gemma-4-Harmonia-31B",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
Use Docker
docker model run hf.co/virtuous7373/Gemma-4-Harmonia-31B
- SGLang
How to use virtuous7373/Gemma-4-Harmonia-31B with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "virtuous7373/Gemma-4-Harmonia-31B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "virtuous7373/Gemma-4-Harmonia-31B",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "virtuous7373/Gemma-4-Harmonia-31B" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "virtuous7373/Gemma-4-Harmonia-31B",
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this image in one sentence."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                        }
                    }
                ]
            }
        ]
    }'
- Docker Model Runner
How to use virtuous7373/Gemma-4-Harmonia-31B with Docker Model Runner:
docker model run hf.co/virtuous7373/Gemma-4-Harmonia-31B
HARMONIA
Gemini Word Salad Initialization
Harmonious Synthesis
Harmonia is a 31-billion-parameter merge built on Gemma 4. By executing a meticulous three-phase fusion of seven foundation and specialized models, Harmonia demonstrates a targeted approach to deep neural consolidation, aiming to minimize regression while amplifying each donor's distinctive capabilities.
Instead of simple linear blending, which often degrades logical coherence and dilutes nuanced behavior, Harmonia was sculpted using a combination of mathematical projections, covariance activation matching, and surgical synaptic pruning. The model appears pretty solid so far.
Multi-Stage Fusion Protocol
The lineage of Harmonia is constructed systematically, passing through three isolated mathematical states to layer capabilities cleanly.
Nullspace Coherence Mapping
To anchor base capabilities, the primary Gemma-4-31B-Base is combined with the analytically rigorous GarnetV2-31B. Utilizing low-rank Singular Value Decomposition (SVD), the specialized donor features are projected entirely onto the mathematical null-space of the base weights. This prevents the creative delta vectors from distorting essential core intelligence, producing the stable platform clever-basename.
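The projection described above can be sketched on a single toy weight matrix. This is only an illustration under stated assumptions, not the merge code used for Harmonia: the function name `nullspace_merge` is hypothetical, `rank` stands in for what the recipe's `nr` value presumably controls, and the "protected" subspace is assumed to be the span of the base matrix's top right singular vectors.

```python
import numpy as np

def nullspace_merge(base, donor, rank, weight=1.0):
    """Add the donor delta after removing its action on the base's top-`rank` input directions."""
    delta = donor - base
    # Rows of vt are right singular vectors: the input directions of the base,
    # ordered from most to least dominant.
    _, _, vt = np.linalg.svd(base, full_matrices=False)
    v_top = vt[:rank].T                              # shape (in_dim, rank)
    # Project the delta onto the orthogonal complement of span(v_top).
    delta_null = delta - (delta @ v_top) @ v_top.T
    return base + weight * delta_null

# Toy data (assumption: small random matrices in place of real model weights).
rng = np.random.default_rng(0)
base = rng.standard_normal((8, 6))
donor = base + 0.1 * rng.standard_normal((8, 6))
merged = nullspace_merge(base, donor, rank=3)

# On a protected input direction the merged layer behaves exactly like the base.
_, _, vt = np.linalg.svd(base, full_matrices=False)
x = vt[0]
print(np.allclose(merged @ x, base @ x))  # True
```

The design choice this illustrates is the trade-off in `rank`: a larger value protects more of the base's behavior but leaves fewer directions in which the donor can contribute.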
Surgical Synaptic Gating
Next, our newly anchored base is layered with the highly independent cognitive engines MeroMero-31B and Gembrain-31B. We apply CABS (Conflict-Aware and Balanced Sparsification) to execute structured, localized parameter gating. By enforcing precise n:m pruning ratios (keeping 16 of every 32 delta weights for MeroMero and 11 of every 33 for Gembrain), we weave complex creative reasoning directly into the core matrix while limiting interference between the two donors. The result is the highly expressive clever-intname.
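To make the n:m gating concrete, here is a minimal pure-Python sketch on flat lists of weights. It assumes that "n:m" means keeping the n largest-magnitude entries in each consecutive block of m, and that donors are processed in order with later donors barred from positions an earlier donor already claimed. Both helper names (`nm_prune`, `cabs_like_merge`) and the toy numbers are hypothetical; the real merge method may differ in detail.

```python
def nm_prune(delta, n, m):
    """Keep the n largest-magnitude entries in each consecutive block of m; zero the rest."""
    pruned = []
    for start in range(0, len(delta), m):
        block = delta[start:start + m]
        order = sorted(range(len(block)), key=lambda i: abs(block[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(block))
    return pruned

def cabs_like_merge(base, donors):
    """Prune each donor's delta in sequence, masking positions earlier donors already use."""
    merged = list(base)
    claimed = [False] * len(base)
    for donor, weight, n, m in donors:
        # Task vector (delta) of this donor, with already-claimed positions zeroed out.
        delta = [0.0 if claimed[i] else d - b
                 for i, (d, b) in enumerate(zip(donor, base))]
        for i, v in enumerate(nm_prune(delta, n, m)):
            if v != 0.0:
                claimed[i] = True
                merged[i] += weight * v
    return merged

# Toy example: 2:4 pruning for the first donor, 1:4 for the second.
base = [0.0] * 8
donor_a = [0.9, -0.1, 0.2, -0.8, 0.05, 0.7, -0.6, 0.1]
donor_b = [0.5, 0.4, -0.3, 0.2, 0.3, 0.1, 0.9, -0.2]

print(nm_prune(donor_a, 2, 4))
# [0.9, 0.0, 0.0, -0.8, 0.0, 0.7, -0.6, 0.0]

merged = cabs_like_merge(base, [(donor_a, 0.6, 2, 4), (donor_b, 0.4, 1, 4)])
```

In this toy run the second donor only contributes at positions 1 and 4, because positions 0, 3, 5, and 6 were already claimed by the first donor. That ordering effect is why the recipe specifies a `pruning_order`.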
Covariance Activation Matching
In the final harmonization phase, the expressive clever-intname is combined with the narrative mastery of Equinox-31B, the creative depth of Fabled-Gemma4, and our primary conversational core Ortenzya-The-Creative-Wordsmith. Using data-free covariance estimation via task vectors, ACTMat reconstructs layer-wise input activation properties, solving for optimal projection weights in activation space. This resolves semantic alignment anomalies and delivers the unified output model.
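The internals of ACTMat are not spelled out here, so the following is only a sketch of one way "covariance estimated from task vectors" can drive a merge. It swaps in a RegMean-style closed-form solution for a single linear layer and assumes each expert's input covariance can be approximated, without data, by the Gram matrix of its task vector. The function name `covariance_weighted_merge`, the use of `epsilon` as a ridge term pulling toward the base, and the toy matrices are all assumptions for illustration.

```python
import numpy as np

def covariance_weighted_merge(base, experts, epsilon=1e-6):
    """Solve W* = (sum_i W_i G_i + eps * W_base) (sum_i G_i + eps * I)^-1 for one layer."""
    in_dim = base.shape[1]
    gram_sum = epsilon * np.eye(in_dim)      # ridge term keeps the system invertible
    weighted_sum = epsilon * base
    for w in experts:
        tau = w - base                       # task vector of this expert
        gram = tau.T @ tau                   # data-free covariance proxy, (in_dim, in_dim)
        gram_sum = gram_sum + gram
        weighted_sum = weighted_sum + w @ gram
    # gram_sum is symmetric, so solving against the transpose yields W* directly.
    return np.linalg.solve(gram_sum, weighted_sum.T).T

# Toy data (assumption: small random matrices in place of real model weights).
rng = np.random.default_rng(0)
base = rng.standard_normal((4, 3))
expert_a = base + 0.1 * rng.standard_normal((4, 3))
expert_b = base + 0.1 * rng.standard_normal((4, 3))

merged = covariance_weighted_merge(base, [expert_a, expert_b])
print(merged.shape)  # (4, 3)
```

The intuition is that each expert gets the most say along the input directions where its own task vector is largest, rather than receiving one global blending weight.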
Methodological Innovations
Model Lineage & Ingredients
We extend our gratitude to the creators of the ancestral paths that intersect within Harmonia:
Merge Blueprint
The entire orchestration sequence is structured via a multi-stage MergeKit pipeline. The YAML recipes for the three stages follow, separated by `---`.
MergeKit Configuration
name: clever-basename
merge_method: nullspace
base_model: ./gemma-4-31B-base
models:
  - model: ./Gemma4-GarnetV2-31B
    parameters:
      weight: 1.0
parameters:
  protect_base: true
  nr: 256
tokenizer:
  source: base
chat_template: auto
dtype: float32
out_dtype: bfloat16
---
name: clever-intname
merge_method: cabs
base_model: ./clever-basename
models:
  - model: ./clever-basename
  - model: ./G4-MeroMero-31B-uncensored-heretic
    parameters:
      weight: 0.6
      n_val: 16
      m_val: 32
  - model: ./Gemma-4-Gembrain-31B-heretic
    parameters:
      weight: 0.4
      n_val: 11
      m_val: 33
default_n_val: 8
default_m_val: 32
pruning_order:
  - ./G4-MeroMero-31B-uncensored-heretic
  - ./Gemma-4-Gembrain-31B-heretic
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
chat_template: auto
---
name: Harmonia
merge_method: actmat
base_model: ./gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic
models:
  - model: ./gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic
  - model: ./LatitudeGames-Equinox-31B
    parameters:
      weight: 1
  - model: ./clever-intname
    parameters:
      weight: 1
  - model: ./Fabled-Gemma4-31B
    parameters:
      weight: 1
parameters:
  epsilon: 1e-6
tokenizer:
  source: "union"
dtype: bfloat16
out_dtype: bfloat16
chat_template: auto
Symphony Contributors
I am grateful to the following individuals for their models, inspiration, and other contributions:
And of course, every wonderful person on:
LocalLLaMA
A big thanks to Gemini-3.5-flash for creating this README alongside the word salads found within it. A special acknowledgment is extended to Google DeepMind for their contribution of the Gemma-4 foundation family to the open-weight ecosystem, representing the structural cornerstone of this merge and its constituents.