Gemma-4-E2B Agentic AI on AWS (Continued Pretraining)

This model is a fine-tuned version of Google's Gemma-4-E2B, adapted with Continued Pretraining (CPT) on documentation covering the architectures, protocols, frameworks, and deployment strategies of Agentic AI systems on Amazon Web Services (AWS).

The model was adapted using Unsloth, leveraging Rank-Stabilized LoRA (rsLoRA) on both the attention/MLP layers and the core vocabulary embeddings to maximize domain adaptation.

Model Details

  • Base Model: google/gemma-4-e2b (via unsloth/gemma-4-E2B)
  • Training Type: Continued Pretraining (CPT) / Next-Token Prediction
  • Domain focus: AWS Architecture, Agentic AI, Frameworks, and Protocols (MCP, etc.)
  • Language: English
  • Library: Unsloth / Hugging Face Transformers

Dataset

The training corpus is drawn from the AWS Prescriptive Guidance document "Agentic AI frameworks, platforms, protocols, and tools on AWS" (92 pages). It consists of architectural documentation, best practices, and protocol standards.

Training Configuration

Because this model underwent Continued Pretraining to inject raw domain knowledge, rather than instruction tuning to shape behavior, the embeddings and language modeling head were trained alongside the attention and MLP layers.

  • Method: PEFT / LoRA (Rank-Stabilized)
  • LoRA Rank (r): 64
  • LoRA Alpha: 16
  • Target Modules: Attention, MLP, embed_tokens, and lm_head (vision layers were not trained)
  • Precision: 4-bit quantization (QLoRA) during training
  • Optimizer: Paged AdamW 8-bit
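
The rank and alpha listed above interact differently under rank-stabilized LoRA than under standard LoRA: standard LoRA scales the adapter update by alpha / r, while rsLoRA scales it by alpha / sqrt(r). The short calculation below is an illustrative sketch of that arithmetic for r = 64 and alpha = 16. The variable names are ours and are not taken from the training script.

```python
import math

# Values from the training configuration listed above.
lora_rank = 64
lora_alpha = 16

# Standard LoRA multiplies the low-rank update by alpha / r.
standard_scale = lora_alpha / lora_rank

# Rank-stabilized LoRA (rsLoRA) multiplies it by alpha / sqrt(r) instead.
rslora_scale = lora_alpha / math.sqrt(lora_rank)

print(f"standard LoRA scale: {standard_scale}")  # 0.25
print(f"rsLoRA scale:        {rslora_scale}")    # 2.0
```

With these settings the rsLoRA scale is eight times larger than the standard LoRA scale would be, which is why an alpha smaller than the rank can still yield a substantial update.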

How to Use

Since this is a Base/CPT model rather than a Chat/Instruct model, it is best suited to text continuation, documentation generation, and architectural drafting.

You can load it using Transformers or Unsloth:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model weights from the Hugging Face Hub.
model_name = "Capitaller/gemma_4E2B_finetune"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Phrase the prompt as the start of a document for the model to continue.
prompt = "When designing Agentic AI architectures on AWS using the Model Context Protocol (MCP), it is highly recommended to"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate up to 100 new tokens and print the prompt plus its continuation.
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Prompting Tips

Because the model acts as a highly knowledgeable autocomplete engine for AWS Agentic AI, frame your prompts as the beginning of a technical document or architectural guide rather than a question.

  • Avoid: "How do I use MCP on AWS?"
  • Do this instead: "### Guide to implementing the Model Context Protocol (MCP) on AWS\n\nThe most effective way to deploy MCP on AWS involves"
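
The document-style framing above can be produced programmatically. The following is a minimal sketch, assuming a hypothetical helper named build_completion_prompt (not part of Transformers or Unsloth) that joins a heading and an unfinished opening sentence in the same layout as the recommended prompt.

```python
def build_completion_prompt(title: str, opening: str) -> str:
    """Return a document-style prompt: a heading, a blank line, then an unfinished sentence.

    Hypothetical helper for illustration; it only formats text.
    """
    return f"### {title.strip()}\n\n{opening.strip()}"


prompt = build_completion_prompt(
    "Guide to implementing the Model Context Protocol (MCP) on AWS",
    "The most effective way to deploy MCP on AWS involves",
)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a question, so the model continues the sentence instead of trying to answer.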