BananaMind Completor V1

BananaMind Completor V1 is a small byte-level GPT-2 causal language model for autocomplete and next-character prediction.

This is a native Hugging Face Transformers model using GPT2LMHeadModel, saved as model.safetensors. It predicts UTF-8 bytes directly with token ids 0..255, so it does not use a normal BPE tokenizer.
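As a minimal sketch of what "predicting UTF-8 bytes directly" means for input handling, the snippet below maps text to ids and back using only the Python standard library. The helper names `encode_bytes` and `decode_bytes` are illustrative and are not part of this release.

```python
from typing import List


def encode_bytes(text: str) -> List[int]:
    """Map text to token ids: one id (0..255) per UTF-8 byte."""
    return list(text.encode("utf-8"))


def decode_bytes(ids: List[int]) -> str:
    """Map token ids back to text; invalid byte sequences become U+FFFD."""
    return bytes(ids).decode("utf-8", errors="replace")


ids = encode_bytes("café")
print(ids)                # [99, 97, 102, 195, 169]
print(decode_bytes(ids))  # café
```

Non-ASCII characters take more than one id ("é" becomes the two bytes 195 and 169), so the number of tokens can exceed the number of characters in the prompt.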

License

BananaMind Completor V1 model code and released weights are provided under the Apache License 2.0. See LICENSE.

The training dataset has separate upstream terms. The original ODC-BY 1.0 license text for agentlans/high-quality-english-sentences is included at THIRD_PARTY_LICENSES/ODC-BY-1.0.txt, with attribution in NOTICE.

Model Details

  • Architecture: GPT2LMHeadModel
  • Parameters: 1,927,936
  • Vocabulary: 256 byte values
  • Context length: 128 bytes
  • Embedding size: 128
  • Attention heads: 4
  • Transformer layers: 4
  • MLP inner size: 1536
  • Input/output embeddings: untied
  • Checkpoint source: autocomplete_model_gpt2_best.pt
  • Release weights: model.safetensors
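The parameter count can be cross-checked from the other values in this list. The sketch below assumes the standard GPT-2 block layout used by Transformers (fused query/key/value projection with bias, two LayerNorms per block, a final LayerNorm, and an output head without bias) and a position-embedding table with one row per context position. The variable names are illustrative and do not come from config.json.

```python
# Values taken from the Model Details list.
vocab_size = 256
n_positions = 128   # assumed equal to the 128-byte context length
n_embd = 128
n_layer = 4
n_inner = 1536


def linear(n_in: int, n_out: int, bias: bool = True) -> int:
    """Parameter count of a dense layer."""
    return n_in * n_out + (n_out if bias else 0)


layer_norm = 2 * n_embd                                   # weight + bias
attention = linear(n_embd, 3 * n_embd) + linear(n_embd, n_embd)
mlp = linear(n_embd, n_inner) + linear(n_inner, n_embd)
block = 2 * layer_norm + attention + mlp

total = (
    vocab_size * n_embd                        # token embeddings
    + n_positions * n_embd                     # position embeddings
    + n_layer * block                          # transformer layers
    + layer_norm                               # final LayerNorm
    + linear(n_embd, vocab_size, bias=False)   # untied output head
)
print(f"{total:,}")  # 1,927,936
```

The number of attention heads does not appear in the calculation because heads only partition the 128-dimensional projections and add no parameters of their own.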

Training Data Attribution

This model was trained on agentlans/high-quality-english-sentences, using the train split for training and the test split for validation.

Dataset attribution: agentlans/high-quality-english-sentences.

Dataset license: ODC-BY 1.0. Follow the dataset license terms when reusing the training data or distributing derived artifacts that require attribution. This release includes the original ODC-BY 1.0 license text in THIRD_PARTY_LICENSES/ODC-BY-1.0.txt.

Evaluation

The released checkpoint is the best validation checkpoint from local GPT-2 training:

Metric                 Value
Training step          70000
Training loss          1.193311095237732
Validation loss        1.1782804751396179
Best validation loss   1.1782804751396179
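To make the validation loss easier to interpret, it can be converted to per-byte perplexity and bits per byte. This sketch assumes the reported loss is the mean cross-entropy in nats per predicted byte, which is what GPT2LMHeadModel returns when given labels; the training script itself is not part of this card, so treat that as an assumption.

```python
import math

val_loss = 1.1782804751396179  # validation loss from the table above

# Assumption: val_loss is mean cross-entropy in nats per byte.
perplexity = math.exp(val_loss)          # effective branching factor per byte
bits_per_byte = val_loss / math.log(2)   # nats -> bits

print(f"per-byte perplexity: {perplexity:.2f}")
print(f"bits per byte:       {bits_per_byte:.2f}")
```

Under that assumption the model is, on average, about as uncertain as a uniform choice among roughly 3.25 of the 256 possible byte values, which corresponds to about 1.70 bits per byte.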

Usage

Install dependencies:

pip install -r requirements.txt

Load the model with Transformers:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(".")

Because this model uses raw byte ids, encode text manually:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(".")
model.eval()

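text = "The weather is beau"
# Token ids are the raw UTF-8 bytes of the prompt (0..255).
tokens = list(text.encode("utf-8"))

# Greedily extend the prompt by at most 20 bytes.
for _ in range(20):
    # Keep only the last 128 bytes, the model's context length.
    x = torch.tensor([tokens[-128:]], dtype=torch.long)
    with torch.no_grad():
        next_id = model(input_ids=x).logits[0, -1].argmax().item()
    # Stop at whitespace or punctuation so only the current word is completed.
    if chr(next_id) in " \n\t.,!?;:":
        break
    tokens.append(next_id)

# errors="ignore" drops any incomplete multi-byte sequence at the end.
print(bytes(tokens).decode("utf-8", errors="ignore"))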
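The stopping rule in the loop above can be exercised without loading the weights by separating it from the model call. The sketch below is an illustration under stated assumptions, not the contents of inference.py: `complete_word` and `scripted_predictor` are hypothetical names, and the predictor is a stand-in that replays a fixed byte string instead of running the model.

```python
from typing import Callable, List

CONTEXT_BYTES = 128                  # context length from Model Details
STOP_BYTES = set(b" \n\t.,!?;:")     # same stop characters as the loop above


def complete_word(prefix: str,
                  next_byte: Callable[[List[int]], int],
                  max_new_bytes: int = 20) -> str:
    """Greedy word completion; next_byte maps a byte window to the next id."""
    tokens = list(prefix.encode("utf-8"))
    for _ in range(max_new_bytes):
        next_id = next_byte(tokens[-CONTEXT_BYTES:])
        if next_id in STOP_BYTES:    # word boundary reached, do not append it
            break
        tokens.append(next_id)
    return bytes(tokens).decode("utf-8", errors="ignore")


def scripted_predictor(script: bytes) -> Callable[[List[int]], int]:
    """Hypothetical stand-in for the model: replays a fixed byte sequence."""
    remaining = iter(script)
    return lambda window: next(remaining)


result = complete_word("The weather is beau", scripted_predictor(b"tiful today"))
print(result)  # The weather is beautiful
```

With the real model, `next_byte` would wrap the forward pass and `argmax` shown in the loop above; keeping it as a parameter makes the truncation and stopping behaviour testable on their own.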

Or use the included helper:

python inference.py "The weather is beau"

Files

  • model.safetensors: native GPT-2 model weights
  • config.json: Transformers GPT-2 config
  • generation_config.json: Transformers generation config
  • inference.py: byte-level autocomplete helper
  • training_metadata.json: checkpoint, dataset, and evaluation metadata
  • requirements.txt: minimal Python dependencies
  • LICENSE: Apache License 2.0 for this model release
  • NOTICE: upstream dataset attribution notice
  • THIRD_PARTY_LICENSES/ODC-BY-1.0.txt: original ODC-BY 1.0 license text for the training dataset

Limitations

This is a small autocomplete model trained for short English sentence contexts. It is not instruction-tuned, not a general chat model, and may produce incomplete, repetitive, or low-quality completions outside short autocomplete-style prompts.

