Instructions for using BananaMind/MiniBananaMind-V1 with libraries and local apps.

Libraries

How to use BananaMind/MiniBananaMind-V1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="BananaMind/MiniBananaMind-V1")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

# This is a text-only causal LM, so load it with the causal-LM auto class
tokenizer = AutoTokenizer.from_pretrained("BananaMind/MiniBananaMind-V1")
model = AutoModelForCausalLM.from_pretrained("BananaMind/MiniBananaMind-V1")

vLLM

How to use BananaMind/MiniBananaMind-V1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "BananaMind/MiniBananaMind-V1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "BananaMind/MiniBananaMind-V1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
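
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and the default `max_tokens=64` is an assumption chosen to leave room for the prompt inside the model's 512-token context. Pass `base_url="http://localhost:30000"` to target a server on the port used in the SGLang example.

```python
import json
import urllib.request

MODEL_ID = "BananaMind/MiniBananaMind-V1"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=64, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,      # assumed default, smaller than the curl example
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["text"]


# Requires a running server:
# print(complete("Once upon a time,"))
```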

Use Docker

docker model run hf.co/BananaMind/MiniBananaMind-V1

SGLang

How to use BananaMind/MiniBananaMind-V1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "BananaMind/MiniBananaMind-V1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "BananaMind/MiniBananaMind-V1",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "BananaMind/MiniBananaMind-V1" \
        --host 0.0.0.0 \
        --port 30000


MiniBananaMind-V1

MiniBananaMind-V1 is a compact LLaMA-style causal language model from BananaMind. It is trained from scratch for next-token text generation on streamed FineWeb-Edu data and is intended as a small, inspectable base model for experiments, demos, and lightweight research workflows.

This is a base language model, not an instruction-tuned assistant. It is best used for continuation-style generation and experimentation rather than factual question answering or chat.

Model Details

Developer: BananaMind
Model type: LLaMA-style causal language model
Library: Transformers
Task: Text generation
Training data: FineWeb-Edu
Checkpoint: MiniBananaMind-V1 uploaded training checkpoint
License: Apache 2.0

Architecture

Setting	Value
Layers	6
Hidden size	256
Attention heads	8
KV heads	8
Intermediate size	768
Context length	512 tokens
Vocabulary size	32,000
Parameters	~21.5M
Precision	float32 checkpoint
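
The parameter count can be sanity-checked from the other rows of the table. The sketch below assumes a standard LLaMA layout that the card does not spell out: bias-free linear layers, a gated MLP with three projection matrices, two RMSNorm weight vectors per layer plus a final one, and an output head that is not tied to the input embeddings. The function name `llama_param_count` is hypothetical.

```python
def llama_param_count(vocab_size=32_000, hidden_size=256, num_layers=6,
                      num_heads=8, num_kv_heads=8, intermediate_size=768,
                      tie_embeddings=False):
    """Count parameters for an assumed bias-free LLaMA-style decoder."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim

    # Attention: query and output projections, plus key and value projections.
    attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # Gated MLP: gate, up, and down projections.
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * hidden_size
    per_layer = attention + mlp + norms

    embeddings = vocab_size * hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    final_norm = hidden_size

    return embeddings + num_layers * per_layer + final_norm + lm_head


print(llama_param_count())  # 21499136
```

Under these assumptions the total is 21,499,136, which agrees with the stated ~21.5M. With tied embeddings the same formula gives 13,307,136, so the reported figure is consistent with an untied output head.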

Intended Use

MiniBananaMind-V1 is suitable for:

Small-scale language-model experiments
Educational demos of decoder-only generation
Testing tokenization, generation settings, and inference pipelines
Research prototypes where a very small causal LM is useful

It is not recommended for production assistants, safety-critical use, or tasks that require reliable factual knowledge.

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo_id = "BananaMind/MiniBananaMind-V1"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float32,
    device_map="auto",  # requires the accelerate package
)

prompt = "A computer is a machine that"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

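# Fall back to the EOS token for padding if the tokenizer defines no pad token.
pad_token_id = tokenizer.pad_token_id
if pad_token_id is None:
    pad_token_id = tokenizer.eos_token_id

# Sample a short continuation; inference needs no gradients.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=64,
        do_sample=True,
        temperature=0.2,
        top_p=0.9,
        repetition_penalty=1.1,
        pad_token_id=pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )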

print(tokenizer.decode(output[0], skip_special_tokens=True))

Generation Notes

Because this is a small base model, output quality depends heavily on prompt style and sampling settings. A temperature of 0.2 is recommended for more stable continuations. For more varied text, increase temperature or top_p.
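
To make the effect of these two settings concrete, here is a small, library-free sketch of temperature scaling and nucleus (top_p) filtering applied to a made-up set of logits. It illustrates the general technique only and is not the Transformers implementation; the function names and the example logits are assumptions.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


logits = [2.0, 1.0, 0.5, 0.0]  # hypothetical scores for four candidate tokens

sharp = softmax_with_temperature(logits, 0.2)  # low temperature: mass concentrates
flat = softmax_with_temperature(logits, 1.0)   # higher temperature: mass spreads out

print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
print([round(p, 3) for p in top_p_filter(flat, 0.9)])
```

At temperature 0.2 nearly all probability lands on the top token, which is why continuations are more stable; at 1.0 the alternatives carry real weight, and top_p then decides how much of the low-probability tail remains eligible for sampling.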

Limitations

The model may hallucinate facts, names, citations, and dates.
It has not been instruction tuned or aligned for chat behavior.
It may reproduce biases or unsafe patterns present in web-scale training data.
The short 512-token context length limits long-document use.
Small model size means weaker reasoning and factual recall than larger LMs.
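
Because the 512-token context has to hold both the prompt and the generated continuation, it can help to cap the number of new tokens before calling the model. The helper below is a minimal sketch of that arithmetic; the name `clamp_generation_budget` is hypothetical, and the prompt length would typically come from `len(tokenizer(prompt)["input_ids"])`.

```python
def clamp_generation_budget(prompt_tokens, requested_new_tokens, context_length=512):
    """Return how many new tokens fit after the prompt within the context window."""
    if prompt_tokens >= context_length:
        raise ValueError(
            f"Prompt uses {prompt_tokens} tokens, leaving no room in a "
            f"{context_length}-token context; shorten or truncate the prompt."
        )
    return min(requested_new_tokens, context_length - prompt_tokens)


print(clamp_generation_budget(10, 64))   # 64: the full request fits
print(clamp_generation_budget(500, 64))  # 12: only 12 tokens of context remain
```

The returned value can be passed as `max_new_tokens` to `generate`, or as `max_tokens` to an OpenAI-compatible server.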

Training Data

MiniBananaMind-V1 was trained on streamed FineWeb-Edu text. FineWeb-Edu is a large educational-quality web corpus, so users should expect broad web-language coverage as well as the usual limitations of internet-scale data.

Attribution: FineWeb-Edu is a dataset released by Hugging Face as part of the FineWeb family.

Citation

If you use this model in a project, cite the Hugging Face repository and attribute the FineWeb-Edu training data:

@misc{minibananamindv1,
  title = {MiniBananaMind-V1},
  author = {BananaMind},
  year = {2026},
  howpublished = {\url{https://huggingface.co/BananaMind/MiniBananaMind-V1}}
}

Dataset: HuggingFaceFW/fineweb-edu
