MicroStoryBananaMind-V1

Have you ever seen a model that is only 100KB and can still somehow generate English? Well, this is it!

MicroStoryBananaMind-V1 is a tiny character-level story model trained on TinyStories-style text.

It uses a GRU with:

Only 500K parameters
A character-level vocabulary
Packed ternary weights
CPU NumPy inference through trust_remote_code

The model stores its weights in a BitNet-style ternary format to stay extremely compact.
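
To give a feel for why ternary weights are so small, here is a minimal NumPy sketch of one possible packing scheme. It is an assumption for illustration, not the layout this model actually uses: since 3^5 = 243 fits in one byte, five weights from {-1, 0, +1} can share a byte (1.6 bits per weight). Under that assumption 500K weights would take 100,000 bytes, which lines up with the 100KB figure, though the card does not say which layout is really used. The function names `pack_ternary` and `unpack_ternary` are hypothetical.

```python
import numpy as np

# Hypothetical helpers: base-3 packing, five ternary weights per byte.
# This is NOT taken from the model's code; it only illustrates the idea.

def pack_ternary(weights):
    """Pack values in {-1, 0, +1} into uint8, five values per byte."""
    trits = np.asarray(weights, dtype=np.int8).ravel() + 1  # map to {0, 1, 2}
    pad = (-trits.size) % 5                                  # pad to a multiple of 5
    trits = np.concatenate([trits, np.zeros(pad, dtype=np.int8)])
    groups = trits.reshape(-1, 5).astype(np.uint16)
    powers = 3 ** np.arange(5, dtype=np.uint16)              # [1, 3, 9, 27, 81]
    return (groups * powers).sum(axis=1).astype(np.uint8)    # max value is 242


def unpack_ternary(packed, count):
    """Recover the first `count` ternary values from packed bytes."""
    values = np.asarray(packed).astype(np.uint16)[:, None]
    powers = 3 ** np.arange(5, dtype=np.uint16)
    trits = (values // powers) % 3                           # base-3 digits
    return trits.ravel()[:count].astype(np.int8) - 1         # back to {-1, 0, +1}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.integers(-1, 2, size=500_000).astype(np.int8)
    packed = pack_ternary(w)
    print(packed.nbytes, "bytes for", w.size, "weights")
```

A simpler alternative is 2 bits per weight (four per byte), which is easier to decode with bit shifts but would need 125,000 bytes for the same weight count.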

Usage

Install dependencies:

pip install transformers numpy huggingface_hub

Load from Hugging Face:

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "BananaMind/MicroStoryBananaMind-V1",
    trust_remote_code=True,
)

text = model.generate_text(
    prompt="Once upon a time, there was a cat",
    max_new_tokens=500,
    temperature=0.65,
    top_k=25,
    repetition_penalty=1.08,
    seed=42,
)

print(text)

Local usage:

from transformers import AutoModel

model = AutoModel.from_pretrained(
    "./MicroStoryBananaMind-V1",
    trust_remote_code=True,
)

print(model.generate_text(
    prompt="Once upon a time",
    max_new_tokens=300,
    temperature=0.65,
    top_k=25,
))
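
The local example assumes the repository files are already in `./MicroStoryBananaMind-V1`. One way to get them there is `snapshot_download` from `huggingface_hub`, sketched below. The helper name `download_model` is hypothetical, and the target directory simply mirrors the path used above.

```python
from pathlib import Path


def download_model(local_dir="./MicroStoryBananaMind-V1"):
    """Download the model repository into `local_dir` and return its path."""
    # Imported lazily so this helper can be defined even before
    # huggingface_hub is installed.
    from huggingface_hub import snapshot_download

    path = snapshot_download(
        repo_id="BananaMind/MicroStoryBananaMind-V1",
        local_dir=local_dir,
    )
    return Path(path)


if __name__ == "__main__":
    print(download_model())
```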

Model details

This is a compact experimental model.

The model is character-level, so it generates one character at a time.
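
Generating one character at a time means each step turns a vector of per-character scores (logits) into a single sampled character id. The sketch below shows how `temperature`, `top_k`, and `repetition_penalty` are commonly applied in that step. It is an assumption about typical behavior, not a copy of what `generate_text` does internally, and the function name `sample_next_char` is hypothetical.

```python
import numpy as np

# Hypothetical sampler: a common way to combine these settings,
# not necessarily how this model's generate_text implements them.

def sample_next_char(logits, generated_ids, temperature=0.65, top_k=25,
                     repetition_penalty=1.08, rng=None):
    """Sample one character id from a 1-D array of logits."""
    rng = np.random.default_rng() if rng is None else rng
    logits = np.array(logits, dtype=np.float32)  # work on a copy

    # Repetition penalty: push down characters that were already generated.
    seen = np.unique(np.asarray(generated_ids, dtype=np.int64))
    if seen.size:
        vals = logits[seen]
        logits[seen] = np.where(vals > 0,
                                vals / repetition_penalty,
                                vals * repetition_penalty)

    # Temperature: values below 1 sharpen the distribution.
    logits = logits / max(temperature, 1e-6)

    # Top-k: keep only the k highest-scoring characters.
    k = max(1, min(top_k, logits.size))
    keep = np.argpartition(logits, -k)[-k:]
    masked = np.full_like(logits, -np.inf)
    masked[keep] = logits[keep]

    # Softmax in float64 for a numerically clean probability vector.
    masked = masked.astype(np.float64)
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(logits.size, p=probs))


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    fake_logits = rng.normal(size=64)  # stand-in for real model output
    print(sample_next_char(fake_logits, generated_ids=[3, 7], rng=rng))
```

A full generation loop would call a function like this once per character, append the result to `generated_ids`, and feed it back into the model.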

Samples

Once upon a time, there was a cats sad. He said she saw the dinner and smiling. She said, "Yes is a strong. I snats the big sunstafe.

The squirrow had learned that it shapes and said goodbyes and did not sunsund. They said he smiled and said, "Yes!

Once upon a time there was a little girl named Timmy and said no a sunseard toget

Once upon a time there was a shiny little girl named Misinn.

"What is that?" she says.

"His big story is dinner lickly to snaugh the string. The dragon smiled nead that the bird bitef slides some sunshine. She said "Thank you, thank you, Lily. You have some new sunself. I didn't see that shaters is not here so sp

Once upon a time, there was a little boy named Timmy, Timmy was playing with his favorite toy that it was many funny. The mom said so snowed that she stopped that showes be smiles. He said, "Ghank!"

Lily smiled and said, "This disalshable slishad was scared.

Herself that day! I should share her eyes and stay some trass to see what shrong that is surprise!"

The sun was scared to be more n

For comparison, the banner image of this model card is 1.2MB and the model is 100KB, so the model is 12x smaller than a single image.

Note that only the weights are ternary; the activations, hidden state, gate math, scales, biases, and accumulators are FP32.
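
The split between ternary weights and FP32 math can be illustrated with a single GRU step in NumPy. This is a sketch under stated assumptions rather than the model's real code: it uses the standard GRU equations with gates stacked in the order update, reset, candidate, one FP32 scale per weight matrix, and a single bias vector. The real gate order, scale granularity, and input encoding are not given in this card, and the name `gru_step` is hypothetical.

```python
import numpy as np

# Hypothetical GRU step: ternary int8 weights, everything else in FP32.
# Gate layout and per-matrix scales are assumptions for illustration.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x, h, W, U, b, w_scale, u_scale):
    """Advance the hidden state by one step.

    x: (I,) FP32 input (for example an embedded or one-hot character)
    h: (H,) FP32 hidden state
    W: (3H, I) int8 weights in {-1, 0, +1}
    U: (3H, H) int8 weights in {-1, 0, +1}
    b: (3H,) FP32 bias
    w_scale, u_scale: FP32 scales applied after the ternary matmuls
    """
    H = h.shape[0]
    # Ternary matmuls accumulate in FP32, then get rescaled.
    gx = w_scale * (W.astype(np.float32) @ x) + b
    gh = u_scale * (U.astype(np.float32) @ h)

    z = sigmoid(gx[:H] + gh[:H])                  # update gate
    r = sigmoid(gx[H:2 * H] + gh[H:2 * H])        # reset gate
    n = np.tanh(gx[2 * H:] + r * gh[2 * H:])      # candidate state
    return ((1.0 - z) * n + z * h).astype(np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H, I = 8, 4
    W = rng.integers(-1, 2, size=(3 * H, I)).astype(np.int8)
    U = rng.integers(-1, 2, size=(3 * H, H)).astype(np.int8)
    b = np.zeros(3 * H, dtype=np.float32)
    h = np.zeros(H, dtype=np.float32)
    x = rng.normal(size=I).astype(np.float32)
    print(gru_step(x, h, W, U, b, w_scale=0.1, u_scale=0.1))
```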
