MicroStoryBananaMind-V1
Have you ever seen a model that is only 100 KB and can still somehow generate English? Well, this is it!
MicroStoryBananaMind-V1 is a tiny character-level story model trained on TinyStories-style text.
It uses a GRU with:
- only 500K parameters
- a character-level vocabulary
- packed ternary weights
- CPU NumPy inference through trust_remote_code
The model stores its weights in a BitNet-style ternary format to stay extremely compact.
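To make the size figures concrete, here is a minimal sketch of one way ternary weights can be packed. It assumes five weights per byte, stored as base-3 digits (3^5 = 243 fits in one byte); this layout is an assumption for illustration, not the repository's documented format, and `pack_ternary` and `unpack_ternary` are hypothetical helper names. Under that assumption, 500K weights come out to 100,000 bytes, which lines up with the 100 KB figure.

```python
import numpy as np

def pack_ternary(weights: np.ndarray) -> np.ndarray:
    """Pack values from {-1, 0, +1} into bytes, five weights per byte.

    Assumed layout (illustrative only): each byte holds five base-3 digits.
    """
    digits = (weights.astype(np.int8).ravel() + 1).astype(np.uint8)  # map to {0, 1, 2}
    pad = (-digits.size) % 5                       # pad up to a multiple of five
    digits = np.concatenate([digits, np.zeros(pad, dtype=np.uint8)])
    powers = np.array([1, 3, 9, 27, 81], dtype=np.uint8)
    # Largest possible byte value is 2 * (1 + 3 + 9 + 27 + 81) = 242.
    return (digits.reshape(-1, 5) * powers).sum(axis=1).astype(np.uint8)

def unpack_ternary(packed: np.ndarray, count: int) -> np.ndarray:
    """Invert pack_ternary and return the first `count` weights as int8."""
    values = packed.astype(np.int16)
    digits = np.empty((values.size, 5), dtype=np.int8)
    for i in range(5):                             # peel off base-3 digits
        digits[:, i] = values % 3
        values = values // 3
    return digits.ravel()[:count] - 1              # map back to {-1, 0, +1}

rng = np.random.default_rng(0)
w = rng.integers(-1, 2, size=500_000).astype(np.int8)  # random stand-in weights
packed = pack_ternary(w)
restored = unpack_ternary(packed, w.size)
print(packed.nbytes)  # 100000
```

A simpler 2-bits-per-weight layout would need 125,000 bytes for the same weights, so the denser base-3 packing is the one that matches the stated size here.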
Usage
Install dependencies:
pip install transformers numpy huggingface_hub
Load from Hugging Face:
from transformers import AutoModel

# trust_remote_code=True runs the custom model code shipped in the repository.
model = AutoModel.from_pretrained(
    "BananaMind/MicroStoryBananaMind-V1",
    trust_remote_code=True,
)

text = model.generate_text(
    prompt="Once upon a time, there was a cat",
    max_new_tokens=500,
    temperature=0.65,
    top_k=25,
    repetition_penalty=1.08,
    seed=42,
)
print(text)
Local usage:
from transformers import AutoModel

# Load from a local copy of the repository instead of the Hub.
model = AutoModel.from_pretrained(
    "./MicroStoryBananaMind-V1",
    trust_remote_code=True,
)

print(model.generate_text(
    prompt="Once upon a time",
    max_new_tokens=300,
    temperature=0.65,
    top_k=25,
))
Model details
This is a compact experimental model.
It is character-level, so it generates one character at a time.
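For readers wondering what generating one character at a time with `temperature`, `top_k`, and `repetition_penalty` usually involves, here is a rough sketch of a single sampling step over a character vocabulary. The function name `sample_next` and the exact penalty rule are assumptions based on common practice; the model's own `generate_text` may do this differently.

```python
import numpy as np

def sample_next(logits, history, rng, temperature=0.65, top_k=25,
                repetition_penalty=1.08):
    """Pick the next character index from raw logits (hypothetical helper)."""
    logits = np.asarray(logits, dtype=np.float64).copy()

    # Repetition penalty: make characters that were already emitted less likely.
    for idx in set(history):
        if logits[idx] > 0:
            logits[idx] /= repetition_penalty
        else:
            logits[idx] *= repetition_penalty

    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    logits /= temperature

    # Top-k: keep only the k highest-scoring characters.
    if 0 < top_k < logits.size:
        cutoff = np.partition(logits, -top_k)[-top_k]
        logits[logits < cutoff] = -np.inf

    # Numerically stable softmax, then draw one index.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(logits.size, p=probs))

rng = np.random.default_rng(42)                 # fixed seed for repeatable output
demo_logits = np.array([0.1, 2.0, -1.0, 0.5])   # made-up scores for 4 characters
next_index = sample_next(demo_logits, history=[], rng=rng, top_k=2)
```

In a full generation loop, the chosen index would be appended to `history`, mapped back to its character, and fed into the model to produce the next set of logits.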
Samples
Once upon a time, there was a cats sad. He said she saw the dinner and smiling. She said, "Yes is a strong. I snats the big sunstafe.
The squirrow had learned that it shapes and said goodbyes and did not sunsund. They said he smiled and said, "Yes!
Once upon a time there was a little girl named Timmy and said no a sunseard toget
Once upon a time there was a shiny little girl named Misinn.
"What is that?" she says.
"His big story is dinner lickly to snaugh the string. The dragon smiled nead that the bird bitef slides some sunshine. She said "Thank you, thank you, Lily. You have some new sunself. I didn't see that shaters is not here so sp
- Once upon a time, there was a little boy named Timmy, Timmy was playing with his favorite toy that it was many funny. The mom said so snowed that she stopped that showes be smiles. He said, "Ghank!"
Lily smiled and said, "This disalshable slishad was scared.
Herself that day! I should share her eyes and stay some trass to see what shrong that is surprise!"
The sun was scared to be more n
For comparison, the banner image of this model card is 1.2 MB and the model is 100 KB, which makes the model 12x smaller than a single image.
Only the weights are ternary, though: the activations, hidden state, gate math, scales, biases, and accumulators are FP32.
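As an illustration of how ternary weights and FP32 arithmetic can fit together, here is a minimal sketch of one GRU step in NumPy. It assumes a standard GRU formulation with one FP32 scale per weight matrix and a single bias vector; the gate ordering, the scale granularity, and the name `gru_step` are assumptions, not details taken from the model's code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, W, U, b, scale_w, scale_u):
    """One GRU step with ternary weights and FP32 math (illustrative sketch).

    W: (3H, I) int8 in {-1, 0, +1}, input weights
    U: (3H, H) int8 in {-1, 0, +1}, recurrent weights
    b: (3H,) float32 bias; scale_w, scale_u: FP32 scales (assumed per-matrix)
    """
    H = h.size
    # Ternary weights are expanded to FP32; accumulation happens in FP32.
    gx = scale_w * (W.astype(np.float32) @ x) + b
    gh = scale_u * (U.astype(np.float32) @ h)
    z = sigmoid(gx[:H] + gh[:H])                  # update gate
    r = sigmoid(gx[H:2 * H] + gh[H:2 * H])        # reset gate
    n = np.tanh(gx[2 * H:] + r * gh[2 * H:])      # candidate state
    return (1 - z) * n + z * h                    # new hidden state

rng = np.random.default_rng(0)
vocab, hidden = 64, 128                           # hypothetical sizes
W = rng.integers(-1, 2, size=(3 * hidden, vocab)).astype(np.int8)
U = rng.integers(-1, 2, size=(3 * hidden, hidden)).astype(np.int8)
b = np.zeros(3 * hidden, dtype=np.float32)

x = np.zeros(vocab, dtype=np.float32)
x[5] = 1.0                                        # one-hot input character
h = np.zeros(hidden, dtype=np.float32)
h = gru_step(x, h, W, U, b, scale_w=0.05, scale_u=0.05)
print(h.shape, h.dtype)  # (128,) float32
```

Because a ternary weight is only ever -1, 0, or +1, each matrix product reduces to adding and subtracting inputs, and the FP32 scale restores the magnitude afterwards.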
