Instructions to use ethicalabs/Kurtis-EON1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ethicalabs/Kurtis-EON1 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="ethicalabs/Kurtis-EON1")

    # Load model directly
    from transformers import AutoModel

    model = AutoModel.from_pretrained("ethicalabs/Kurtis-EON1", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- vLLM
How to use ethicalabs/Kurtis-EON1 with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "ethicalabs/Kurtis-EON1"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "ethicalabs/Kurtis-EON1",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker
    docker model run hf.co/ethicalabs/Kurtis-EON1
- SGLang
How to use ethicalabs/Kurtis-EON1 with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "ethicalabs/Kurtis-EON1" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "ethicalabs/Kurtis-EON1",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "ethicalabs/Kurtis-EON1" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "ethicalabs/Kurtis-EON1",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
- Docker Model Runner
How to use ethicalabs/Kurtis-EON1 with Docker Model Runner:
docker model run hf.co/ethicalabs/Kurtis-EON1
A quick update on the development of Kurtis-EON1 (Echo-DSRN)
I am currently finalizing the pre-print paper detailing the O(1) memory footprint, the "infinite" context extrapolation, and our training curriculum on AMD hardware (MI300X/Strix Halo).
Because I want to ensure absolute transparency, I am releasing a "Work in Progress" draft of the paper for early peer review within my network before the final arXiv submission.
I also want to make a crucial clarification regarding the model architecture: Echo-DSRN is not a derivative of Google's Titans architecture.
It is a continuation of RNN experiments I began back in 2016 for pure text generation, and the foundational architecture was in the works long before the current linear-RNN renaissance. The base model pre-training framework is based on a codebase that is more than 10 years old, rewritten in pure PyTorch.
Where Google's Titans comes into play is strictly as an inspiration for the surprise-based gating mechanism.
Their research provided a highly elegant framing for using auto-predictive error (surprise) to gate memory updates, which we integrated into our existing dual-state recurrent blocks to prevent the deep state from vanishing.
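To make the idea of surprise-gated memory updates concrete, here is a minimal toy sketch in plain Python. It is not the Echo-DSRN implementation: the function name `surprise_gated_step`, the use of the fast state as the "prediction", and the `decay`, `sharpness`, and `threshold` values are all assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def surprise_gated_step(fast, deep, x, decay=0.5, sharpness=4.0, threshold=0.5):
    """One toy update of a two-state recurrent cell.

    `fast`, `deep`, and `x` are equal-length lists of floats.
    Every name and hyperparameter here is hypothetical, not taken from Echo-DSRN.
    """
    # Toy prediction of the incoming input: the fast state itself.
    # The surprise signal is the RMS prediction error.
    error = math.sqrt(sum((xi - fi) ** 2 for xi, fi in zip(x, fast)) / len(x))

    # Gate in (0, 1): small for well-predicted inputs, close to 1 for surprising ones.
    gate = sigmoid(sharpness * (error - threshold))

    # Fast state: leaky average of the inputs, updated on every step.
    new_fast = [decay * fi + (1.0 - decay) * xi for fi, xi in zip(fast, x)]

    # Deep state: rewritten only in proportion to the surprise gate.
    new_deep = [(1.0 - gate) * di + gate * nf for di, nf in zip(deep, new_fast)]
    return new_fast, new_deep, gate


# A well-predicted input barely touches the deep state; a surprising one rewrites it.
_, deep_calm, gate_calm = surprise_gated_step([0.0, 0.0], [1.0, 1.0], [0.0, 0.0])
_, deep_jolt, gate_jolt = surprise_gated_step([0.0, 0.0], [1.0, 1.0], [3.0, 3.0])
print(gate_calm, deep_calm)
print(gate_jolt, deep_jolt)
```

In this toy version the deep state is mostly preserved while inputs are predictable and is updated strongly only when the prediction error is large, which is the general behavior the paragraph above describes.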
Echo-DSRN is a native PyTorch, O(1) recurrent engine built from the ground up. The goal is to build something that can run entirely on consumer edge devices without relying on massive GPU clusters or the Transformer's KV-cache.
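As a rough illustration of why a fixed-size recurrent state is O(1) in sequence length while a Transformer KV-cache grows with every token, the sketch below counts stored values for each. The function names and the layer, head, and dimension values are illustrative assumptions, not Echo-DSRN's actual configuration.

```python
def recurrent_state_floats(num_tokens, num_layers, state_dim):
    """Values held by a fixed-size recurrent state after `num_tokens` steps.

    The state is overwritten in place at each step, so the count
    does not depend on `num_tokens`.
    """
    return num_layers * state_dim


def kv_cache_floats(num_tokens, num_layers, num_heads, head_dim):
    """Values held by a standard multi-head attention KV-cache.

    One key and one value vector per head, per layer, per token.
    """
    return 2 * num_layers * num_heads * head_dim * num_tokens


# Hypothetical sizes, chosen only to show the scaling behavior.
for n in (1_000, 100_000):
    print(
        n,
        recurrent_state_floats(n, num_layers=12, state_dim=1024),
        kv_cache_floats(n, num_layers=12, num_heads=16, head_dim=64),
    )
```

The recurrent count stays constant as the context grows, while the cache count scales linearly with the number of tokens.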
The base weights remain withheld while we complete the SFT/DPO alignment phases, but I will be sharing the draft paper with select researchers shortly.
Thank you, everyone!
I removed references to Google Titans to prevent IP misattribution.
Echo-DSRN is an independent dual-state recurrent architecture tracing back to foundational experiments from 2016, not a Google derivative or wrapper.
Titans (surprise-gating), xLSTM (parallel scan), and Hymba (RMSNorm) will be formally cited as mechanistic inspirations and related works in the upcoming arXiv pre-print.
This ensures accurate community tracking of the base model's lineage.