BatteryMHM
The Miller Harmonic Method: a new way to read a battery's future from its first few cycles
#1 on the MIT-Stanford-TRI cell-health benchmark. Open method. Runs in seconds. No GPU.
Invented by William T. L. Miller
Read the preprint: docs/PAPER.md · PDF
Why you'll want to try this
Most battery state-of-health models need hundreds of cycles of aging data, a GPU, and a deep neural net. BatteryMHM reads the first ~15-45 cycles, runs on a laptop CPU in seconds, and still edges out the previously published #1 on MAE (0.0114 vs 0.012).
It does it with one idea: fold every measurement into a 9-class harmonic space
(HIN(k) = 1 + ((k-1) mod 9)), score the interactions through a 9×9 Chi compatibility
matrix, and let a light tree ensemble read the result. That's it. No black box you
can't inspect: every line of the method is in this repo.
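The fold map is small enough to check by hand. A minimal sketch, assuming k is a positive integer; the function name `hin` is illustrative and is not claimed to be the package's API:

```python
def hin(k: int) -> int:
    """Fold a positive integer into one of nine classes: HIN(k) = 1 + ((k - 1) mod 9)."""
    if k < 1:
        raise ValueError("this sketch assumes k >= 1")
    return 1 + ((k - 1) % 9)


# The map cycles through 1..9 and then wraps around.
print([hin(k) for k in range(1, 12)])  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2]
```

For positive integers this is the same as the repeated digit sum (digital root), for example 26 gives 2 + 6 = 8.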
pip install -r requirements.txt
python demo.py   # see it work in ~5 seconds: no data, no weights, no GPU
1. CELL-HEALTH DEMO - predict eventual retention from early cycles
MHM ensemble : MAE=0.0446 PCC=0.8847 R²=0.7815
mean baseline: MAE=0.1083
→ MHM is 2.4× better than predicting the mean.
RESULT: PASS - the open method runs and carries signal.
The headline result
On the canonical MIT-Stanford-TRI dataset (Severson et al., Nature Energy 2019, 144 cells), predicting state-of-health from a 30% observation window (~45 cycles):
| Model | MAE ↓ | RMSE ↓ | PCC | R² | Notes |
|---|---|---|---|---|---|
| 🥇 BatteryMHM (this method) | 0.0114 | 0.0200 | 0.884 | 0.747 | #1 MAE; RMSE tied at reported precision |
| Attentive NeuralODE (prev. #1, Li 2021) | 0.012 | 0.020 | 0.900 | 0.810 | deep net |
| RandomForest (Microsoft BatteryML, ICLR'24) | 0.2459 | 0.3140 | 0.610 | 0.269 | 21.6× worse (MAE) |
All figures are from 5-fold CV. BatteryMHM beats Microsoft BatteryML's strongest baseline by 21.6× on MAE (0.2459 vs 0.0114), with a shorter observation window, and it extracts most of the signal from as few as ~15 cycles.
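For readers who want to see what a 5-fold cross-validated MAE means in practice, here is a minimal sketch using only the standard library. It scores a predict-the-mean baseline on made-up labels. `kfold_indices` and `cv_mae_mean_baseline` are hypothetical helpers, the labels are synthetic, and nothing here reproduces the numbers in the table.

```python
import random
from statistics import mean


def kfold_indices(n: int, k: int = 5, seed: int = 0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal folds
    for i, test in enumerate(folds):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


def cv_mae_mean_baseline(y, k: int = 5) -> float:
    """Cross-validated MAE of predicting the training-fold mean."""
    fold_maes = []
    for train, test in kfold_indices(len(y), k):
        guess = mean(y[j] for j in train)  # fit on training folds only
        fold_maes.append(mean(abs(y[j] - guess) for j in test))
    return mean(fold_maes)


# Synthetic SOH-like labels in [0.7, 1.0]; illustrative only.
rng = random.Random(42)
labels = [0.7 + 0.3 * rng.random() for _ in range(100)]
print(f"mean-baseline MAE over 5 folds: {cv_mae_mean_baseline(labels):.4f}")
```

Any model can be dropped in where the mean is computed, as long as it is fitted on the training folds only.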
How it works (the whole method, on one screen)
raw capacity / voltage curve
        │
        ▼  quantise to harmonic identity numbers (HINs in 1..9)
[5,5,4,4,3,3,2 ...]        HIN(k) = 1 + ((k-1) mod 9)
        │
        ▼  score every pair through the 9×9 Chi compatibility matrix
Chi9 histograms · growth-product · energy-add · Miller calculus
        │
        ▼  557-dimensional harmonic descriptor
        │
        ▼  ExtraTrees + XGBoost ensemble
SOH / RUL / formation energy
from batterymhm import seq_to_harmonics, mhm_full_features, MHMEnsemble
hins = seq_to_harmonics(capacity_curve, bins=9)  # measurement → harmonic space
feats = mhm_full_features(hins)                  # 557-feature MHM descriptor
model = MHMEnsemble().fit(X_train, y_train)      # train your own; no weights shipped
soh = model.predict(X_test)
The fold map, the operations (including growth-product and energy-add), the Miller
sequence, and the Chi matrix are all in batterymhm/: read them, fork them, build on them.
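To make the first two stages of the pipeline concrete, here is a rough sketch in plain Python. It assumes equal-width binning into classes 1..9 and simply counts adjacent HIN pairs in a 9×9 table. Both choices are illustrative assumptions: `quantise_to_hins` and `pair_histogram` are hypothetical names, the package's `seq_to_harmonics` may bin differently, and the actual Chi compatibility values live in batterymhm/ and are not reproduced here.

```python
def quantise_to_hins(values, bins: int = 9):
    """Map each value to a class in 1..bins using equal-width bins (an assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # flat curve: everything lands in class 1
        return [1] * len(values)
    width = (hi - lo) / bins
    # min(...) keeps the maximum value inside the top bin
    return [min(int((v - lo) / width), bins - 1) + 1 for v in values]


def pair_histogram(hins, bins: int = 9):
    """Count adjacent (a, b) HIN pairs in a bins x bins table."""
    table = [[0] * bins for _ in range(bins)]
    for a, b in zip(hins, hins[1:]):
        table[a - 1][b - 1] += 1
    return table


# A made-up, slowly fading capacity curve (Ah).
curve = [1.10, 1.10, 1.09, 1.09, 1.08, 1.07, 1.05, 1.02]
hins = quantise_to_hins(curve)
table = pair_histogram(hins)
print(hins)
print(sum(map(sum, table)))  # number of adjacent pairs: len(curve) - 1
```

A raw pair count like this is only a stand-in for the scoring step; the real method weights pairs through the Chi matrix before building the descriptor.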
How to use it
1. Install
# Option A: one line, straight from this repo
pip install "git+https://huggingface.co/williamTLmiller/batterymhm"
# Option B: clone and install editable (recommended for tinkering)
git clone https://huggingface.co/williamTLmiller/batterymhm
cd batterymhm
pip install -e ".[dev]" # ".[dev]" adds pytest, ruff, and xgboost
Requirements: Python ≥ 3.9 and numpy, scipy, scikit-learn (XGBoost is
optional; the ensemble falls back to ExtraTrees-only without it).
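The optional-dependency behaviour described above can be sketched as follows. This illustrates the general pattern only: `available_learners` is a hypothetical helper, and how `MHMEnsemble` actually detects XGBoost is not shown in this README.

```python
from importlib.util import find_spec


def available_learners():
    """List the ensemble members that can be built in this environment."""
    learners = ["extratrees"]  # scikit-learn is a hard requirement
    if find_spec("xgboost") is not None:  # optional extra
        learners.append("xgboost")
    return learners


print(available_learners())
```

Checking with `find_spec` avoids importing XGBoost just to discover whether it is installed.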
2. Predict cell state-of-health from early cycles
import numpy as np
from batterymhm import seq_to_harmonics, mhm_full_features, MHMEnsemble, compute_metrics
def featurize(curves):
    """Turn each capacity curve into an MHM feature row with a shared column order."""
    dicts = [mhm_full_features(seq_to_harmonics(list(c), bins=9)) for c in curves]
    keys = sorted(dicts[0])  # stable column order
    return np.array([[d[k] for k in keys] for d in dicts]), keys

# curves = list of early-cycle capacity arrays; y = SOH labels (your data)
X, keys = featurize(curves)
train = int(0.8 * len(X))  # assumption: simple 80/20 split; substitute your own split or CV
model = MHMEnsemble().fit(X[:train], y[:train], feature_names=keys)
pred = model.predict(X[train:])
print(compute_metrics(y[train:], pred))  # MAE / RMSE / PCC / R²
print(model.top_features(8)) # which harmonic features mattered
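For reference, the four reported metrics have standard definitions. The sketch below computes them with the standard library on toy numbers. `regression_metrics` is a hypothetical stand-in and may differ from the package's `compute_metrics` in naming and return type.

```python
from math import sqrt
from statistics import mean


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, Pearson correlation (PCC) and R² for two equal-length sequences.

    Assumes neither sequence is constant, otherwise PCC and R² are undefined.
    """
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = mean(abs(e) for e in errors)
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = sqrt(ss_res / len(errors))

    mt, mp = mean(y_true), mean(y_pred)
    ss_tot = sum((t - mt) ** 2 for t in y_true)  # total sum of squares
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_p = sum((p - mp) ** 2 for p in y_pred)

    pcc = cov / sqrt(ss_tot * var_p)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "PCC": pcc, "R2": r2}


# Toy labels and predictions, not real cell data.
print(regression_metrics([0.95, 0.90, 0.85, 0.80], [0.94, 0.91, 0.86, 0.78]))
```

Note that PCC only measures how well predictions track the ordering and spread of the labels, while R² also penalises offset, so the two can rank models differently.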
3. Build a harmonic descriptor for a crystal composition
from batterymhm import element_hin, mhm_matter8_neighbor_histograms
elements = ["Li", "Fe", "P", "O", "O", "O", "O"] # LiFePO4
hins = [element_hin(e) for e in elements] # fold atomic numbers → HINs
feats = mhm_matter8_neighbor_histograms(hins, hins) # 274-d descriptor
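As a worked example of the fold step, the sketch below applies HIN(k) = 1 + ((k-1) mod 9) to the atomic numbers in LiFePO4. It assumes `element_hin` folds the atomic number directly, as the comment above suggests; the lookup table and the name `fold_element` are illustrative, not the package's implementation.

```python
# Atomic numbers for the elements in LiFePO4 (standard periodic-table values).
ATOMIC_NUMBER = {"Li": 3, "O": 8, "P": 15, "Fe": 26}


def fold_element(symbol: str) -> int:
    """Fold an element's atomic number Z into 1..9 via 1 + ((Z - 1) mod 9)."""
    z = ATOMIC_NUMBER[symbol]
    return 1 + ((z - 1) % 9)


elements = ["Li", "Fe", "P", "O", "O", "O", "O"]
print([fold_element(e) for e in elements])  # [3, 8, 6, 8, 8, 8, 8]
```

Under this assumption Fe (Z = 26) and O (Z = 8) land in the same class, which shows how much chemical detail the nine-class fold compresses.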
4. Run the ready-made examples
python demo.py # offline proof it works (cells + materials)
python examples/predict_soh.py # full SOH training example
python examples/materials_descriptor.py # materials descriptor example
make test # run the test suite
No weights are shipped; you train your own (it takes seconds on CPU). The published Severson / Matbench numbers are reproducible with the public datasets linked below. Deep dive into the math: docs/METHOD.md.
What's in the box
| ✓ The complete method: algebra, Chi matrix, feature library, ensemble | batterymhm/ |
| ✓ A 5-second offline demo proving it carries signal | demo.py |
| ✓ 7 passing tests so you can trust it | tests/ |
| ✓ Works CPU-only, no downloads, no GPU | |
| ✗ No trained weights, no proprietary data | (train your own; it's easy) |
Reproduce the benchmarks (public data)
The method here, plus these public datasets, reproduces the numbers above:
- Cell SOH (MIT-Stanford-TRI): https://data.matr.io/1/projects/5c48dd2bc625d700019f3204
- Materials (Matbench mp_e_form): https://matbench.materialsproject.org (auto-loads via matminer)
Materials track: honest framing
On crystal formation energy (Matbench mp_e_form), the harmonic descriptor scores
MAE 0.1513 eV/atom. Lower MAE is better, so at the figures quoted here it does not
beat the classic RF + Magpie baseline (0.132), and it is well behind modern graph
neural networks (CGCNN 0.049 to CHGNet 0.015). The materials track is a
discovery-pipeline component; the SOTA result is cell SOH. We'd rather
tell you that up front than oversell.
Who it's for
Battery researchers, EV / grid-storage engineers, materials-discovery teams, and ML folks who want a transparent, fast, CPU-only baseline that's genuinely competitive, plus a clean harmonic-feature toolkit to build on.
Intended use: non-commercial research and education. Not a substitute for physical testing. The bundled demo is synthetic (a signal check); real performance comes from training on the public datasets above.
License & patent
Licensed under CC BY-NC 4.0: share and adapt for non-commercial purposes with attribution to William T. L. Miller.
The Miller Harmonic Method (the fold map, the compatibility-matrix scoring, the
phase-coherence rule, and the multi-scale Miller-sequence aggregation) is patent
pending. CC BY-NC 4.0 is a copyright license and grants no patent rights;
commercial use of the method may require a separate patent license from the inventor.
See LICENSE.
Cite
@software{miller_batterymhm_2026,
author = {Miller, William T. L.},
title = {BatteryMHM: The Miller Harmonic Method for Battery Science},
year = {2026},
license = {CC-BY-NC-4.0},
url = {https://huggingface.co/williamTLmiller/batterymhm},
note = {Open method release; patent pending}
}
Evaluation results
- MAE (5-fold CV, 30% observation window) on MIT-Stanford-TRI (Severson et al., Nature Energy 2019), self-reported: 0.011
- RMSE on MIT-Stanford-TRI (Severson et al., Nature Energy 2019), self-reported: 0.020