YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

bitserial-modmul โ€” learned modular multiplication

Submission for the SAIR Modular Arithmetic Challenge. One shared, p-conditioned recurrent cell applied in a fixed bit-serial (Horner) loop computes (a * b) mod p. The cell learns the per-step transition s' = (2*s + d*x) mod p, including the modular wrap; the loop only sequences bits. Answers are emitted as base-2 digits and the harness decoder reconstructs the integer.

Local evaluation

modchallenge evaluate, 1100 problems, secret-seed unseen primes (tiers 2+):

  • Tiers 1-7: exact-match 1.00 each.
  • highest_tier_above_90 = 7, overall_accuracy = 0.706.
  • Identical on two independent seeds.

Provenance

The capability is in the trained parameters: randomizing the weights collapses every solved tier to 0.00 (overall 0.006). No symbolic-math libraries, no big-integer modular arithmetic in Python, no lookup tables. The reduction and the multiplication are performed by the trained cell; the Python loop performs no arithmetic (only bit indexing and feeding the cell). The static-analysis check passes. Each preprocessing hook sees only its own argument.

The cell is a ~471K-parameter bidirectional GRU. It was trained from random initialization on one-step modular transitions (modulus bit-length stratified, wrap-boundary cases oversampled), with lr warmup + cosine decay. A single L=256 cell covers tiers 1-7. See manifest.json for the full architecture and training summary.

Interface

  • entry_class: model.BitSerialReducer
  • output_base: 2
  • Files: model.py, manifest.json, weights.pt.

Limitation (honest)

This model passes the random-operand benchmark but is not exact. On structured inputs (powers of two and other long doubling chains) the per-step reduction drifts for some primes beyond about 500 steps, reproducing the Neural GPU limitation (Price, Zaremba, Sutskever 2016). The benchmark tiers reflect average-case accuracy on the official scorer's random-operand distribution, not worst-case exactness of the underlying modular-multiplication operator.

Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support