AI & ML interests

Singular Learning Theory & Developmental Interpretability

Timaeus

About Us

Timaeus is a non-profit research organization founded in October 2023, focused on applying singular learning theory (SLT) to AI alignment. Our mission is to make fundamental breakthroughs in technical AI alignment using deep ideas from mathematics and the sciences.

Our Research Focus

We concentrate on understanding the relationship between internal structure in neural networks and the geometry of the loss landscape, as revealed by singular learning theory. This connection provides a basis for developing scalable tools for interpretability, mechanistic anomaly detection, and beyond.

Key Research Areas:

  1. Developmental Interpretability (DevInterp): Applying SLT to interpret the development of structure in neural networks, aiming to identify when, where, and what circuits form.

  2. Structural Generalization (StrucGen): Using SLT to study out-of-distribution generalization, with the goal of building tools to predict how circuits will "break".

  3. Geometry of Program Synthesis (GPS): Applying SLT to study inductive biases, advancing our understanding of how to predict and measure alignment-relevant risks.

Notable Achievements

  • Established developmental interpretability as a concrete application of SLT to alignment
  • Developed new, scalable measurement tools such as the Local Learning Coefficient (LLC)
  • Validated that SLT can make accurate predictions about real-world AI systems
  • Popularized SLT within the AI safety community through conferences, workshops, and collaborations
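
As an illustration of what the LLC measures, here is a minimal sketch on a one-dimensional toy problem. It follows the general form of the estimator in Lau et al. (2023), n·β·(E[L(w)] − L(w*)) with β = 1/log n, but makes simplifying assumptions that are not part of that work: the function name `llc_estimate` is hypothetical, the expectation is computed by grid quadrature rather than by sampling, and localization around w* uses a hard window instead of a prior. It is not Timaeus's implementation.

```python
import numpy as np

def llc_estimate(loss, w_star=0.0, n=10_000, half_width=1.0, grid_points=200_001):
    """Toy 1-D LLC estimate: n * beta * (E_posterior[L] - L(w_star))."""
    # Inverse temperature beta = 1 / log(n), assumed as in Lau et al. (2023).
    beta = 1.0 / np.log(n)
    # Hard window around w_star (an assumption standing in for localization).
    w = np.linspace(w_star - half_width, w_star + half_width, grid_points)
    excess = loss(w) - loss(w_star)
    # Unnormalized tempered posterior evaluated on the uniform grid.
    weights = np.exp(-n * beta * excess)
    expected_excess = np.sum(weights * excess) / np.sum(weights)
    return n * beta * expected_excess

regular = llc_estimate(lambda w: w**2)   # nondegenerate minimum
singular = llc_estimate(lambda w: w**4)  # degenerate (singular) minimum
print(regular, singular)
```

For a loss of the form w^(2k), the learning coefficient is 1/(2k), so the two estimates should come out close to 1/2 and 1/4. The more degenerate minimum gets the lower value, which is the sense in which the LLC quantifies degeneracy.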

Key Publications

  • Quantifying Degeneracy in Singular Models via the Learning Coefficient (Lau et al. 2023)
  • Estimating the Local Learning Coefficient at Scale (Furman and Lau 2024)
  • The Developmental Landscape of In-Context Learning (Hoogland et al. 2024)

Resources

Connect With Us

For collaboration inquiries or more information about our research, please contact us at contact@timaeus.co.
