An Age of AI Enlightenment

Discovery is the skill of creating and detecting deviations between expectation and observation, then forming hypotheses, testing them, and refining ideas in light of both past and new evidence. A deviation matters only when a precise framework of expectation reveals the gap and generates a general, testable hypothesis.
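As a cartoon of that loop, here is a minimal sketch in Python; the measurement stream, the thresholds, and the planted outlier are all invented for illustration. A running model of expectation is refined by routine evidence, and observations that deviate sharply from it are flagged as candidates for hypotheses.

```python
import random

def discovery_loop(observe, n_steps=2000, z_threshold=4.0):
    """Toy loop: refine a running expectation, flag strong deviations."""
    mean, var, n, anomalies = 0.0, 1.0, 0, []
    for _ in range(n_steps):
        x = observe()
        sd = max(var, 1e-12) ** 0.5          # guard against a degenerate start
        if n > 30 and abs(x - mean) / sd > z_threshold:
            anomalies.append(x)              # a gap worth forming a hypothesis about
        else:
            n += 1                           # refine the expectation with new evidence
            delta = x - mean
            mean += delta / n
            var += (delta * (x - mean) - var) / n
    return mean, anomalies

# An invented measurement stream: routine noise plus a rare outlier.
def noisy_source():
    return 25.0 if random.random() < 0.01 else random.gauss(0.0, 1.0)

baseline, found = discovery_loop(noisy_source)
print(f"baseline ~ {baseline:.2f}; anomalies flagged: {len(found)}")
```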

Creating Deviation

Today’s AIs are powerful problem solvers, optimized to excel at tasks with well-defined solutions, as demonstrated in competitions like the International Mathematical Olympiad (IMO) and the International Collegiate Programming Contest (ICPC). But discovery demands the opposite orientation: seeking out anomalies, which by nature live in the long tail. Current training regimes push models to suppress anomalies rather than create them. Pretraining and supervised fine-tuning (SFT) align models to the observed data distribution, and RL tends to reward mode-seeking on tasks where correctness is well defined. The result is an LLM reinforced for fitting expectations, not for exposing where those expectations break.
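To make mode-seeking concrete, here is a toy calculation with invented numbers. It uses the standard closed-form optimum of the KL-regularized RL objective, π*(a) ∝ p_ref(a) · exp(r(a)/β): as the KL penalty weakens, probability mass drains out of the long tail and piles onto the single highest-reward answer.

```python
import numpy as np

# Invented toy answer space: a reference model (pretraining/SFT) with a
# long tail, and a verifiable reward favoring one correct answer.
p_ref = np.array([0.50, 0.30, 0.15, 0.04, 0.01])   # tail items at the end
r     = np.array([1.00, 0.20, 0.10, 0.00, 0.00])

# Optimum of  max_pi E_pi[r] - beta * KL(pi || p_ref)  is the tilt
# pi*(a) ~ p_ref(a) * exp(r(a) / beta); small beta => mode-seeking.
for beta in (1.0, 0.2, 0.05):
    w = p_ref * np.exp(r / beta)
    pi = w / w.sum()
    print(f"beta={beta:<4}  pi={np.round(pi, 3)}  tail mass={pi[-2:].sum():.6f}")
```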

RL with an objective to discover new things requires a verifiable definition of what counts as a meaningful anomaly. In fields like math or coding, such definitions can be elusive or subjective. Physical science, however, provides a clearer foundation. A material that superconducts above liquid nitrogen temperature (77 K) is an anomaly; a strong magnet without rare-earth elements is an anomaly; a light yet strong alloy is an anomaly. In each case, the extraordinary property itself defines the deviation from expectation, and verification comes directly from experiment.
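A sketch of what such a reward could look like; the measurement fields and thresholds below are hypothetical stand-ins (77 K, the boiling point of liquid nitrogen, is the one real number):

```python
from dataclasses import dataclass

@dataclass
class Measurement:
    """Hypothetical experimental readout for a proposed material."""
    critical_temp_K: float        # superconducting transition temperature
    contains_rare_earth: bool
    coercivity_kA_per_m: float    # proxy for permanent-magnet strength
    strength_to_weight: float     # specific strength of an alloy

def anomaly_reward(m: Measurement) -> float:
    """The extraordinary property itself defines the deviation; an
    experiment supplies the verification. Thresholds are illustrative."""
    reward = 0.0
    if m.critical_temp_K > 77.0:                       # above liquid nitrogen
        reward += m.critical_temp_K - 77.0
    if not m.contains_rare_earth and m.coercivity_kA_per_m > 800.0:
        reward += 100.0                                # rare-earth-free magnet
    if m.strength_to_weight > 250.0:                   # light yet strong alloy
        reward += 100.0
    return reward
```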

Framework of Expectation

Consequential anomalies rarely present themselves in full view; discoveries often begin with subtler, less visible deviations. To detect them—and to propose a unifying hypothesis—we need a framework of expectation that makes weak signals stand out, built across multiple levels of abstraction so evidence from different perspectives reinforces a single coherent story.

As an example, the atomic hypothesis gained credibility not from a single observation but from the convergence of many. Phenomena as varied as gas laws, chemical reactions, Brownian motion, and optical scattering all became more intelligible once atoms were assumed. Each anomaly, limited in scope on its own, reinforced the others across different domains. Together, they converged on a single, consistent underlying reality: the existence of atoms.
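One way to make that convergence precise is Bayesian: if the lines of evidence are roughly independent, their likelihood ratios multiply, so log-odds add. A toy calculation with invented likelihood ratios shows how several individually weak anomalies overwhelm a skeptical prior:

```python
import math

# Invented likelihood ratios P(evidence | atoms) / P(evidence | no atoms);
# each observation is only weak evidence on its own.
likelihood_ratios = {
    "gas laws": 3.0,
    "fixed chemical proportions": 4.0,
    "Brownian motion": 5.0,
    "optical scattering": 2.0,
}

prior_odds = 1.0 / 9.0                      # a skeptical prior, P(atoms) = 0.1
log_odds = math.log(prior_odds)
for name, lr in likelihood_ratios.items():
    log_odds += math.log(lr)                # independent evidence: log-odds add
    posterior = 1.0 / (1.0 + math.exp(-log_odds))
    print(f"after {name:28s} P(atoms) = {posterior:.3f}")
```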

Physical science offers a playground with well-defined levels of abstraction, from electronic to atomic to continuum to device. That hierarchy lets us formalize priors at multiple scales and demand cross-scale consistency in an interpretable way. A system for discovery can therefore maintain parallel models, from quantum simulations to lab experiments, and any hypothesis must stand the test across all of these layers. Mastery across all scales is difficult for one person, but may be possible for AI.
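As a minimal sketch of that cross-scale requirement, assuming hypothetical per-scale checks that stand in for real simulators and lab tests, a hypothesis survives only if every layer accepts it:

```python
from typing import Callable, Dict

Check = Callable[[dict], bool]

def cross_scale_consistent(hypothesis: dict, checks: Dict[str, Check]) -> bool:
    """A hypothesis survives only if every scale's model accepts it."""
    return all(check(hypothesis) for check in checks.values())

# Illustrative stand-ins for quantum simulation, atomistic simulation,
# continuum modeling, and a device-level lab test.
checks = {
    "electronic": lambda h: h.get("band_gap_eV", -1) >= 0,
    "atomic":     lambda h: h.get("formation_energy_eV", 1) < 0,   # stable
    "continuum":  lambda h: h.get("yield_strength_MPa", 0) > 200,
    "device":     lambda h: h.get("passes_lab_test", False),
}

candidate = {"band_gap_eV": 1.1, "formation_energy_eV": -0.3,
             "yield_strength_MPa": 350, "passes_lab_test": True}
print(cross_scale_consistent(candidate, checks))   # True
```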

Closing Thought

Science advances through contradiction, sometimes requiring us to pull apart entire frameworks. As we train AI to accelerate science, science will enlighten AI. Finding a new superconductor would be a huge win, but one day these systems may surprise us with the kind of paradigm shift that rewrites the rules entirely.

Acknowledgment

I used GPT for drafting and line editing; the ideas and any errors are my own. The atomic hypothesis example was inspired by Ben Recht’s post: Les Atomes.