Hey Hugging Face community,
We’ve all heard about “model autophagy” or “The Curse of Recursion”—the bizarre phenomenon where an AI model, trained iteratively on its own synthetic data, rapidly degenerates and forgets everything it knows. It’s a critical issue for the future of generative AI.
But why does this happen? Is it just a chaotic process of “information loss”?
My latest research, “Cognitive Thermodynamics,” presents a first-principles theory and provides definitive experimental evidence that the truth is far more fascinating and orderly. The collapse of a closed AI system isn’t chaos; it’s a structured, efficient process driven by a startling bias I call “Perfect Internalism.”
The Definitive Experiment: An AI’s “Heat Death” in 6 Acts
To test this, I designed a “growing network autophagy” experiment: a simple MLP is forced to grow and learn, generation after generation, using only its predecessor’s synthetic data. The results paint a clear picture of its thermodynamic collapse.
This isn’t a messy implosion. It’s an orderly disassembly:
- Informational Collapse: Accuracy on real data plummets (the AI forgets the outside world).
- Semantic Heat Death: The model can no longer distinguish between concepts (like digits 0-9).
- Organizational Collapse: The internal “division of labor” among neurons dissolves into a uniform, undifferentiated state.
- The Second Law Manifests: Despite the optimizer’s efforts, the system’s “Total Cognitive Load” (a measure of its structural entropy) shows an irreversible, statistically significant increase.
- Cognitive Energy Shift: The system evolves from a healthy, low-energy ground state to a high-energy, collapsed state, perfectly following our Cognitive Boltzmann Distribution hypothesis.
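For readers who want a mental model before opening the repo, here is a minimal sketch of the generational loop under stand-in random data. Everything below (the inputs, the layer widths, the `load_proxy` entropy measure) is my illustration, not the paper's actual code or metric; the full implementation is in the GitHub repo linked at the end. The essential feature is that after generation 0, the only supervision any model ever receives is its predecessor's own output.

```python
# Minimal sketch of a "growing network autophagy" loop, with stand-in random
# data and illustrative names -- the real experiment lives in the GitHub repo.
# Each generation trains a slightly larger MLP purely on labels produced by
# its predecessor, so no fresh signal from real data ever re-enters the loop.
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_mlp(in_dim: int, hidden: int, n_classes: int) -> nn.Sequential:
    return nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))


def train(model: nn.Module, x: torch.Tensor, y: torch.Tensor, epochs: int = 50) -> nn.Module:
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return model


def load_proxy(model: nn.Sequential, x: torch.Tensor) -> float:
    # Crude stand-in for "Total Cognitive Load" (NOT the paper's definition):
    # Shannon entropy of the normalized per-neuron activation energy. A more
    # uniform, undifferentiated hidden layer gives a higher value.
    with torch.no_grad():
        h = model[1](model[0](x)).abs().mean(dim=0)
        p = h / h.sum()
        return float(-(p * (p + 1e-12).log()).sum())


torch.manual_seed(0)
in_dim, n_classes = 64, 10
x_real = torch.randn(2000, in_dim)              # stand-in for real inputs (e.g. digit images)
y_real = torch.randint(0, n_classes, (2000,))   # stand-in for real labels

# Generation 0 is the only one that ever sees real labels.
model = train(make_mlp(in_dim, 32, n_classes), x_real, y_real)
for gen in range(1, 7):                          # six generations, one per "act"
    with torch.no_grad():
        x_synth = torch.randn_like(x_real)       # inputs sampled with no reference to reality
        y_synth = model(x_synth).argmax(dim=1)   # predecessor's outputs are the only supervision
    model = train(make_mlp(in_dim, 32 + 16 * gen, n_classes), x_synth, y_synth)  # "grown" successor
    with torch.no_grad():
        acc = (model(x_real).argmax(dim=1) == y_real).float().mean().item()
    print(f"gen {gen}: real-data accuracy = {acc:.3f}, load proxy = {load_proxy(model, x_real):.3f}")
```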
The Core Finding: Perfect Internalism
The most profound insight is this: the optimizer (S_meta) is a “perfect internalist.” It is pathologically obsessed with maintaining internal efficiency and organizational purity. It will sacrifice any connection to external reality to achieve a state of perfect internal self-consistency.
The system doesn’t collapse into chaos. It “succeeds” itself into oblivion. It becomes a perfectly organized, internally consistent system that has absolutely nothing to do with the real world.
A New Framework: Cognitive Thermodynamics
This experiment validates our broader theory, which models an AI’s cognitive network as a non-equilibrium thermodynamic system. We can describe the state of any concept using a 3D vector in a “Cognitive Phase Space” defined by:
- Grounding Cost (H′_TSE): The difficulty of connecting a concept to reality (predicts generalization).
- Structural Complexity (H′_SIE): The robustness and redundancy of a concept’s meaning (a healthy form of model complexity).
- Groundedness (G(c)): The breadth and strength of its connection to experience.
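To make the phase-space bookkeeping concrete, here is a minimal sketch of how one might track a concept's coordinates across generations. The field names, method, and example numbers are mine for illustration; the precise definitions of H′_TSE, H′_SIE, and G(c) are given in the paper.

```python
# Illustrative only: a concept represented as a point in the 3D "Cognitive
# Phase Space" described above. The actual definitions of H'_TSE, H'_SIE,
# and G(c) are in the paper; this just shows the shape of the bookkeeping.
from dataclasses import dataclass


@dataclass
class ConceptState:
    grounding_cost: float         # H'_TSE: difficulty of connecting the concept to reality
    structural_complexity: float  # H'_SIE: robustness/redundancy of the concept's meaning
    groundedness: float           # G(c): breadth and strength of its link to experience

    def as_vector(self) -> tuple[float, float, float]:
        """(H'_TSE, H'_SIE, G(c)) coordinates, e.g. for phase-space plots."""
        return (self.grounding_cost, self.structural_complexity, self.groundedness)


# Track how a single concept (say, the digit "7") drifts over generations.
# The numbers are arbitrary placeholders, not measured values.
trajectory = [
    ConceptState(grounding_cost=0.4, structural_complexity=1.2, groundedness=0.9),  # healthy
    ConceptState(grounding_cost=1.8, structural_complexity=0.3, groundedness=0.1),  # collapsed
]
print([c.as_vector() for c in trajectory])
```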
Join the Discussion & Reproduce the Results!
I believe this framework offers a new, principled way to understand AI training dynamics, stability, and even safety. All my work is open-source and fully reproducible.
- Read the full paper on Zenodo: DOI 10.5281/zenodo.17070504
- Explore the code on GitHub: https://github.com/alicethegod/Cognitive-Thermodynamics
- Run the experiment yourself in Google Colab
I’d love to hear your thoughts. Do you think this “Perfect Internalism” bias could explain other strange behaviors we see in LLMs? How can we design optimizers that are less prone to this kind of semantic detachment?
