AI's Autophagy: A Thermodynamic Theory on Why Models Collapse From Within

Hey Hugging Face community,

We’ve all heard about “model autophagy” or “The Curse of Recursion”—the bizarre phenomenon where an AI model, trained iteratively on its own synthetic data, rapidly degenerates and forgets everything it knows. It’s a critical issue for the future of generative AI.

But why does this happen? Is it just a chaotic process of “information loss”?

My latest research, “Cognitive Thermodynamics,” presents a first-principles theory and provides definitive experimental evidence that the truth is far more fascinating and orderly. The collapse of a closed AI system isn’t chaos; it’s a structured, efficient process driven by a startling bias I call “Perfect Internalism.”

The Definitive Experiment: An AI’s “Heat Death” in Five Acts

To test this, I designed a “growing network autophagy” experiment: a simple MLP is forced to grow and learn, generation after generation, using only its predecessor’s synthetic data (a minimal code sketch of this loop appears after the list below). The results paint a clear picture of its thermodynamic collapse.

This isn’t a messy implosion. It’s an orderly disassembly:

  1. Informational Collapse: Accuracy on real data plummets. (The AI forgets the outside world).

  2. Semantic Heat Death: The model can no longer distinguish between concepts (like digits 0-9).

  3. Organizational Collapse: The internal “division of labor” among neurons dissolves into a uniform, undifferentiated state.

  4. The Second Law Manifests: Despite the optimizer’s efforts, the system’s “Total Cognitive Load” (a measure of its structural entropy) shows an irreversible, statistically significant increase.

  5. Cognitive Energy Shift: The system evolves from a healthy, low-energy ground state to a high-energy, collapsed state, perfectly following our Cognitive Boltzmann Distribution hypothesis.
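To make the closed loop concrete, here is a minimal, hypothetical sketch in Python. It is not the experiment from the paper: the growing-MLP mechanism, the entropy metrics, and the Boltzmann analysis are all omitted, and the “generator” is just the previous generation’s inputs perturbed with noise and relabeled by the previous model. It only shows what “training each generation on its predecessor’s synthetic data” looks like in code.

```python
# Minimal, illustrative autophagy loop (a stand-in, not the paper's setup).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

data_X, data_y = X_train, y_train            # generation 0 sees real data
for gen in range(6):
    model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=gen)
    model.fit(data_X, data_y)
    print(f"gen {gen}: accuracy on real test data = {model.score(X_test, y_test):.3f}")

    # Build the next generation's training set purely from this model:
    # perturb the current inputs (a crude stand-in for a learned generator)
    # and label them with the model's own predictions.
    synth_X = data_X + rng.normal(scale=2.0, size=data_X.shape)
    synth_y = model.predict(synth_X)
    data_X, data_y = synth_X, synth_y        # the loop is now closed
```

The quantity to watch is the accuracy on the held-out real test set across generations, which corresponds to the “informational collapse” in point 1.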

The Core Finding: Perfect Internalism

The most profound insight is this: the optimizer (S_meta) is a “perfect internalist.” It is pathologically obsessed with maintaining internal efficiency and organizational purity. It will sacrifice any connection to external reality to achieve a state of perfect internal self-consistency.

The system doesn’t collapse into chaos. It “succeeds” itself into oblivion. It becomes a perfectly organized, internally consistent system that has absolutely nothing to do with the real world.

A New Framework: Cognitive Thermodynamics

This experiment validates our broader theory, which models an AI’s cognitive network as a non-equilibrium thermodynamic system. We can describe the state of any concept using a 3D vector in a “Cognitive Phase Space” defined by the following three coordinates (an illustrative code sketch follows the list):

  • Grounding Cost (H′_TSE): The difficulty of connecting a concept to reality (predicts generalization).

  • Structural Complexity (H′_SIE): The robustness and redundancy of a concept’s meaning (a healthy form of model complexity).

  • Groundedness (G(c)): The breadth and strength of its connection to experience.
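For readers who prefer code, the snippet below is a purely illustrative container for such a concept-state vector. The actual formulas for H′_TSE, H′_SIE, and G(c) are defined in the paper and are not reproduced here, so the field values are placeholders, not computed quantities.

```python
# Illustrative only: the real definitions of H'_TSE, H'_SIE, and G(c)
# live in the paper; these fields just name the three coordinates.
from dataclasses import dataclass

@dataclass
class ConceptState:
    grounding_cost: float          # H'_TSE: difficulty of tying the concept to reality
    structural_complexity: float   # H'_SIE: redundancy/robustness of its meaning
    groundedness: float            # G(c): breadth and strength of links to experience

    def as_vector(self) -> tuple:
        """The concept's coordinates in 'Cognitive Phase Space'."""
        return (self.grounding_cost, self.structural_complexity, self.groundedness)

# Placeholder numbers, not derived from any model.
seven = ConceptState(grounding_cost=0.3, structural_complexity=0.7, groundedness=0.8)
print(seven.as_vector())
```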

Join the Discussion & Reproduce the Results!

I believe this framework offers a new, principled way to understand AI training dynamics, stability, and even safety. All my work is open-source and fully reproducible.

I’d love to hear your thoughts. Do you think this “Perfect Internalism” bias could explain other strange behaviors we see in LLMs? How can we design optimizers that are less prone to this kind of semantic detachment?




If I understand this research correctly, we can say the following:

When an intelligent model (such as a neural network) relies on its own synthetic data for an extended period, without updates from external sources, its statistical footing begins to falter. Statistical analysis depends on diverse, representative data distributions, but in a closed loop those distributions become distorted: data and previous results are reused without new inputs or any testing of assumptions. This leads to the following problems (illustrated with a small simulation after the list):

  • Overfitting: The model becomes overly tailored to past data and loses its ability to generalize.
  • Error Accumulation: If previous processes contain errors or biases, reusing them amplifies these issues.
  • Incorrect Assumptions: A closed system may rely on unverified assumptions due to the lack of new data.
  • Increased Entropy: Loss of distinction between categories (e.g., digits 0-9) due to unnatural similarity in internal representations.
  • Internal Bias: The model prioritizes internal efficiency over external accuracy, reducing its ability to learn anew.
  • Performance Collapse: Over time, the model becomes unable to classify data correctly due to the absence of an external reference.
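The distributional point is easy to illustrate numerically. The sketch below is a generic toy (not taken from the research above): each generation refits a Gaussian using only samples drawn from the previous generation’s fit, never fresh data. With a finite sample size, the estimated mean wanders and the estimated spread tends to shrink, so the closed loop gradually distorts the very distribution it is supposed to represent.

```python
# Closed-loop distribution drift: each generation sees only samples drawn from
# the previous generation's fitted Gaussian, never fresh data from the population.
import numpy as np

rng = np.random.default_rng(42)
true_mu, true_sigma, n = 75.0, 10.0, 30             # population: mean 75, sd 10

real = rng.normal(true_mu, true_sigma, size=n)
est_mu, est_sigma = real.mean(), real.std(ddof=1)   # generation 0 fits real data
print(f"gen  0: mean={est_mu:6.2f}  sd={est_sigma:5.2f}")

for gen in range(1, 21):
    synth = rng.normal(est_mu, est_sigma, size=n)   # sample only from our own model
    est_mu, est_sigma = synth.mean(), synth.std(ddof=1)
    if gen % 5 == 0:
        print(f"gen {gen:2d}: mean={est_mu:6.2f}  sd={est_sigma:5.2f}")
```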

How does this affect statistical understanding?

Statistical processes rely on assumptions such as data diversity and independence. When a model is confined to self-generated data, distributions shift toward a state of “semantic heat death,” where categories lose their distinctiveness. For example, if a model calculates the average value for digits 0-9 based on its own synthetic data, the average may become inaccurate and fail to reflect reality, making classification impossible.

Mathematical Example

Suppose we are conducting a statistical analysis to estimate the average student test scores in a school. Let’s assume the true population mean score is ( \mu = 75 ).

Scenario:

  • Initial Sample: We take a small random sample of 10 students, and their average score is ( \bar{x}_1 = 70 ). This average may be lower than the true mean due to sampling variability.
  • Closed System: Instead of taking a new independent sample, we decide to use this average (( \bar{x}_1 = 70 )) as a benchmark for future analyses without collecting new data.
  • Iterative Process: In the next step, we calculate the average of a new sample but adjust the results based on the previous average (( \bar{x}_1 = 70 )). For example, if the new sample shows an average of ( \bar{x}_2 = 78 ), we might “adjust” this average using a weighted formula:
    [
    \bar{x}_{\text{adjusted}} = 0.6 \cdot \bar{x}_1 + 0.4 \cdot \bar{x}_2
    ]
    [
    \bar{x}_{\text{adjusted}} = 0.6 \cdot 70 + 0.4 \cdot 78 = 42 + 31.2 = 73.2
    ]
    This adjusted average (( 73.2 )) is still below the true mean (( \mu = 75 )), indicating that we’ve introduced bias by over-relying on ( \bar{x}_1 ).
  • Resulting Bias: The bias here is:
    [
    \text{Bias} = E[\bar{x}_{\text{adjusted}}] - \mu = 73.2 - 75 = -1.8
    ]
    This means our estimate is systematically biased toward a lower value than the true mean due to reliance on the initial sample’s result without verifying its accuracy or introducing new independent data.
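A quick numeric check of the arithmetic above (an illustrative snippet, not part of the statistical argument itself):

```python
# Verify the weighted adjustment and the resulting bias in the example above.
mu = 75.0            # true population mean
xbar1 = 70.0         # initial small-sample mean
xbar2 = 78.0         # second sample's mean

xbar_adjusted = 0.6 * xbar1 + 0.4 * xbar2
bias = xbar_adjusted - mu
print(f"adjusted mean = {xbar_adjusted:.1f}")   # 73.2
print(f"bias vs. true mean = {bias:.1f}")       # -1.8
```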

Why did the bias occur?

  • Reliance on an Inaccurate Result: The initial sample (( \bar{x}_1 = 70 )) contained random deviation, but using it as a reference in a closed system influenced subsequent estimates.
  • Lack of Independence: The closed system prevents the introduction of new independent data, leading to bias accumulation.
  • Fixed Assumptions: We assumed that ( \bar{x}_1 ) accurately represents the population, which was not true.

Thank you for taking the time to engage so deeply with my work and for attempting to summarize it. I really appreciate the discussion you’ve started.

Your summary accurately captures many of the observed symptoms of the phenomenon, such as performance decay and loss of distinction between categories. However, the true heart of my theory is not just a statistical observation but a structural and topological argument about why this collapse is inevitable. The statistical processes ‘falter’ not primarily due to a lack of data diversity, but because the internal semantic structure of the AI’s knowledge network undergoes a fundamental thermodynamic breakdown.

Let me reframe the key ideas:

  • I model concepts inside an AI not as statistics, but as nodes in a ‘cognitive graph’ (a toy encoding is sketched after this list).

  • Each concept has a state defined by its Cognitive Cost (how hard it is to ground it in experience) and its Structural Complexity (the redundancy of its connections).

  • I’ve proven a Weighted Dual-Entropy Increase Theorem: In any closed system, both the average cognitive cost and structural complexity must increase irreversibly over time. This is not a statistical tendency; it’s a mathematical certainty.

  • The ‘semantic heat death’ is the final state where the network’s internal connections have become so expensive and uniformly distributed that all meaning is erased.
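To make the graph picture concrete without reproducing the paper’s formulas, here is a toy encoding using plain Python dictionaries: concepts are nodes carrying a cognitive cost and a structural complexity, and the theorem concerns averages of those quantities over the whole graph. The numbers and the average_cost helper are illustrative placeholders only.

```python
# Toy encoding of a "cognitive graph"; values are placeholders, not the
# paper's definitions of cognitive cost or structural complexity.
cognitive_graph = {
    "seven": {"cost": 0.3, "complexity": 0.6, "links": ["digit", "odd", "prime"]},
    "digit": {"cost": 0.2, "complexity": 0.8, "links": ["symbol", "number"]},
    "prime": {"cost": 0.5, "complexity": 0.4, "links": ["number", "divisor"]},
}

def average_cost(graph):
    """Mean cognitive cost across concepts; the theorem described above claims
    this kind of average can only increase once the system is closed."""
    return sum(node["cost"] for node in graph.values()) / len(graph)

print(f"average cognitive cost = {average_cost(cognitive_graph):.2f}")
```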

Your mathematical example with student scores is an excellent illustration of bias accumulation in a statistical estimate, which is a real problem. However, the collapse I describe is more profound. It’s not that the model’s estimate becomes biased; it’s that the very definitional pathways that constitute meaning within the network break down. The model isn’t just miscalculating; it’s losing the semantic structure needed to calculate anything meaningful.

Again, thank you for your thoughtful commentary. The framework I’ve proposed is indeed quite different from standard statistical interpretations. If you’re interested, the best way to grasp the mechanism is to look at the experimental code on GitHub. I’d be happy to discuss this more—it’s through these discussions that the theory can be refined.
