WFGY 3.0 “Singularity Demo”: one TXT, 131 S-class tests (with 1.0 / 2.0 recap)

OneStarDao · February 3, 2026, 6:39pm

Hi everyone,

Some of you might have seen my WFGY repo on GitHub or the Hugging Face forum before, or small discussions about “tension metrics” for LLMs.

I wanted to share a compact recap of WFGY 1.0 and 2.0 first, and then quietly point to WFGY 3.0, which is now live in the same main repo as a text-only “Singularity Demo”.

This is not a product launch, more like: here is a reproducible framework, if you want to stress-test your own models or prompts.

WFGY 1.0 – PDF experiment, treat an LLM like a self-healing system

WFGY 1.0 started as a ~30-page PDF called “All Principles Return to One”.

The idea was to treat an LLM as something that can “self-heal” at the text level, without touching weights, by running a loop of four modules (BBMC, BBPF, BBCR, BBAM) on top of it.

We tested on ten benchmarks (MMLU, GSM8K, BBH, MathBench, TruthfulQA, etc.).

Very rough numbers, just to give a sense of scale:

MMLU accuracy: baseline ~68.2% → with full WFGY 1.0 ~91.4%
GSM8K accuracy: baseline ~45.3% → with WFGY 1.0 ~84.0%
mean time-to-failure in long runs: baseline 1.0× → around 3.6×
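
To make the scale of those rough numbers easier to read, here is a small sketch that turns the two accuracy pairs quoted above into an absolute gain (percentage points) and a relative gain. The helper name `gains` is only for illustration; the inputs are the approximate figures from this post, nothing new.

```python
def gains(baseline: float, treated: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = treated - baseline
    relative = 100.0 * absolute / baseline
    return absolute, relative


# Approximate figures quoted in the post: (baseline %, with WFGY 1.0 %).
reported = {"MMLU": (68.2, 91.4), "GSM8K": (45.3, 84.0)}

for name, (base, wfgy) in reported.items():
    abs_pts, rel_pct = gains(base, wfgy)
    print(f"{name}: +{abs_pts:.1f} points ({rel_pct:.0f}% relative)")
```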

The PDF came with logs and a DOI-style report.
The point of 1.0 was simply: show that a pure “text overlay” can actually move stability and reasoning in a measurable way, not just as a nice-sounding story.

WFGY 2.0 – Core Flagship + 16-problem checklist for RAG / agents

In WFGY 2.0, I moved from “PDF theory” to something people can plug into daily work.

Two main pieces:

The core was compressed into a single tension metric
delta_s = 1 − cos(I, G) with four zones: safe / transit / risk / danger.
(I = intention, G = generated behavior at that step.)
On top of that, I started the WFGY ProblemMap and a 16-problem list,
focused on real-world failures: RAG retrieval issues, vector store drift, prompt injection, bad deployment order, etc.
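
A minimal sketch of the tension metric, assuming I and G are already available as plain numeric vectors (for example, embeddings of the intention and of the generated step). How those vectors are produced is not specified here, and the zone boundaries below are placeholder values picked only to make the example run; they are not the thresholds used in WFGY 2.0.

```python
import math


def delta_s(intention: list[float], generated: list[float]) -> float:
    """Compute delta_s = 1 - cos(I, G) for two equal-length vectors."""
    if len(intention) != len(generated):
        raise ValueError("I and G must have the same dimension")
    dot = sum(a * b for a, b in zip(intention, generated))
    norm_i = math.sqrt(sum(a * a for a in intention))
    norm_g = math.sqrt(sum(b * b for b in generated))
    if norm_i == 0.0 or norm_g == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return 1.0 - dot / (norm_i * norm_g)


# HYPOTHETICAL upper bounds for the first three zones (assumption, not from WFGY).
ZONE_BOUNDS = [(0.25, "safe"), (0.50, "transit"), (0.75, "risk")]


def zone(ds: float) -> str:
    """Map a delta_s value to one of the four zone names."""
    for upper, name in ZONE_BOUNDS:
        if ds < upper:
            return name
    return "danger"


if __name__ == "__main__":
    ds = delta_s([1.0, 0.0], [0.0, 1.0])  # orthogonal vectors
    print(ds, zone(ds))
```

Since cosine similarity lies in [-1, 1], delta_s lies in [0, 2]: 0 means I and G point the same way, 2 means they point in opposite directions.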

In practice, a lot of people used that 16-problem list as a debugging checklist:

when your RAG or agent behavior looks wrong,
you map it to one of the 16 failure types,
then apply the suggested fix or guardrail.

All of this is still in the repo.
If you mostly care about RAG, vector DBs, eval and debugging, WFGY 2.0 + the 16-problem checklist is probably the most directly useful part.

WFGY 3.0 – “Singularity Demo” as a TXT pack for LLMs

Now the new part.

WFGY 3.0 is now online in the same main repo (I didn’t open a new one).
The formal name is:

WFGY 3.0 · Singularity Demo

Very conservative description of what 3.0 is:

It packages the Tension Universe / BlackHole layer as 131 “S-class” problems.
It is not a paper, but a TXT pack designed to be read by LLMs.
It is meant as a public, reproducible way to test how far this framework can go, across many domains, using only text.

In this post I will not dive into all the math or internals of 3.0.
If you are curious, I would actually prefer that you experience it directly and decide for yourself, instead of reading my whole explanation.

How to run the 3.0 TXT demo with an LLM

The TXT is just a plain text file in the repo.
You can use any LLM that supports file upload (HF Inference Endpoints / Spaces, ChatGPT, local models, etc).

Basic flow:

download the WFGY 3.0 Singularity Demo TXT pack from the repo
start a fresh chat / session with your LLM and upload the TXT
after it loads, type run
it will print a small menu; choose go
let it finish the short demo, and observe how it handles the 131 S-class tension problems

The TXT is fully public and readable as plain text.
If you are worried about prompt injection or safety, you can:

open the TXT in any editor and read it first, or
ask an LLM to statically analyze the text (what it is trying to do, what it will not do), before you actually “run” it.
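
If you want a quick mechanical pass before reading the file yourself, something like the sketch below can flag lines worth a closer look. The patterns are hypothetical examples of things one might check in any prompt file (instruction-override phrasing, URLs, mentions of a system prompt); they are not a complete injection detector and are not derived from the WFGY TXT.

```python
import re

# HYPOTHETICAL patterns, chosen only for illustration.
PATTERNS = {
    "override": re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    "url": re.compile(r"https?://", re.IGNORECASE),
    "system_prompt": re.compile(r"system\s+prompt", re.IGNORECASE),
}


def scan_text(text: str) -> list[tuple[int, str, str]]:
    """Return (line number, pattern label, stripped line) for every match."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label, line.strip()))
    return findings
```

To use it on the downloaded pack, read the file as UTF-8 text and pass the contents to `scan_text`, then review each flagged line by hand. A clean result only means none of these particular patterns matched.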

I pushed 3.0 straight into the same main repo that already has ~1.3k stars, so all my past “credit” is basically sitting on top of this TXT now.

How to use WFGY depending on your interests

Very roughly:

If you like formulas, graphs and benchmarks → start with the WFGY 1.0 PDF.
If you are fighting with RAG / vector DB / agent / deployment issues → use WFGY 2.0 Core + the 16-problem list in ProblemMap as a debugging / design checklist.
If you want to see 131 S-class problems as a stress test for your LLM → download the WFGY 3.0 Singularity Demo TXT, upload it to an LLM, and go through the run → go sequence.

If this direction looks interesting to you:

you can fork / star the repo,
or just steal the parts that fit your own pipeline (for example, the RAG problem list or the tension metric).

If it sounds suspicious or too ambitious:

you are very welcome to treat it purely as a test object,
try to break it, falsify it, or show where it collapses.

For me, the best outcome is not “everyone believes the framework”.
The best outcome is: after enough public stress tests,
whatever remains inside WFGY 3.0 is the part that actually survived contact with real users and real models.

WFGY 1.0 to 3.0 all live in the same main repo.
