Abstract/Context:
I am an independent researcher (Exorobourii LLC) seeking an endorsement to submit my latest paper to the cs.AI (Artificial Intelligence) or cs.LG (Machine Learning) category on arXiv.
In Project Janus Part I, I demonstrated that “Vector Space Homeostasis” (VSM), a technique that enforces orthogonal regularization, improved logical coherence in 40M-parameter models. In this paper (Part II), I move from behavioral metrics to causal evidence, using Sparse Autoencoders (SAEs) to decompose the model’s internal representations.
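For concreteness, below is a minimal sketch of the kind of orthogonality penalty VSM implies (PyTorch; the function name, tensor layout, and loss weighting are illustrative assumptions, not the exact formulation from Part I):

    import torch

    def orthogonality_penalty(head_outputs: torch.Tensor) -> torch.Tensor:
        # head_outputs: (num_heads, tokens, head_dim) per-head activations.
        # Normalize each head's activation matrix so the penalty is scale-invariant.
        norms = torch.linalg.norm(head_outputs, dim=(1, 2), keepdim=True)
        flat = head_outputs / (norms + 1e-8)
        # Head-to-head similarity (Gram) matrix: (num_heads, num_heads).
        gram = torch.einsum('hnd,gnd->hg', flat, flat)
        # Penalize off-diagonal overlap; self-similarity on the diagonal is exempt.
        off_diag = gram - torch.diag(torch.diagonal(gram))
        return off_diag.pow(2).sum()

    # Hypothetical training usage:
    # loss = task_loss + lambda_vsm * orthogonality_penalty(head_outputs)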
Key Findings:
Mechanistic Validation: We observed a 60.3% reduction in inter-head correlation, confirming that VSM physically disentangles representation subspaces (a measurement sketch appears in the first code block after this list).
The “Sparsity Crossover”: SAE analysis revealed a novel “Filter-then-Pack” strategy: adaptive models are significantly sparser than controls in early layers (noise filtering) but denser in deep layers (semantic packing); the second block below sketches how feature density is measured.
Load-Bearing Heads: Ablation studies identified the emergence of “Super-Heads” that carry 2.54x the functional importance of control heads, marking a shift from redundant syntactic scaffolding to distinct semantic binding (the zero-mask procedure is sketched in the third block below).
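To make the first finding concrete, here is one way an inter-head correlation statistic could be computed; the exact estimator in the paper may differ, and head_acts is an assumed per-head scalar summary over a probe set:

    import torch

    def mean_inter_head_correlation(head_acts: torch.Tensor) -> float:
        # head_acts: (num_heads, num_samples), e.g. each head's activation
        # norm on each probe example.
        corr = torch.corrcoef(head_acts)                    # (num_heads, num_heads)
        mask = ~torch.eye(corr.shape[0], dtype=torch.bool)  # drop self-correlation
        return corr[mask].abs().mean().item()

    # Comparing control vs. adaptive checkpoints on the same probe set yields the
    # relative reduction: 1.0 - adaptive_corr / control_corr.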
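The sparsity crossover can be read off a per-layer feature-density curve. A sketch of that measurement follows, assuming a generic SAE encoder callable (sae_encode is a placeholder interface, not the workbench API):

    import torch

    @torch.no_grad()
    def feature_density(sae_encode, activations: torch.Tensor,
                        threshold: float = 1e-6) -> float:
        # sae_encode: maps residual activations (tokens, d_model) to feature
        # activations (tokens, n_features) -- placeholder interface.
        feats = sae_encode(activations)
        active = (feats.abs() > threshold).float()  # binary firing indicator
        return active.mean().item()                 # fraction of features firing

    # "Filter-then-Pack" shows up as lower density than control in early layers
    # and higher density in deep layers for the adaptive model.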
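And the zero-mask ablation behind the Super-Head result, in sketch form: zero one head's output slice via a forward hook and record the loss delta. The module path and the HF-style .loss interface are assumptions about the setup, not the workbench code:

    import torch

    @torch.no_grad()
    def head_importance(model, attn_module, head_idx: int, head_dim: int,
                        batch: dict) -> float:
        # Baseline loss; assumes the forward pass returns an object with a
        # .loss field -- adapt to the actual model interface.
        base_loss = model(**batch).loss.item()

        def zero_head(module, inputs, output):
            out = output[0] if isinstance(output, tuple) else output
            # Heads assumed concatenated along the last dimension.
            out[..., head_idx * head_dim:(head_idx + 1) * head_dim] = 0.0
            return output

        handle = attn_module.register_forward_hook(zero_head)
        ablated_loss = model(**batch).loss.item()
        handle.remove()
        # Large deltas mark load-bearing ("Super") heads.
        return ablated_loss - base_loss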
Reproducibility & Artifacts:
I believe strongly in “Glass Box” research. To that end, I have released the “Janus Mechanistic Workbench,” which includes:
Model Checkpoints: Full PyTorch state dictionaries for both the Janus-v3-Control and Janus-v3-Adaptive models (a minimal loading sketch follows this list).
Telemetry Data: The raw CSV outputs from our topology analysis, zero-mask ablation studies, and SAE feature-density metrics.
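A hypothetical loading sketch (file names are illustrative; consult the workbench release for the actual paths and model class):

    import torch
    import pandas as pd

    # Checkpoints: plain PyTorch state dicts (names illustrative).
    control_sd = torch.load("janus_v3_control.pt", map_location="cpu")
    adaptive_sd = torch.load("janus_v3_adaptive.pt", map_location="cpu")
    # model = JanusModel(config)          # model class ships with the workbench
    # model.load_state_dict(adaptive_sd)

    # Telemetry: raw CSV outputs (names illustrative).
    ablation = pd.read_csv("zero_mask_ablation.csv")
    density = pd.read_csv("sae_feature_density.csv")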
