Open to Collab

s3nh PRO

AI & ML interests

Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh

Recent Activity

reacted to codelion's post with 🔥 about 11 hours ago

Scaling Pedagogical Pre-training to 10 Billion Tokens

New blog post exploring what happens when you take optimal data mixing insights and scale up the data generation itself.

We built Sutra, a multi-stage framework for generating pedagogical pre-training data guided by a knowledge graph of ~2,000 concepts across 9 domains. The pipeline includes structured content generation, six-dimension quality evaluation, diversity management across 20 content styles, and a cleaning stage to prevent collapse.

The result is https://huggingface.co/datasets/codelion/sutra-10B, a 10.2 billion token pedagogical dataset with rich metadata (domain, complexity, prerequisites, quality scores) on every entry.

We trained https://huggingface.co/codelion/SmolLM2-70M on it for 3 full epochs (30.6B tokens) on a single A10 GPU in ~78 hours.

Key finding: perplexity kept improving across epochs, but benchmark gains plateaued fast. At 70M parameters, the model hits a representational ceiling that more data alone can't break through.

Full writeup with comparisons against 7 other datasets, detailed benchmark breakdowns, and connections to recent work on synthetic data scaling, curriculum learning, and data mixing laws: https://huggingface.co/blog/codelion/scaling-pedagogical-pretraining-10-billion-tokens

All datasets at multiple scales (10M, 100M, 1B, 10B) plus seed concepts and an SFT variant are in the Sutra Pedagogical Datasets collection.
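The training figures quoted in the post can be sanity-checked with a little arithmetic. The sketch below is only an illustration: the function name `training_budget` is hypothetical, and the inputs are the rounded values from the post (10.2B tokens, 3 epochs, ~78 hours, 70M parameters), so the outputs are rough estimates rather than measured numbers.

```python
def training_budget(dataset_tokens: int, epochs: int,
                    wall_clock_hours: float, n_params: int) -> dict:
    """Derive rough training-budget figures from headline numbers.

    All inputs are assumed to be the rounded values quoted in the post,
    not exact measurements.
    """
    total_tokens = dataset_tokens * epochs      # tokens seen over all epochs
    seconds = wall_clock_hours * 3600           # wall-clock time in seconds
    return {
        "total_tokens": total_tokens,
        "tokens_per_second": total_tokens / seconds,
        "tokens_per_parameter": total_tokens / n_params,
    }


# Rounded figures from the post: 10.2B-token dataset, 3 epochs,
# ~78 hours on one GPU, ~70M-parameter model.
budget = training_budget(
    dataset_tokens=10_200_000_000,
    epochs=3,
    wall_clock_hours=78,
    n_params=70_000_000,
)

print(f"total tokens:         {budget['total_tokens'] / 1e9:.1f}B")
print(f"tokens per second:    {budget['tokens_per_second']:,.0f}")
print(f"tokens per parameter: {budget['tokens_per_parameter']:.0f}")
```

Under these assumptions the total matches the 30.6B tokens stated in the post, which works out to roughly 109k tokens per second and about 437 training tokens per parameter.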

reacted to blanchon's post with ❤️ 3 days ago

I'm releasing OpenCS2, an 11TB dataset of around 5000 hours of Counter-Strike gameplay recordings.

- HD resolution: 1280×720 · 32 fps
- For each frame: keyboard and mouse input plus world state (player position, velocity, weapon ...)
- HD stereo audio
- All 10 players' perspectives

https://huggingface.co/collections/blanchon/opencs2

liked a model 3 days ago

Pranavz/MythoMax-L2-13b-heretic

liked 2 models 3 days ago

Pranavz/MythoMax-L2-13b-heretic

13B • Updated 10 days ago • 41 • 1

unsloth/Qwen3.6-27B-MTP-GGUF

Image-Text-to-Text • 27B • Updated about 19 hours ago • 185k • 209

liked a model 4 days ago

unsloth/Qwen3.6-35B-A3B-MTP-GGUF

Image-Text-to-Text • 36B • Updated about 19 hours ago • 181k • 189

liked 2 models 5 days ago

Malum0x/mlp-surgery-restored-top30

3B • Updated 10 days ago • 92 • 2

Malum0x/mlp-surgery-restored-specificity-top10

3B • Updated 10 days ago • 16 • 1

liked 3 models 12 days ago

SulphurAI/Sulphur-2-base

Text-to-Video • 9B • Updated 8 days ago • 970k • 1.04k

poolside/Laguna-XS.2

Text Generation • 33B • Updated 9 days ago • 46.4k • 255

Xerv-AI/MAXWELL

Text Generation • 2B • Updated 12 days ago • 660 • 5

liked 3 models 26 days ago

stamsam/FrankenGemma4

Text Generation • 1B • Updated 27 days ago • 200 • 4

YoAbriel/KodaLite-1.3B

Text Generation • 1B • Updated 13 days ago • 341 • 3

tencent/HY-Embodied-0.5

Image-Text-to-Text • 4B • Updated Apr 14 • 1.92k • 906

liked a model about 1 month ago

netflix/void-model

Video-to-Video • Updated Apr 6 • 925

liked 3 models about 2 months ago

HuggingFaceTB/SmolLM3-3B

Text Generation • 3B • Updated Sep 10, 2025 • 229k • 953

HauhauCS/Qwen3.5-4B-Uncensored-HauhauCS-Aggressive

4B • Updated Apr 5 • 186k • 337

RoyalCities/Foundation-1

Updated Mar 16 • 333

liked 2 models 3 months ago

ysong21/entropy-v1-fp8

Text Generation • 27B • Updated Feb 18 • 41 • 6

raincandy-u/Rain-v2

Text Generation • 0.1B • Updated Mar 19 • 11 • 5

liked a model 4 months ago

raincandy-u/Rain-100M

Text Generation • 97.2M • Updated Jan 24 • 27 • 18

liked a Space 4 months ago

HeartMuLa

A Family of Open Sourced Music Foundation Models

liked a model 4 months ago

codelion/dhara-70m

Text Generation • 71.3M • Updated Dec 30, 2025 • 208 • 48