Open to Collab
90.0 TFLOPS
s3nh (PRO)
255 followers · 110 following
s3nhxx
s3nh
AI & ML interests
Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh
Recent Activity
reacted to codelion's post with 🔥 about 11 hours ago
Scaling Pedagogical Pre-training to 10 Billion Tokens

New blog post exploring what happens when you take optimal data mixing insights and scale up the data generation itself.

We built Sutra, a multi-stage framework for generating pedagogical pre-training data guided by a knowledge graph of ~2,000 concepts across 9 domains. The pipeline includes structured content generation, six-dimension quality evaluation, diversity management across 20 content styles, and a cleaning stage to prevent collapse.

The result is https://huggingface.co/datasets/codelion/sutra-10B, a 10.2 billion token pedagogical dataset with rich metadata (domain, complexity, prerequisites, quality scores) on every entry. We trained https://huggingface.co/codelion/SmolLM2-70M on it for 3 full epochs (30.6B tokens) on a single A10 GPU in ~78 hours.

Key finding: perplexity kept improving across epochs, but benchmark gains plateaued fast. At 70M parameters, the model hits a representational ceiling that more data alone can't break through.

Full writeup with comparisons against 7 other datasets, detailed benchmark breakdowns, and connections to recent work on synthetic data scaling, curriculum learning, and data mixing laws: https://huggingface.co/blog/codelion/scaling-pedagogical-pretraining-10-billion-tokens

All datasets at multiple scales (10M, 100M, 1B, 10B) plus seed concepts and an SFT variant are in the Sutra Pedagogical Datasets collection.
reacted to blanchon's post with ❤️ 3 days ago
I'm releasing OpenCS2, an 11 TB dataset of around 5,000 hours of Counter-Strike gameplay recordings.
- HD resolution: 1280×720 · 32 fps
- For each frame: keyboard and mouse input + world state (player position, velocity, weapon ...)
- HD stereo audio
- All 10 players' perspectives
https://huggingface.co/collections/blanchon/opencs2
liked a model 3 days ago
Pranavz/MythoMax-L2-13b-heretic
s3nh's activity
liked 2 models 3 days ago
Pranavz/MythoMax-L2-13b-heretic • 13B • Updated 10 days ago • 41 • 1
unsloth/Qwen3.6-27B-MTP-GGUF • Image-Text-to-Text • 27B • Updated about 19 hours ago • 185k • 209
liked a model 4 days ago
unsloth/Qwen3.6-35B-A3B-MTP-GGUF • Image-Text-to-Text • 36B • Updated about 19 hours ago • 181k • 189
liked 2 models 5 days ago
Malum0x/mlp-surgery-restored-top30 • 3B • Updated 10 days ago • 92 • 2
Malum0x/mlp-surgery-restored-specificity-top10 • 3B • Updated 10 days ago • 16 • 1
liked 3 models 12 days ago
SulphurAI/Sulphur-2-base • Text-to-Video • 9B • Updated 8 days ago • 970k • 1.04k
poolside/Laguna-XS.2 • Text Generation • 33B • Updated 9 days ago • 46.4k • 255
Xerv-AI/MAXWELL • Text Generation • 2B • Updated 12 days ago • 660 • 5
liked 3 models 26 days ago
stamsam/FrankenGemma4 • Text Generation • 1B • Updated 27 days ago • 200 • 4
YoAbriel/KodaLite-1.3B • Text Generation • 1B • Updated 13 days ago • 341 • 3
tencent/HY-Embodied-0.5 • Image-Text-to-Text • 4B • Updated Apr 14 • 1.92k • 906
liked a model about 1 month ago
netflix/void-model • Video-to-Video • Updated Apr 6 • 925
liked 3 models about 2 months ago
HuggingFaceTB/SmolLM3-3B • Text Generation • 3B • Updated Sep 10, 2025 • 229k • 953
HauhauCS/Qwen3.5-4B-Uncensored-HauhauCS-Aggressive • 4B • Updated Apr 5 • 186k • 337
RoyalCities/Foundation-1 • Updated Mar 16 • 333
liked 2 models 3 months ago
ysong21/entropy-v1-fp8 • Text Generation • 27B • Updated Feb 18 • 41 • 6
raincandy-u/Rain-v2 • Text Generation • 0.1B • Updated Mar 19 • 11 • 5
liked a model 4 months ago
raincandy-u/Rain-100M • Text Generation • 97.2M • Updated Jan 24 • 27 • 18
liked a Space 4 months ago
HeartMuLa 🚀 • A Family of Open Sourced Music Foundation Models • Runtime error • Agents • 55
liked a model 4 months ago
codelion/dhara-70m • Text Generation • 71.3M • Updated Dec 30, 2025 • 208 • 48