Hugging Face
Piotr Nawrot
pnawrot
4 followers · 1 following
https://piotrnawrot.github.io
p_nawrot
PiotrNawrot
piotr-nawrot
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Self-Improving World Modelling with Latent Actions
liked a model about 2 months ago
g023/Qwen3-8B-DMS-8x-4bit-NF4
posted an update about 2 months ago
We’ve just released Qwen3-8B-DMS-8x, fine-tuned for 8x KV cache compression. It maintains dense model accuracy on demanding tasks like AIME24, and is perfect for inference-time scaling. The code on HF works out-of-the-box. With DMS we fine-tune models end-to-end via distillation; this works much better than the “token importance” proxies found in usual eviction methods. It’s state-of-the-art for KV eviction tailored for fast inference: it adds a negligible amount of parameters and computation to each KV head, and requires as few as 1K fine-tuning steps to reach 8x compression. It speeds up both the prefill and generation phases of Transformer LLMs, and can be combined with Sparse Attention methods such as DSA.
🎓 Paper - https://neurips.cc/virtual/2025/loc/san-diego/poster/119605
💾 Checkpoint - https://huggingface.co/nvidia/Qwen3-8B-DMS-8x
📢 Article - https://ed.ac.uk/news/shrinking-ai-memory-boosts-accuracy
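For a rough sense of what 8x KV cache compression means in memory terms, here is a back-of-the-envelope sketch. It is not taken from the DMS code: the function name is hypothetical, and the layer count, KV head count, head dimension, and 16-bit cache precision are assumptions for illustration (check the checkpoint's config for the real values). It simply models compression as keeping 1/8 of the cached key/value entries and ignores any bookkeeping overhead.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim,
                   bytes_per_value=2, compression_ratio=1.0):
    """Estimate KV cache size in bytes for a single sequence.

    Each layer stores two tensors (keys and values), each holding
    num_kv_heads * head_dim values per cached token.
    """
    dense = 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value
    # Simplifying assumption: compression retains 1/compression_ratio of entries.
    return dense / compression_ratio


# Assumed model shape, for illustration only (not confirmed by the post).
ASSUMED_SHAPE = dict(num_layers=36, num_kv_heads=8, head_dim=128)

dense = kv_cache_bytes(32768, **ASSUMED_SHAPE)
compressed = kv_cache_bytes(32768, **ASSUMED_SHAPE, compression_ratio=8.0)

print(f"dense KV cache:         {dense / 2**30:.2f} GiB")
print(f"8x compressed KV cache: {compressed / 2**30:.2f} GiB")
```

Under these assumptions the cache grows linearly with sequence length, so an 8x reduction lets roughly eight times as many tokens (or parallel reasoning threads) fit in the same memory budget, which is why the post ties it to inference-time scaling.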
pnawrot's activity
liked a model about 2 months ago
g023/Qwen3-8B-DMS-8x-4bit-NF4
Text Generation • 8B • Updated Jan 31 • 211 • 1
liked 2 models 2 months ago
pnawrot/nanoT5-base
Updated Apr 26, 2025 • 5 • 11
nvidia/Qwen3-8B-DMS-8x
Updated Jan 22 • 1.09k • 34
liked 4 models about 1 year ago
nvidia/Llama-2-7B-DMC-4x
Updated Dec 22, 2024 • 2
nvidia/Llama-2-7B-DMC-8x
Updated Dec 22, 2024 • 2
nvidia/Llama-2-13B-DMC-4x
Updated Dec 22, 2024 • 1
nvidia/Llama-2-13B-DMC-8x
Updated Dec 22, 2024 • 2
liked a model about 2 years ago
LazarusNLP/IndoNanoT5-base
0.2B • Updated Feb 12, 2024 • 45 • 2