Qwen/Qwen3-8B — lems search — 0.9 target ratio

This model was compressed from the base model Qwen/Qwen3-8B using kfac_svd with LEMS rank search. See our publication and project page for details on kfac_svd and the LEMS rank search.

Compression Details

| Metric | Value |
|---|---|
| Base Model | Qwen/Qwen3-8B |
| Method | kfac_svd |
| Search Method | lems |
| Target Ratio | 0.9 |
| Compression Metric | params |
| Recommended Dtype | float16 |
| Compressed Layers | 67 |
| Total Parameters | 7,496,127,555 |

Usage

The checkpoint records its recommended dtype in config.json; no explicit torch_dtype argument should be needed with this remote-code wrapper. For standard Transformers models, torch_dtype="auto" is the portable fallback.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MoritzMo123/kfac-svd_lems_Qwen3-8B_0.9"

# trust_remote_code=True loads the repository's remote-code wrapper.
# device_map="auto" needs the accelerate package installed.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Tokenize a short prompt and move the tensors to the model's device.
inputs = tokenizer("Hello, ", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Evaluation Results

| Dataset | Perplexity |
|---|---|
| wikitext2 | 9.85 |
| ptb | 15.77 |
| c4 | 16.04 |
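
The evaluation protocol behind these numbers is not stated in this card. As a reminder of what the metric means, perplexity is conventionally the exponential of the mean per-token negative log-likelihood. The sketch below illustrates that definition only; `perplexity` is a hypothetical helper and the inputs are toy log-probabilities, not outputs of this model.

```python
import math


def perplexity(token_log_probs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Toy input (assumption, not model output): every token gets probability 0.25,
# so the model is as uncertain as a uniform choice among 4 tokens.
toy_log_probs = [math.log(0.25)] * 8
print(f"perplexity = {perplexity(toy_log_probs):.2f}")
```

Lower is better: a perplexity of 9.85 roughly corresponds to the model being as uncertain as a uniform choice among about ten tokens at each position.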

Rank Allocation

Per-layer ranks
| Layer | Rank |
|---|---|
| model.layers.0.mlp.up_proj | 2312 |
| model.layers.1.mlp.down_proj | 1960 |
| model.layers.1.mlp.gate_proj | 1640 |
| model.layers.1.self_attn.o_proj | 1056 |
| model.layers.1.self_attn.q_proj | 624 |
| model.layers.11.self_attn.q_proj | 1608 |
| model.layers.12.self_attn.o_proj | 1648 |
| model.layers.12.self_attn.q_proj | 1216 |
| model.layers.13.self_attn.q_proj | 1336 |
| model.layers.14.self_attn.q_proj | 1064 |
| model.layers.15.self_attn.q_proj | 896 |
| model.layers.16.self_attn.q_proj | 1048 |
| model.layers.17.mlp.gate_proj | 1952 |
| model.layers.17.self_attn.q_proj | 1040 |
| model.layers.18.mlp.gate_proj | 1888 |
| model.layers.18.self_attn.q_proj | 1192 |
| model.layers.19.mlp.gate_proj | 1832 |
| model.layers.19.self_attn.q_proj | 968 |
| model.layers.2.mlp.down_proj | 1464 |
| model.layers.2.mlp.gate_proj | 1808 |
| model.layers.2.mlp.up_proj | 1192 |
| model.layers.2.self_attn.o_proj | 784 |
| model.layers.2.self_attn.q_proj | 728 |
| model.layers.20.mlp.gate_proj | 1896 |
| model.layers.20.mlp.up_proj | 2120 |
| model.layers.20.self_attn.q_proj | 952 |
| model.layers.21.mlp.gate_proj | 2192 |
| model.layers.21.self_attn.q_proj | 1256 |
| model.layers.22.self_attn.o_proj | 1616 |
| model.layers.22.self_attn.q_proj | 856 |
| model.layers.23.self_attn.q_proj | 1144 |
| model.layers.24.self_attn.q_proj | 904 |
| model.layers.25.self_attn.o_proj | 1464 |
| model.layers.25.self_attn.q_proj | 984 |
| model.layers.26.self_attn.o_proj | 1264 |
| model.layers.26.self_attn.q_proj | 720 |
| model.layers.27.self_attn.o_proj | 1176 |
| model.layers.27.self_attn.q_proj | 712 |
| model.layers.28.self_attn.o_proj | 1248 |
| model.layers.28.self_attn.q_proj | 872 |
| model.layers.29.self_attn.o_proj | 1272 |
| model.layers.29.self_attn.q_proj | 672 |
| model.layers.29.self_attn.v_proj | 680 |
| model.layers.3.mlp.up_proj | 1832 |
| model.layers.3.self_attn.o_proj | 1128 |
| model.layers.3.self_attn.q_proj | 616 |
| model.layers.30.self_attn.o_proj | 1288 |
| model.layers.30.self_attn.q_proj | 712 |
| model.layers.31.self_attn.o_proj | 1288 |
| model.layers.31.self_attn.q_proj | 680 |
| model.layers.32.self_attn.o_proj | 1240 |
| model.layers.32.self_attn.q_proj | 672 |
| model.layers.33.self_attn.q_proj | 712 |
| model.layers.34.self_attn.o_proj | 1328 |
| model.layers.34.self_attn.q_proj | 640 |
| model.layers.35.self_attn.o_proj | 1000 |
| model.layers.35.self_attn.q_proj | 632 |
| model.layers.4.self_attn.o_proj | 1248 |
| model.layers.4.self_attn.q_proj | 864 |
| model.layers.5.self_attn.o_proj | 1176 |
| model.layers.5.self_attn.q_proj | 1208 |
| model.layers.6.self_attn.q_proj | 768 |
| model.layers.7.self_attn.o_proj | 1368 |
| model.layers.7.self_attn.q_proj | 808 |
| model.layers.8.self_attn.q_proj | 1168 |
| model.layers.9.self_attn.o_proj | 1320 |
| model.layers.9.self_attn.q_proj | 1352 |
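
This card does not say how the compressed layers are stored. Assuming the usual SVD-style layout, where a dense out × in weight is replaced by two factors of shapes out × r and r × in, the rough sketch below shows how a rank translates into a parameter count. The helper names are hypothetical, and the 4096 × 4096 shape for q_proj is an assumption taken from the public Qwen3-8B configuration, not from this card.

```python
def dense_params(out_features: int, in_features: int) -> int:
    """Parameter count of a dense weight matrix (bias ignored)."""
    return out_features * in_features


def low_rank_params(out_features: int, in_features: int, rank: int) -> int:
    """Parameter count of two factors: (out x rank) and (rank x in)."""
    return rank * (out_features + in_features)


def break_even_rank(out_features: int, in_features: int) -> float:
    """Rank below which the two factors are smaller than the dense matrix."""
    return dense_params(out_features, in_features) / (out_features + in_features)


# ASSUMPTION: q_proj is 4096 x 4096 (public Qwen3-8B config); the shape is
# not stated in this card, and the two-factor storage layout is also assumed.
OUT, IN = 4096, 4096
RANK = 624  # rank listed for model.layers.1.self_attn.q_proj

dense = dense_params(OUT, IN)
factored = low_rank_params(OUT, IN, RANK)
print(f"dense={dense:,} factored={factored:,} kept={factored / dense:.3f}")
print(f"break-even rank: {break_even_rank(OUT, IN):.0f}")
```

Under these assumptions, a factorization only saves parameters when its rank is below the break-even value for that layer's shape.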

Hydra Configuration Summary

| Config Field | Value |
|---|---|
| Model | Qwen/Qwen3-8B |
| SVD Method | kfac_svd |
| Search Method | lems |
| Compression Target | 0.9 |
| Target Metric | params |
| Calibration Dataset | wikitext2 |
| Sequence Length | 2048 |
| Seed | 42 |