Motif-Technologies/optimizer

optimizer / torch-ext

Commit History

feat: extend QK-Clip to support MLA (MuonClip Algorithm 1) [skip-build] (#28)

e8e2c81
unverified

dongseokmotif Claude Sonnet 4.6

wyldecat github-actions[bot] committed 5 days ago

Revert "fix: disable CUDA graphs in Newton-Schulz for cpu_offload compatibility" (#29)

313d56a
unverified

wyldecat github-actions[bot] committed 5 days ago

fix: disable CUDA graphs in Newton-Schulz for cpu_offload compatibility

2dce952

wyldecat Claude Opus 4.6 (1M context) committed 8 days ago

Replace cpu_offload constructor param with turn_on/turn_off API (#26)

05a75f1
unverified

wyldecat Claude Opus 4.6 (1M context) github-actions[bot] committed 10 days ago

Invalidate AdamW tensor caches on load_state_dict [skip-build]

89b6099

ca1207 Claude Opus 4.6 (1M context) committed 12 days ago

draft commit for cpu_offload (#23)

10848ab
unverified

TaehyunKim github-actions[bot]

wyldecat Claude Opus 4.6 (1M context) committed 13 days ago

Update fast path comment to reflect current behavior [skip-build]

7e33533

wyldecat Claude Opus 4.6 committed 18 days ago

Update comment to reflect use_local_synchronization behavior [skip-build]

3f5cf49

wyldecat Claude Opus 4.6 committed 18 days ago

Fix deadlock in construct_shard_mesh with PP + dp_replicate > 1

da7e5da

wyldecat Claude Opus 4.6 committed 18 days ago

Muon optimizer: expert batching, parallel caching, A2A overlap [skip-build]

0f37d63

wyldecat Claude Opus 4.6 committed 28 days ago

Optimize pipeline: batched update, zero-copy scatter, prelaunch gather [skip-build]

2816b64

wyldecat Claude Opus 4.6 committed 28 days ago

Cache AdamW placement grouping and tensor lists [skip-build]

8ca2492

wyldecat Claude Opus 4.6 committed 28 days ago

Add torch.compile, CUDA graph, and compiled momentum [skip-build]

e74d98f

wyldecat Claude Opus 4.6 committed 28 days ago

Add mhc_attn, mhc_ffn, lambda_proj to skip_keys

ba293d0

wyldecat Claude Opus 4.6 committed 29 days ago

Remove verbose param_groups summary logging

24f0957

wyldecat Claude Opus 4.6 committed 29 days ago

Support multi-component expert_keys (e.g. "experts.w1")

5a99e12

wyldecat Claude Opus 4.6 committed 29 days ago

Extract is_expert_param() helper to consolidate expert key matching

e615b1c

wyldecat Claude Opus 4.6 committed 29 days ago

Include original (pre-normalize) FQN in is_muon logging

135fc66

wyldecat Claude Opus 4.6 committed 29 days ago

Add info-level logging for param group classification (Muon vs AdamW)

1118752

wyldecat Claude Opus 4.6 committed 29 days ago

Use component-level matching for expert_keys to avoid shared_experts collision

f008017

wyldecat Claude Opus 4.6 committed 29 days ago

Normalize parameter FQNs to handle torch.compile / checkpoint wrappers

95a620f

wyldecat Claude Opus 4.6 committed 30 days ago

Apply pre-commit formatting (yapf) [skip-build]

bf30b9b

dongseokmotif Claude Sonnet 4.6 committed on Feb 28

Add max_iter cap and non-finite checks to _optimal_quintic [skip-build]

206b280

dongseokmotif committed on Feb 28

Apply pre-commit formatting (yapf, isort) [skip-build]

aff01db

dongseokmotif committed on Feb 27

Add comment explaining _coeffs_list and Polar Express vs former NS [skip-build]

abaa449

dongseokmotif Claude Sonnet 4.6 committed on Feb 27

Replace hardcoded NS coefficients with analytically optimal ones [skip-build]

573242f

dongseokmotif Claude Sonnet 4.6 committed on Feb 26

Refactor pipeline to async generator pattern (#16)

33929c0
unverified

wyldecat github-actions[bot] committed on Feb 26

Support mHC (#15)

ae32572
unverified

wyldecat github-actions[bot] committed on Jan 16

Support param group with various placements (#13)

e2b41e5
unverified

wyldecat github-actions[bot] committed on Nov 7, 2025

fix bug in fsdp

811726c

ca1207 committed on Oct 23, 2025

Update torch-ext/optimizer/muon.py

b0230e7
unverified

TaehyunKim committed on Oct 2, 2025

Update torch-ext/optimizer/muon.py

ff2fcfb
unverified

TaehyunKim committed on Oct 2, 2025

Update muon.py

c16b438
unverified

TaehyunKim committed on Oct 2, 2025

fix assert in a2a gather scatter

3dafb3e

ca1207 committed on Sep 29, 2025

delete state in split_func

15336dc

ca1207 committed on Sep 26, 2025

change owner_params to owned_params

6943c45

ca1207 committed on Sep 26, 2025

modify pre step (overlap step) can get from args

589b763

ca1207 committed on Sep 26, 2025

add doc strings + init self rank on init_assign_params

267e8a0

ca1207 committed on Sep 26, 2025

license added for flash_muon

d7cd571

ca1207 committed on Sep 25, 2025

apply pre-commit hook

fceb334

dongseokmotif committed on Sep 25, 2025

consider multi node

39c42e0

dongseokmotif committed on Sep 25, 2025

misc

35894d1

ca1207 committed on Sep 24, 2025

use inplace op in update_g

6e9baad

ca1207 committed on Sep 24, 2025

use COMM_DTYPE instead of hardcoded dtype

2a8631f

ca1207 committed on Sep 24, 2025

apply all2all scatter gather

ff6d675

ca1207 committed on Sep 24, 2025

feat(muon_clip) : add muon clip (#6)

d65066c
unverified

dongseokmotif dongseokmotif github-actions[bot] committed on Sep 24, 2025

feat(muon) : add tuned-abc-values & bfloat16 communication

f7faa93

wyldecat committed on Sep 18, 2025

feat: update muon to receive paramgroups, not model (#4)

b0f46c7
unverified

leejunhyeok junhyeok.lee

wyldecat committed on Sep 11, 2025

fix(muon): add update_p stage and dealloc tensors properly

99e7c0c

wyldecat committed on Sep 9, 2025

chore: add .gitignore

79fc8ba

wyldecat committed on Sep 5, 2025