---
tags:
- kernels
license: apache-2.0
---

# Optimizer

Optimizer is a Python package that provides:
- PyTorch implementations of recent optimizer algorithms
- Support for parallelism techniques for efficient large-scale training

## Currently implemented
- Parallel Muon with N-D sharding
  - [arXiv paper](https://arxiv.org/abs/2511.07464)
  - Supports **general N-D sharding configurations**
    - The implementation is not tied to any specific parallel strategy.
    - Verified from basic FSDP2 setups up to hybrid configurations such as **(2 TP + 2 DP-Replicate + 2 DP-Shard)**.
    - Verified configurations can be found in [test_muon.py](./test/test_muon.py)

## Usage

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from kernels import get_kernel

# Load the optimizer package through the kernels hub.
optimizer = get_kernel("motif-technologies/optimizer")
get_default_muon_param_groups = optimizer.muon.get_default_muon_param_groups

model = None  # your model here
fsdp_model = FSDP(model)

# Muon cannot be applied to 1-D tensors.
# This helper groups such tensors separately from the rest;
# you can write your own is_muon_func if necessary.
params = get_default_muon_param_groups(model)

optim = optimizer.Muon(
    params,
    lr=0.01,
    momentum=0.9,
    weight_decay=1e-4,
)
```
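
The idea behind grouping parameters for Muon can be sketched in a few lines. The following is a minimal illustration, not the package's `get_default_muon_param_groups`: the helper name `split_params_by_ndim`, the `(name, param)` signature assumed for `is_muon_func`, and the `use_muon` group key are all hypothetical and chosen only for this example. It relies solely on each parameter exposing an `ndim` attribute, so it runs without a model.

```python
from types import SimpleNamespace


def split_params_by_ndim(named_params, is_muon_func=None):
    """Split (name, param) pairs into a Muon group and a fallback group.

    Hypothetical helper for illustration only. A parameter only needs an
    `ndim` attribute, so the pairs yielded by model.named_parameters()
    would work as input too.
    """
    if is_muon_func is None:
        # Assumed default rule: Muon only handles tensors with 2+ dimensions.
        def is_muon_func(name, param):
            return param.ndim >= 2

    muon_params, other_params = [], []
    for name, param in named_params:
        if is_muon_func(name, param):
            muon_params.append(param)
        else:
            other_params.append(param)

    # The "use_muon" key is an assumed convention, not a documented option.
    return [
        {"params": muon_params, "use_muon": True},
        {"params": other_params, "use_muon": False},
    ]


# Stand-in parameters so the sketch runs without a model or GPU.
fake_params = [
    ("layer.weight", SimpleNamespace(ndim=2)),
    ("layer.bias", SimpleNamespace(ndim=1)),
    ("norm.weight", SimpleNamespace(ndim=1)),
]
groups = split_params_by_ndim(fake_params)
print(len(groups[0]["params"]), len(groups[1]["params"]))
```

A custom rule can be passed as `is_muon_func`, for example to keep particular layers out of the Muon group by name.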

## Documentation

- [Implementation Guide](./docs/implementation.md): Detailed walkthrough of the internal architecture, parallel pipeline, distributed utilities, and QK clipping. Recommended for code reviewers and new contributors.
- [PyTorch 2.10 TP Fix](./docs/pytorch-2.10-tp-fix.md): Root cause analysis and fixes for `_StridedShard` compatibility with PyTorch 2.10+.

## Test

- See [test/README.md](./test/README.md) for instructions on running the tests.

## Pre-commit Hooks

This project uses [pre-commit](https://pre-commit.com/) to automatically check and format code before commits.

### Setup

1. Install pre-commit:

```bash
pip install pre-commit
```

2. Install the git hooks:

```bash
pre-commit install
```

Once installed, the configured hooks will run automatically on each commit.

### Included Hooks

The following tools are run via pre-commit:

- **[yapf](https://github.com/google/yapf)**: Python code formatter
- **[typos](https://github.com/crate-ci/typos)**: Spell checker for common typos
- **[isort](https://github.com/PyCQA/isort)**: Organizes and sorts Python imports
- **[clang-format](https://clang.llvm.org/docs/ClangFormat.html)**: Formats C++/CUDA code (`--style=file`)
- **[pymarkdown](https://github.com/jackdewinter/pymarkdown)**: Lints and auto-fixes Markdown files
- **[actionlint](https://github.com/rhysd/actionlint)**: Validates GitHub Actions workflows

### Usage

- Run all checks on the entire codebase:

```bash
pre-commit run --all-files
```

- Run a specific hook (example: isort):

```bash
pre-commit run isort --all-files
```

### Test

- A [simple unit test for Parallel Muon](./test/test_muon/README.md) is available.