	---
	tags:
	- kernels
	license: apache-2.0
	---

	# Optimizer

	Optimizer is a Python package that provides PyTorch implementations of recent optimizer algorithms, with support for parallelism techniques for efficient large-scale training.

	## Currently implemented
	- Parallel Muon with N-D sharding
	  - [arXiv paper](https://arxiv.org/abs/2511.07464)
	  - Supports general N-D sharding configurations
	    - The implementation is not tied to any specific parallel strategy.
	    - Verified from basic FSDP2 setups up to hybrid configurations such as (2 TP + 2 DP-Replicate + 2 DP-Shard).
	    - Verified configurations can be found in [test_muon.py](./test/test_muon.py)
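To make the hybrid configuration above concrete, here is a minimal sketch of how 8 ranks could be arranged in a 2 x 2 x 2 mesh. It uses only the standard library and is an illustration, not part of this package: the dimension names and their order are assumptions, and the row-major rank numbering is chosen to mirror how PyTorch device meshes are commonly laid out.

```python
import itertools

# Assumed dimension names and order (hypothetical); the tested setups may differ.
MESH_DIMS = {"dp_replicate": 2, "dp_shard": 2, "tp": 2}


def mesh_coordinates(dims):
    """Map each global rank to its coordinate along every mesh dimension.

    Ranks are numbered in row-major order, so the last dimension varies fastest.
    """
    names = list(dims)
    coords = itertools.product(*(range(dims[name]) for name in names))
    return {rank: dict(zip(names, coord)) for rank, coord in enumerate(coords)}


layout = mesh_coordinates(MESH_DIMS)
world_size = len(layout)  # product of all mesh dimension sizes
```

With this layout, ranks that differ only in their `tp` coordinate (for example ranks 0 and 1) would form one tensor-parallel group.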

	## Usage

	```python
	import torch
	from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
	from kernels import get_kernel

	optimizer = get_kernel("motif-technologies/optimizer")
	get_default_muon_param_groups = optimizer.muon.get_default_muon_param_groups

	model = None # your model here
	fsdp_model = FSDP(model)

	# Muon orthogonalizes matrix-shaped updates, so it cannot be applied to 1-D tensors.
	# This helper function groups the parameters accordingly.
	# You can write your own is_muon_func if necessary.
	params = get_default_muon_param_groups(model)

	optim = optimizer.Muon(
	    params,
	    lr=0.01,
	    momentum=0.9,
	    weight_decay=1e-4,
	)
	```
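The grouping mentioned in the comments above can be sketched as follows. This is a hedged illustration of the idea, not the package's implementation: `default_is_muon`, `split_param_groups`, the `use_muon` key, and the name-based exclusions for embeddings and output heads are all hypothetical choices made for this example.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def default_is_muon(name: str, param: Any) -> bool:
    """Hypothetical predicate: send matrices to Muon, skip embeddings and heads."""
    return param.ndim >= 2 and "embed" not in name and "lm_head" not in name


def split_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    is_muon: Callable[[str, Any], bool] = default_is_muon,
) -> List[Dict[str, Any]]:
    """Split (name, parameter) pairs into a Muon group and a fallback group."""
    muon_params, other_params = [], []
    for name, param in named_params:
        # Frozen parameters are not optimized at all.
        if not getattr(param, "requires_grad", True):
            continue
        if is_muon(name, param):
            muon_params.append(param)
        else:
            other_params.append(param)
    # "use_muon" is an assumed flag name for this sketch.
    return [
        {"params": muon_params, "use_muon": True},
        {"params": other_params, "use_muon": False},
    ]
```

For a PyTorch module this would be called as `split_param_groups(model.named_parameters())`. Whether the resulting groups can be passed directly to `optimizer.Muon` depends on the format it expects, which is not specified here.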

	## Documentation

	- [Implementation Guide](./docs/implementation.md) — Detailed walkthrough of the internal architecture, parallel pipeline, distributed utilities, and QK clipping. Recommended for code reviewers and new contributors.
	- [PyTorch 2.10 TP Fix](./docs/pytorch-2.10-tp-fix.md) — Root cause analysis and fixes for `_StridedShard` compatibility with PyTorch 2.10+.

	## Test

	- Check [test/README.md](./test/README.md) for how to run the tests.

	## Pre-commit Hooks

	This project uses [pre-commit](https://pre-commit.com/) to automatically check and format code before commits.

	### Setup

	1. Install pre-commit:

	```bash
	pip install pre-commit
	```

	2. Install the git hooks:

	```bash
	pre-commit install
	```

	Once installed, the configured hooks will run automatically on each commit.

	### Included Hooks

	The following tools are run via pre-commit:

	- [yapf](https://github.com/google/yapf) – Python code formatter
	- [typos](https://github.com/crate-ci/typos) – Spell checker for common typos
	- [isort](https://github.com/PyCQA/isort) – Organizes and sorts Python imports
	- [clang-format](https://clang.llvm.org/docs/ClangFormat.html) – Formats C++/CUDA code (`--style=file`)
	- [pymarkdown](https://github.com/jackdewinter/pymarkdown) – Lints and auto-fixes Markdown files
	- [actionlint](https://github.com/rhysd/actionlint) – Validates GitHub Actions workflows

	### Usage

	- Run all checks on the entire codebase:

	```bash
	pre-commit run --all-files
	```

	- Run a specific hook (example: isort):

	```bash
	pre-commit run isort --all-files
	```

	### Test

	- A [simple unit test for Parallel Muon](./test/test_muon/README.md) is available.