haoranli-ml/Llama-3-8B-RoPE-64k-Base

Paper · GitHub

✨ Overview

CoPE is a plug-and-play enhancement of RoPE that softly clips its unstable low-frequency components, delivering consistent gains both within the training context and during long-context extrapolation.

With a simple yet effective soft clipping strategy, CoPE:

1️⃣ Eliminates severe OOD outliers: low-frequency components whose periods exceed the pre-training context window and are the primary cause of out-of-distribution failures during extrapolation.

2️⃣ Refines Long-range Semantic Signals by alleviating the implicit long-term decay of semantic attention introduced by RoPE.

3️⃣ Prevents Spectral Leakage induced by hard frequency truncation, which would otherwise cause long-range oscillatory ringing in the attention scores across relative token distances and introduce spurious correlations.
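The points above can be illustrated with a small sketch. The standard RoPE inverse frequencies are θ_i = base^(−2i/d); components whose period 2π/θ_i exceeds the training window are the OOD outliers in point 1, and a smooth lower bound on the frequency (rather than a hard cutoff, point 3) keeps every period within the window. The soft-clip function below is a hypothetical log-sum-exp smooth maximum, not the paper's exact formulation; `train_ctx` and `temp_frac` are illustrative parameters.

```python
import numpy as np

def rope_inv_freq(dim, base=10000.0):
    # Standard RoPE inverse frequencies: theta_i = base^(-2i/dim)
    return base ** (-np.arange(0, dim, 2) / dim)

def soft_clip_inv_freq(inv_freq, train_ctx=8192, temp_frac=0.1):
    # Illustrative soft clip (NOT the paper's exact formula): smoothly
    # lower-bound each frequency at omega_min = 2*pi / train_ctx so that
    # no rotary component has a period longer than the training window.
    # A log-sum-exp smooth maximum replaces the hard cutoff that would
    # cause spectral leakage; temp_frac controls the smoothness.
    omega_min = 2 * np.pi / train_ctx
    t = temp_frac * omega_min  # temperature of the smooth max
    return t * np.logaddexp(inv_freq / t, omega_min / t)

inv_freq = rope_inv_freq(128)
clipped = soft_clip_inv_freq(inv_freq)
# High frequencies pass through almost unchanged; the lowest frequencies
# are smoothly raised so that all periods stay within train_ctx.
```

Because the smooth maximum never falls below `omega_min`, every clipped component has a period of at most `train_ctx`, while frequencies already well above the floor are left essentially untouched.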

For more details on training and evaluation, please refer to the official GitHub repository.

πŸ“– Citation

@article{li2026cope,
  title={CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs},
  author={Li, Haoran and Ren, Sucheng and Yuille, Alan and Wang, Feng},
  journal={arXiv preprint arXiv:2602.05258},
  year={2026}
}
Downloads last month: 19

Model size: 8B params · Tensor type: BF16 (Safetensors)
