AlphaRoPE: A Simple Yet Effective Length Extrapolation Method
This repo contains the code for the AlphaRoPE context window extension method.
Paper: AlphaRoPE: A Simple Yet Efficient Length Extrapolation Method
To reproduce, clone the repository and perform a local installation.
git clone https://github.com/<your-org>/AlphaRoPE.git
cd AlphaRoPE
pip install -e .
pip install -e ntk_yarn/
Prepare the tokenized data, then fine-tune with DeepSpeed. Run accelerate config once beforehand to enable DeepSpeed acceleration.
python dataset_download.py
python tokenization.py
python truncate.py
accelerate launch finetune.py
Key files: ntk_yarn/ (model implementations), finetune.py, tokenization.py, truncate.py.
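The preparation and fine-tuning commands above can be chained so that a failure in one step stops the rest. The following is only a minimal sketch, not part of this repo: run_pipeline and its dry_run flag are hypothetical names, sys.executable stands in for python so the steps run in the current environment, and the interactive accelerate config step is assumed to have been completed already.

```python
import subprocess
import sys

# Commands copied from the steps above, in the order they are listed.
PIPELINE = [
    [sys.executable, "dataset_download.py"],
    [sys.executable, "tokenization.py"],
    [sys.executable, "truncate.py"],
    ["accelerate", "launch", "finetune.py"],
]


def run_pipeline(commands=PIPELINE, dry_run=False):
    """Run each command in order and stop at the first failure.

    With dry_run=True the commands are only printed, not executed.
    Returns the list of commands that were processed.
    """
    processed = []
    for cmd in commands:
        print("+", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # which prevents later steps from running on incomplete data.
            subprocess.run(cmd, check=True)
        processed.append(cmd)
    return processed
```

Calling run_pipeline(dry_run=True) from the repository root prints the four commands without running them, which is a cheap way to check the order before starting a long job.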
Run the evaluation scripts:
python eval/pass_key.py
python eval/ppl_sliding_window.py
python eval/ppl.py
For LongBench, clone THUDM/LongBench from GitHub (it is not included in this repo). Copy ntk_yarn/ into LongBench/LongBench/, register your checkpoint in config/model2path.json, then run:
python pred.py
python eval.py
See the LongBench README for details.
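The checkpoint registration step for LongBench can also be scripted. The sketch below assumes that config/model2path.json is a flat JSON object mapping a model name to a checkpoint path; check the file in your LongBench checkout before relying on this. register_checkpoint is a hypothetical helper, and the model name and path in the usage comment are placeholders.

```python
import json
from pathlib import Path


def register_checkpoint(config_path, model_name, checkpoint_path):
    """Add or update one model-name -> checkpoint-path entry in a JSON mapping file.

    Assumes the file holds a flat JSON object; creates it if it does not exist.
    Returns the updated mapping.
    """
    config_file = Path(config_path)
    if config_file.exists():
        mapping = json.loads(config_file.read_text(encoding="utf-8"))
    else:
        mapping = {}
    mapping[model_name] = str(checkpoint_path)
    config_file.write_text(json.dumps(mapping, indent=4) + "\n", encoding="utf-8")
    return mapping


# Example (placeholder name and path, run from LongBench/LongBench/):
# register_checkpoint("config/model2path.json", "my-alpharope-model", "/path/to/checkpoint")
```

Existing entries in the file are preserved; only the given model name is added or overwritten.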