ZNLP/AlphaRoPE

AlphaRoPE

AlphaRoPE: A Simple Yet Effective Length Extrapolation Method

This repo contains the code for the AlphaRoPE context window extension method.

Paper

Paper: AlphaRoPE: A Simple Yet Efficient Length Extrapolation Method

Reproduction

To reproduce, clone the repository and install both the top-level package and ntk_yarn/ in editable mode.

git clone https://github.com/<your-org>/AlphaRoPE.git
cd AlphaRoPE
pip install -e .
pip install -e ntk_yarn/

Training

Prepare the tokenized data, then fine-tune with DeepSpeed. Before the first launch, run the accelerate config command once and enable DeepSpeed when prompted.

python dataset_download.py
python tokenization.py
python truncate.py
accelerate launch finetune.py
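
The four commands above depend on each other, so it can help to run them from one driver that stops at the first failure. The following is a minimal sketch, not part of the repository: the script names are taken from the commands above, while run_pipeline and its runner parameter are hypothetical helpers introduced here for illustration. It assumes it is called from the repository root and that accelerate is already configured.

```python
import subprocess
import sys

# Steps in the order listed above. sys.executable stands in for "python"
# so the same interpreter (and virtual environment) is used for every step.
PIPELINE = [
    [sys.executable, "dataset_download.py"],
    [sys.executable, "tokenization.py"],
    [sys.executable, "truncate.py"],
    ["accelerate", "launch", "finetune.py"],
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run each step in order and stop at the first failure.

    `runner` is a hypothetical hook (not part of the repo) that allows a
    dry run; it must accept the same arguments as subprocess.run.
    Returns the list of commands that completed.
    """
    completed = []
    for cmd in steps:
        print("Running:", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later steps never run on incomplete data.
        runner(cmd, check=True)
        completed.append(cmd)
    return completed
```

Calling run_pipeline() with no arguments executes the real scripts. Passing a runner that only records its arguments gives a dry run of the same sequence.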

Key files: ntk_yarn/ (model implementations), finetune.py, tokenization.py, truncate.py.

Evaluation

python eval/pass_key.py
python eval/ppl_sliding_window.py
python eval/ppl.py

For LongBench, clone THUDM/LongBench from GitHub (not included in this repo). Copy ntk_yarn/ into LongBench/LongBench/, register your checkpoint in config/model2path.json, then run:

python pred.py
python eval.py

See the LongBench README for details.
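
The checkpoint registration step can also be scripted. The sketch below assumes that config/model2path.json is a flat JSON object mapping a model name to a checkpoint path. Check this against the LongBench copy you cloned. The function name register_checkpoint and the model name and path in the usage comment are hypothetical.

```python
import json
from pathlib import Path


def register_checkpoint(config_path, model_name, checkpoint_path):
    """Add or update one entry in a JSON name-to-path mapping.

    Assumes the file holds a flat JSON object such as
    {"model-name": "path/or/hub-id"}; existing entries are preserved.
    """
    config_file = Path(config_path)
    # Start from the existing mapping when the file is already present.
    mapping = json.loads(config_file.read_text()) if config_file.exists() else {}
    mapping[model_name] = str(checkpoint_path)
    config_file.write_text(json.dumps(mapping, indent=4) + "\n")
    return mapping


# Hypothetical usage (the model name and checkpoint path are placeholders):
# register_checkpoint(
#     "LongBench/LongBench/config/model2path.json",
#     "my-alpharope-model",
#     "/path/to/checkpoint",
# )
```

The model name registered here is the key LongBench uses to find the checkpoint, so use the same name when running the prediction and evaluation scripts.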
