EasyContext
Overview
EasyContext is an open-source project aimed at training language models with a context length of 1 million tokens on ordinary hardware. Rather than proposing new techniques, it demonstrates how far existing ones reach when combined: sequence parallelism, DeepSpeed ZeRO-3 offloading, Flash Attention with a fused cross-entropy kernel, and activation checkpointing. Using this recipe, it has trained Llama-2-7B to a 700K-token context on 8 A100 GPUs and Llama-2-13B to a 1M-token context on 16 A100 GPUs.
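As a hedged illustration of how these pieces fit together, the sketch below wires ZeRO-3 CPU offloading, Flash Attention, and activation checkpointing around a Hugging Face Llama model. The model name, learning rate, and config values are illustrative assumptions, not EasyContext's actual training scripts.

```python
# Minimal sketch of combining the techniques EasyContext relies on.
# Model name, batch size, and config values are illustrative assumptions,
# not taken from EasyContext's own scripts.
import deepspeed
import torch
from transformers import AutoModelForCausalLM

ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "bf16": {"enabled": True},
    "optimizer": {"type": "AdamW", "params": {"lr": 1e-5}},
    "zero_optimization": {
        "stage": 3,                               # ZeRO-3: shard params, grads, optimizer states
        "offload_optimizer": {"device": "cpu"},   # push optimizer states to CPU RAM
        "offload_param": {"device": "cpu"},       # push parameters to CPU RAM
    },
}

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # Flash Attention: avoids the full attention matrix
)
model.gradient_checkpointing_enable()          # activation checkpointing: recompute instead of store

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```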
Target Users
Researchers and engineers who train language models with extra-long contexts.
Use Cases
Training the Llama-2-7B model on 8 A100 GPUs with EasyContext, reaching a context length of 700K tokens.
Training the Llama-2-13B model on 16 A100 GPUs with EasyContext, reaching a context length of 1M tokens.
By combining existing techniques, EasyContext substantially extends the context length of language models, laying groundwork for applications such as video generation.
Features
Sequence Parallelism (see the first sketch after this list)
DeepSpeed ZeRO-3 Offloading
Flash Attention and a fused cross-entropy kernel (see the second sketch after this list)
Activation Checkpointing
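EasyContext's sequence parallelism builds on ring-attention-style kernels. The simplified sketch below (the helper function is hypothetical, not EasyContext's API) shows only the input-sharding idea: each rank keeps one slice of the sequence, so per-GPU activation memory shrinks roughly in proportion to the number of GPUs.

```python
# Simplified sketch of sequence-parallel input sharding (hypothetical helper,
# not EasyContext's API). Each rank processes one contiguous slice of the
# sequence, so activation memory per GPU scales as seq_len / world_size.
import torch
import torch.distributed as dist

def shard_sequence(input_ids: torch.Tensor) -> torch.Tensor:
    """Return this rank's contiguous slice of a [batch, seq_len] tensor."""
    rank = dist.get_rank()
    world_size = dist.get_world_size()
    seq_len = input_ids.shape[1]
    assert seq_len % world_size == 0, "sequence length must divide evenly"
    chunk = seq_len // world_size
    return input_ids[:, rank * chunk : (rank + 1) * chunk]
```

Attention still requires every token to attend to all earlier tokens, so the real implementation passes key/value blocks around the ranks in a ring rather than computing attention independently on each shard.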
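The fused cross-entropy kernel addresses a different bottleneck: at a 1M-token sequence, materializing the full [seq_len, vocab] logits tensor at once would dominate memory. The plain-PyTorch sketch below is an approximation of the chunking idea, not the fused CUDA kernel itself.

```python
# Sketch of the memory motivation behind a fused/chunked cross-entropy:
# never materialize the full [seq_len, vocab] logits tensor at once.
# A plain-PyTorch approximation; the real fused kernel also recomputes
# chunk logits in backward so they are not retained between chunks.
import torch
import torch.nn.functional as F

def chunked_cross_entropy(hidden: torch.Tensor,   # [seq_len, hidden_dim]
                          lm_head: torch.Tensor,  # [vocab, hidden_dim]
                          labels: torch.Tensor,   # [seq_len]
                          chunk_size: int = 4096) -> torch.Tensor:
    total, count = hidden.new_zeros(()), 0
    for start in range(0, hidden.shape[0], chunk_size):
        h = hidden[start : start + chunk_size]
        logits = h @ lm_head.T   # only one chunk of logits is allocated here
        total = total + F.cross_entropy(
            logits, labels[start : start + chunk_size], reduction="sum"
        )
        count += h.shape[0]
    return total / count
```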