|
--- |
|
title: Zero Bubble Pipeline Parallelism |
|
emoji: π |
|
colorFrom: indigo |
|
colorTo: red |
|
sdk: gradio |
|
sdk_version: 4.36.1 |
|
app_file: app.py |
|
pinned: false |
|
license: apache-2.0 |
|
--- |
|
|
|
|
|
# Zero Bubble Pipeline Parallelism |
|
|
|
Zero Bubble Pipeline Parallelism is a pipeline-parallel scheduling algorithm that reduces pipeline bubbles (device idle time) to almost zero while preserving synchronous training semantics. |
|
|
|
Check out our paper at: |
|
* [arXiv version with ZBV](https://arxiv.org/abs/2401.10241) |
|
* [ICLR accepted version with ZB1P and ZB2P](https://openreview.net/pdf?id=tuzTN0eIO5) |
|
|
|
Try out our implementation, based on Megatron, at [https://github.com/sail-sg/zero-bubble-pipeline-parallelism](https://github.com/sail-sg/zero-bubble-pipeline-parallelism). |
|
|
|
Experiments show that zero bubble pipeline parallelism can accelerate training by up to 30% with similar memory consumption. A detailed table of experiments is coming soon. |