gradient_checkpointing is always False in the BigBirdPegasusEncoder and BigBirdPegasusDecoder classes

#9
by hseom - opened

Thanks for this great work!

I found that in the BigBirdPegasusEncoder and BigBirdPegasusDecoder classes, the gradient_checkpointing option is always set to False, so GPU memory usage accumulates.
Please make it optional again :)

# modeling_bigbird_pegasus.py, line 1768~
# BigBirdPegasusEncoder class
...
...
self.layers = nn.ModuleList([BigBirdPegasusEncoderLayer(config, seed=i) for i in range(config.encoder_layers)])
self.layernorm_embedding = nn.LayerNorm(embed_dim)
self.gradient_checkpointing = False
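
A note on the pattern above, offered as a sketch rather than a diagnosis: a flag initialised to False in `__init__` can still be switched on after construction. The toy classes below are hypothetical stand-ins (`ToyEncoder`, `ToyModel`, and `enable_gradient_checkpointing` are names made up for illustration, not the transformers implementation). They only show how a model-level call could flip the flag on each submodule. If the installed transformers version provides `model.gradient_checkpointing_enable()`, that may be the intended way to turn the option on, but this is an assumption to verify against the library source.

```python
# Toy stand-in for the pattern quoted above. These classes are hypothetical
# and are NOT the transformers implementation.


class ToyEncoder:
    def __init__(self, num_layers):
        self.layers = [f"layer_{i}" for i in range(num_layers)]
        # Same default as in the quoted snippet: off at construction time.
        self.gradient_checkpointing = False


class ToyModel:
    def __init__(self, num_layers=2):
        self.encoder = ToyEncoder(num_layers)
        self.decoder = ToyEncoder(num_layers)

    def submodules(self):
        return [self.encoder, self.decoder]

    def enable_gradient_checkpointing(self):
        # Flip the flag on every submodule that exposes it.
        for module in self.submodules():
            if hasattr(module, "gradient_checkpointing"):
                module.gradient_checkpointing = True


model = ToyModel()
flags_before = [m.gradient_checkpointing for m in model.submodules()]
model.enable_gradient_checkpointing()
flags_after = [m.gradient_checkpointing for m in model.submodules()]
print(flags_before, flags_after)
```

With the real model, the equivalent check would be to inspect `model.model.encoder.gradient_checkpointing` before and after enabling; the attribute path is assumed from the snippet above and may differ between versions.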
