H100 TransformerEngine

#14
by SinanAkkoyun - opened

Thank you so much for your awesome work!

Please add H100 FP8 TransformerEngine support for faster inference with this model!

When do you expect this will be done? Thank you!

Hi @SinanAkkoyun , we are working on it, but it will likely be ~months away. H100s are only just entering the market, and there's a lot of performance tuning to do!

abhi-mosaic changed discussion status to closed

@abhi-mosaic LambdaLabs supplies "infinite" H100s now! When do you think the TE implementation will be available? Can I somehow help?

SinanAkkoyun changed discussion status to open

The model should work as-is on H100s with BF16.
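For reference, a minimal sketch of BF16 inference via 🤗 Transformers, which should run as-is on an H100. The `mosaicml/mpt-7b` repo name, prompt, and generation settings are assumptions for illustration, not from this thread:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mosaicml/mpt-7b"  # assumption: substitute the repo this discussion belongs to

tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,  # BF16 runs natively on H100 tensor cores
    trust_remote_code=True,      # MPT ships custom modeling code
    device_map="auto",
)

inputs = tokenizer("Hello, H100!", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```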

FP8 support is gonna be a bit trickier but we are working on it: https://github.com/mosaicml/llm-foundry/pull/271
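To show the general pattern TransformerEngine provides (this is just the generic TE API, not the llm-foundry integration from the PR above): TE modules wrapped in `fp8_autocast` run their matmuls in FP8 on Hopper. The layer sizes below are illustrative assumptions:

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# FP8 scaling recipe: E4M3 forward, E5M2 backward
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

# Illustrative shapes; FP8 GEMMs need dimensions divisible by 8/16
layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(8, 4096, device="cuda", dtype=torch.bfloat16)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)
print(y.shape)
```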

sam-mosaic changed discussion status to closed
