Commit History

Fixes any potential overflow when calculating attention weights.
b5c5161

gugarosa committed on
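
A plausible reading of this fix, assuming the usual half-precision failure mode, is upcasting the attention scores to float32 before the softmax. The sketch below is illustrative only; the function and variable names are not taken from the repository.

    import torch

    def attention_weights(q, k, scale):
        # Upcast to float32 so the matmul and softmax cannot overflow in fp16/bf16,
        # then cast the normalized weights back to the input dtype.
        scores = torch.matmul(q.float(), k.float().transpose(-2, -1)) * scale
        return torch.softmax(scores, dim=-1).to(q.dtype)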

Delete modeling_mixformer_sequential.py
470e18a

gugarosa committed on

Delete configuration_mixformer_sequential.py
bd98e4e

gugarosa committed on

Upload pytorch_model.bin
34b22f4

gugarosa committed on

Update to new model interface.
bbace88

gugarosa committed on

Improves type hinting on configuration arguments.
8d2c4ce

gugarosa committed on

Fixes flash-attn import with a try/except statement
9ed5987

gugarosa committed on
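
The commit message names the technique directly: wrap the flash-attn import in a try/except so the package stays an optional dependency. A minimal sketch, where the availability flag name is an assumption rather than the repository's actual identifier:

    try:
        import flash_attn  # optional dependency; only present when installed
        FLASH_ATTN_AVAILABLE = True
    except ImportError:
        flash_attn = None
        FLASH_ATTN_AVAILABLE = False  # fall back to the standard PyTorch attention path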

Adds support for flash-attn rotary embedding and fused dense layers.
90c38d9

gugarosa committed on

Adds support for MQA/GQA and attention mask during training / fine-tuning.
371fd51

gugarosa committed on

Upload modeling_mixformer_sequential.py
633bca1

gugarosa committed on

Upload README.md
769684a

gugarosa committed on

fix(phi-1): Checks length of `attention_mask` if it is passed as a direct tensor.
1f890f7

gugarosa committed on

Support for `attention_mask` in forward pass.
d22f35e

gugarosa committed on
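
With `attention_mask` accepted by the forward pass, padded batches can be run through the model. A hedged usage sketch via transformers; the repo id "microsoft/phi-1" and the pad-token workaround are assumptions, not confirmed by this history:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "microsoft/phi-1"  # assumed repo id for this commit history
    tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    tok.pad_token = tok.eos_token  # assumed: tokenizer ships without a pad token
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    batch = tok(["def add(a, b):", "print("], return_tensors="pt", padding=True)
    out = model(input_ids=batch["input_ids"], attention_mask=batch["attention_mask"])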

Upload MixFormerSequentialForCausalLM
44cca9f

suriyagunasekar committed on

Upload MixFormerSequentialForCausalLM
e96b200

suriyagunasekar committed on

Upload MixFormerSequentialForCausalLM
0f4ae0e

suriyagunasekar committed on