jacobfulano committed
Commit e27b4b2
1 Parent(s): ee3acd5

Update README.md

Files changed (1):
README.md +1 -1
README.md CHANGED
@@ -79,7 +79,7 @@ Note: This model requires that `trust_remote_code=True` be passed to the `from_p
 This is because we use a custom `MPT` model architecture that is not yet part of the Hugging Face `transformers` package.
 `MPT` includes options for many training efficiency features such as [FlashAttention](https://arxiv.org/pdf/2205.14135.pdf), [ALiBi](https://arxiv.org/abs/2108.12409), [QK LayerNorm](https://arxiv.org/abs/2010.04245), and more.
 
-To use the optimized [triton implementation](https://github.com/openai/triton) of FlashAttention (`pip install flash_attn`), you can load the model with `attn_impl='triton'` and move the model to `bfloat16`:
+To use the optimized [triton implementation](https://github.com/openai/triton) of FlashAttention, you can load the model with `attn_impl='triton'` and move the model to `bfloat16`:
 ```python
 config = transformers.AutoConfig.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
 config.attn_config['attn_impl'] = 'triton'
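
The hunk above shows only the first lines of the README's loading example. For reference, a minimal end-to-end sketch of what the edited instructions describe, assuming the triton FlashAttention dependency is installed and a CUDA device is available (the `'cuda:0'` device string is an assumption):

```python
import torch
import transformers

# Load the MPT-7B config and switch the attention implementation to the
# triton FlashAttention kernel described in the README text above.
config = transformers.AutoConfig.from_pretrained('mosaicml/mpt-7b', trust_remote_code=True)
config.attn_config['attn_impl'] = 'triton'

# Instantiate the model in bfloat16, the dtype the README pairs with the
# triton kernel, then move it to a GPU ('cuda:0' is an assumed device).
model = transformers.AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b',
    config=config,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model.to(device='cuda:0')
```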