7 contributors

History: 22 commits

Alex Birch

gradient checkpointing for multi-query attention

07e555c unverified over 1 year ago

.gitattributes

1.48 kB

initial commit over 1 year ago
README.md

7.22 kB

Update README.md over 1 year ago
adapt_tokenizer.py

1.75 kB

Upload folder using huggingface_hub over 1 year ago
attention.py

23.7 kB

gradient checkpointing for multi-query attention over 1 year ago
blocks.py

2.65 kB

add support for AutoModelForCausalLM#from_pretrained()'s device_map='auto'. support gradient checkpointing, probably. add lots of type hints so I could understand what's going on. multiline long method signatures/calls (for easier comparison between checkpointed/non-checkpointed variants, and because these lines got even longer when I added type hints). make MPTForCausalLM#forward accept additional kwargs, since PeftModelForCausalLM#forward tries to send it an argument inputs_embeds=None, which it didn't like too much. over 1 year ago
config.json

1.23 kB

Upload folder using huggingface_hub over 1 year ago
configuration_mpt.py

9.08 kB

Upload folder using huggingface_hub over 1 year ago
flash_attn_triton.py

28.2 kB

add flash_attn_triton.py (#9) over 1 year ago
generation_config.json

91 Bytes

Upload folder using huggingface_hub over 1 year ago
hf_prefixlm_converter.py

27.2 kB

Upload folder using huggingface_hub over 1 year ago
is_torch_version.py

2.39 kB

add support for AutoModelForCausalLM#from_pretrained()'s device_map='auto'. support gradient checkpointing, probably. add lots of type hints so I could understand what's going on. multiline long method signatures/calls (for easier comparison between checkpointed/non-checkpointed variants, and because these lines got even longer when I added type hints). make MPTForCausalLM#forward accept additional kwargs, since PeftModelForCausalLM#forward tries to send it an argument inputs_embeds=None, which it didn't like too much. over 1 year ago
meta_init_context.py

3.64 kB

Upload folder using huggingface_hub over 1 year ago
modeling_mpt.py

20.1 kB

apply gradient checkpointing to Attention blocks over 1 year ago
norm.py

2.56 kB

Upload folder using huggingface_hub over 1 year ago
param_init_fns.py

12.6 kB

Upload folder using huggingface_hub over 1 year ago
pytorch_model-00001-of-00002.bin
Detected Pickle imports (3)
- "torch.BFloat16Storage",
- "torch._utils._rebuild_tensor_v2",
- "collections.OrderedDict"
9.94 GB
LFS

Upload folder using huggingface_hub over 1 year ago
pytorch_model-00002-of-00002.bin
Detected Pickle imports (3)
- "torch._utils._rebuild_tensor_v2",
- "torch.BFloat16Storage",
- "collections.OrderedDict"
3.36 GB
LFS

Upload folder using huggingface_hub over 1 year ago
pytorch_model.bin.index.json

16 kB

Upload folder using huggingface_hub over 1 year ago
special_tokens_map.json

174 Bytes

Upload folder using huggingface_hub over 1 year ago
tokenizer.json

2.11 MB

Upload folder using huggingface_hub over 1 year ago
tokenizer_config.json

237 Bytes

Upload folder using huggingface_hub over 1 year ago
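
The "Detected Pickle imports" entries on the two pytorch_model-*.bin shards name the globals a pickle stream would import when loaded. The listing does not say how they were detected, so the following is only a minimal sketch of one way to produce such a list for a raw pickle byte stream, using the standard-library pickletools module. The function name list_pickle_imports is hypothetical, and the STACK_GLOBAL handling is a simplifying assumption: it expects the module and name strings to be pushed directly before the opcode, and does not follow memo lookups.

```python
import collections
import io
import pickle
import pickletools


def list_pickle_imports(data: bytes) -> set:
    """Return 'module.name' for each global referenced by a pickle stream.

    Hypothetical helper for illustration only. The stream is disassembled,
    never unpickled, so no code from the file is executed.
    """
    imports = set()
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(io.BytesIO(data)):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the argument is "module name", space-separated.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocol 4+: module and name are the two most recent strings
            # (assumption: neither was fetched from the memo).
            imports.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return imports


if __name__ == "__main__":
    payload = pickle.dumps(collections.OrderedDict(a=1), protocol=4)
    print(sorted(list_pickle_imports(payload)))
```

Disassembling rather than loading is the point of the design: the names can be reviewed before deciding whether to trust the file.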
