Birchlabs/mosaicml-mpt-7b-chat-qlora
Text Generation
Transformers
PyTorch
5 datasets
mpt
Composer
MosaicML
llm-foundry
custom_code
text-generation-inference
arxiv:2205.14135
arxiv:2108.12409
arxiv:2010.04245
License: cc-by-nc-sa-4.0
Revision: 07e555c
7 contributors
History: 22 commits
Latest commit: Alex Birch, "gradient checkpointing for multi-query attention" (07e555c, unverified, over 1 year ago)
.gitattributes
1.48 kB
initial commit
over 1 year ago
README.md
7.22 kB
Update README.md
over 1 year ago
adapt_tokenizer.py
1.75 kB
Upload folder using huggingface_hub
over 1 year ago
attention.py
23.7 kB
gradient checkpointing for multi-query attention
over 1 year ago
blocks.py
2.65 kB
add support for AutoModelForCausalLM#from_pretrained()'s device_map='auto'. support gradient checkpointing, probably. add lots of type hints so I could understand what's going on. multiline long method signatures/calls (for easier comparison between checkpointed/non-checkpointed variants, and because these lines got even longer when I added type hints). make MPTForCausalLM#forward accept additional kwargs, since PeftModelForCausalLM#forward tries to send it an argument inputs_embeds=None, which it didn't like too much.
over 1 year ago
config.json
1.23 kB
Upload folder using huggingface_hub
over 1 year ago
configuration_mpt.py
9.08 kB
Upload folder using huggingface_hub
over 1 year ago
flash_attn_triton.py
28.2 kB
add flash_attn_triton.py (#9)
over 1 year ago
generation_config.json
91 Bytes
Upload folder using huggingface_hub
over 1 year ago
hf_prefixlm_converter.py
27.2 kB
Upload folder using huggingface_hub
over 1 year ago
is_torch_version.py
2.39 kB
add support for AutoModelForCausalLM#from_pretrained()'s device_map='auto'. support gradient checkpointing, probably. add lots of type hints so I could understand what's going on. multiline long method signatures/calls (for easier comparison between checkpointed/non-checkpointed variants, and because these lines got even longer when I added type hints). make MPTForCausalLM#forward accept additional kwargs, since PeftModelForCausalLM#forward tries to send it an argument inputs_embeds=None, which it didn't like too much.
over 1 year ago
meta_init_context.py
3.64 kB
Upload folder using huggingface_hub
over 1 year ago
modeling_mpt.py
20.1 kB
apply gradient checkpointing to Attention blocks
over 1 year ago
norm.py
2.56 kB
Upload folder using huggingface_hub
over 1 year ago
param_init_fns.py
12.6 kB
Upload folder using huggingface_hub
over 1 year ago
pytorch_model-00001-of-00002.bin
pickle; detected pickle imports (3): "torch.BFloat16Storage", "torch._utils._rebuild_tensor_v2", "collections.OrderedDict"
9.94 GB
LFS
Upload folder using huggingface_hub
over 1 year ago
pytorch_model-00002-of-00002.bin
pickle; detected pickle imports (3): "torch._utils._rebuild_tensor_v2", "torch.BFloat16Storage", "collections.OrderedDict"
3.36 GB
LFS
Upload folder using huggingface_hub
over 1 year ago
pytorch_model.bin.index.json
16 kB
Upload folder using huggingface_hub
over 1 year ago
special_tokens_map.json
174 Bytes
Upload folder using huggingface_hub
over 1 year ago
tokenizer.json
2.11 MB
Upload folder using huggingface_hub
over 1 year ago
tokenizer_config.json
237 Bytes
Upload folder using huggingface_hub
over 1 year ago
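
The "detected pickle imports" entries on the two weight shards name the globals that the pickle stream would import when loaded. As a rough illustration of how such a list can be derived without unpickling anything, here is a minimal sketch using only the Python standard library. The function name list_pickle_imports is hypothetical, the STACK_GLOBAL handling is a simplifying heuristic (it assumes the module and attribute names are the two most recently pushed strings), and this is not presented as the Hub's actual scanner. It works on raw pickle bytes; extracting the pickle stream from a PyTorch checkpoint file is out of scope here.

```python
import collections
import pickle
import pickletools
from typing import List, Set

# Opcodes that push a text string onto the pickle stack.
# STACK_GLOBAL (protocol 4+) consumes two such strings: module, then name.
_STRING_OPCODES = {"UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8"}


def list_pickle_imports(data: bytes) -> Set[str]:
    """Return the "module.name" globals referenced by a pickle stream.

    The stream is only disassembled with pickletools.genops; it is never
    executed, so no imports actually happen.
    """
    imports: Set[str] = set()
    strings: List[str] = []  # string arguments seen so far, in push order
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: arg is "module name", separated by one space.
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name in _STRING_OPCODES:
            strings.append(arg)
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Heuristic: assume the last two pushed strings are module and name.
            imports.add(f"{strings[-2]}.{strings[-1]}")
    return imports


if __name__ == "__main__":
    # A small stand-in payload; a state dict is typically an OrderedDict.
    payload = pickle.dumps(collections.OrderedDict(a=1), protocol=4)
    print(sorted(list_pickle_imports(payload)))
```

Because the stream is disassembled rather than loaded, this kind of check can be run on untrusted files before deciding whether to unpickle them.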