Hugging Face
Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up
bird-of-paradise
/
deepseek-mla
like
5
Text Generation
Transformers
PyTorch
English
deepseek-mla
attention-mechanism
mla
efficient-attention
arxiv:
2405.04434
License:
mit
Model card
Files
Files and versions
Community
1
Use this model
New discussion
New pull request
Resources
PR & discussions documentation
Code of Conduct
Hub documentation
All
Discussions
Pull requests
View closed (0)
Issue in mla.py
#1 opened 8 days ago by
NingXu24