Trelis/Llama-2-7b-chat-hf-hosted-inference-8bit
Text Generation
Transformers
PyTorch
Safetensors
English
llama
facebook
meta
llama-2
hosted inference
8 bit
8bit
8-bit precision
Inference Endpoints
text-generation-inference
arxiv:2307.09288
Quantization method?
#3 opened about 2 hours ago by monology