llama-3-neural-chat-v1-8b-AWQ / generation_config.json
Commit 9d58ca5 by Ubuntu: adding AWQ model
{
  "_from_model_config": true,
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": 128001,
  "transformers_version": "4.38.2"
}
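
The config above can be inspected with nothing more than a JSON parser. The following is a minimal sketch, assuming the file contents are available as a string; the names `RAW_CONFIG` and `summarize_generation_config` are hypothetical and introduced only for illustration. It pulls out the fields that shape decoding: `do_sample` (sampling rather than greedy decoding) and the begin/end-of-sequence token ids.

```python
import json

# Contents of generation_config.json, copied verbatim from the file above.
RAW_CONFIG = """
{
  "_from_model_config": true,
  "bos_token_id": 128000,
  "do_sample": true,
  "eos_token_id": 128001,
  "transformers_version": "4.38.2"
}
"""


def summarize_generation_config(raw: str) -> dict:
    """Parse the JSON text and return the fields that affect decoding.

    Hypothetical helper for illustration; not part of any library.
    """
    cfg = json.loads(raw)
    return {
        # Missing do_sample is treated as greedy decoding in this sketch.
        "sampling": cfg.get("do_sample", False),
        "bos_token_id": cfg["bos_token_id"],
        "eos_token_id": cfg["eos_token_id"],
    }


summary = summarize_generation_config(RAW_CONFIG)
print(summary)
```

In practice the Transformers library would normally read this file itself when the model is loaded, so a manual parse like this is mainly useful for checking what defaults a checkpoint ships with.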