
After setting the prompt, each sentence ends with <|im_end|>

#2
by online2311 - opened

Is there a problem with the chat_template?

I use vLLM for inference:
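For context, ChatML-style templates (which use the <|im_start|>/<|im_end|> markers seen in the output) wrap every turn in those tokens, so the literal string <|im_end|> appearing in responses usually means the server is not treating it as a stop token. A minimal sketch of the ChatML rendering convention (an illustration of the general format, not this model's actual chat_template):

```python
def chatml_format(messages):
    # Render a list of {"role", "content"} dicts in ChatML style:
    # each turn is wrapped in <|im_start|>role ... <|im_end|> markers,
    # and the prompt ends with an open assistant turn for generation.
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)
```

Because generation is expected to stop when the model emits <|im_end|>, the marker leaking into the text points at stop-token handling rather than the prompt itself.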

docker run -d --name vllm --runtime nvidia --gpus all \
  -v ./huggingface:/root/.cache/huggingface \
  --ipc=host --env "NCCL_P2P_DISABLE=1" \
  -p 8000:8000 --restart=always \
  vllm/vllm-openai:latest \
  --model CausalLM/35b-beta-long --tensor-parallel-size=8 --enforce-eager
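One common workaround (an assumption, not a confirmed fix for this model) is to pass <|im_end|> as an explicit stop sequence in requests to vLLM's OpenAI-compatible endpoint, so the server trims it from the returned text. A sketch of such a request body:

```python
import json

# Hypothetical request body for vLLM's OpenAI-compatible
# /v1/chat/completions endpoint; the standard "stop" parameter
# tells the server to cut generation at the given marker.
payload = {
    "model": "CausalLM/35b-beta-long",
    "messages": [{"role": "user", "content": "Hello"}],
    "stop": ["<|im_end|>"],
}
body = json.dumps(payload)
```

If this removes the marker, the root cause is likely that the token is not registered as an EOS/stop token for this deployment rather than a broken chat_template.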
