openthaigpt/openthaigpt1.5-7b-instruct

kobkrit committed on Sep 30

Commit 49fd75c • 1 Parent(s): dcd8202

Update README.md

Files changed (1)

README.md +3 -1

@@ -88,7 +88,7 @@ Thai language multiple choice exams, Test on unseen test set, Zero-shot learning
 - E-mail: kobkrit@aieat.or.th
 ## Prompt Format
-Prompt format is based on Llama2 with a small modification (Adding "###" to specify the context part)
 ```
 <|im_start|>system\n{sytem_prompt}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n
 ```
@@ -175,6 +175,8 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 ```bash
 vllm serve openthaigpt/openthaigpt1.5-72b-instruct --tensor-parallel-size 4
 ```
 3. Run inference (CURL example)
 ```bash
 curl -X POST 'http://127.0.0.1:8000/v1/completions' \

 - E-mail: kobkrit@aieat.or.th
 ## Prompt Format
+Prompt format is based on ChatML.
 ```
 <|im_start|>system\n{sytem_prompt}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n
 ```
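For reference, the ChatML template in the hunk above can be filled in programmatically. This is a minimal sketch, not the maintainers' code: the helper name `build_chatml_prompt` is hypothetical, the parameter `system_prompt` corresponds to the README's `{sytem_prompt}` placeholder, and it assumes the `\n` sequences in the template stand for real newline characters.

```python
def build_chatml_prompt(system_prompt: str, prompt: str) -> str:
    """Fill the README's ChatML template (hypothetical helper).

    Assumes each "\\n" written in the template is an actual newline.
    """
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        # The prompt ends with an open assistant turn for the model to complete.
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    # Example inputs are illustrative only.
    print(build_chatml_prompt("You are a helpful assistant.", "Hello"))
```

When the model's tokenizer ships a chat template, `tokenizer.apply_chat_template` in `transformers` is usually the safer way to produce this string; the sketch only makes the layout explicit.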
 ```bash
 vllm serve openthaigpt/openthaigpt1.5-72b-instruct --tensor-parallel-size 4
 ```
+* Note, change ``--tensor-parallel-size 4`` to the amount of available GPU cards.
 3. Run inference (CURL example)
 ```bash
 curl -X POST 'http://127.0.0.1:8000/v1/completions' \