stockmark/stockmark-100b

omitakahiro committed on May 15, 2024

Commit c4e5ef1 · verified · 1 Parent(s): 10cf9e9

Update README.md

Files changed (1)

README.md +37 -0

@@ -62,6 +62,43 @@ English data is sampled from [RedPajama-Data](https://github.com/togethercompute
 - Container: [Pytorch NGC Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
 - Library: [Megatron-LM](https://github.com/NVIDIA/Megatron-LM)
+## Performance
+**Stockmark Business Questions**
+Dataset: https://huggingface.co/datasets/stockmark/business-questions
+| model | accuracy |
+|:---:|:---:|
+|stockmark-100b-instruct| 0.90 |
+|stockmark-13b-instruct| 0.80 |
+|GPT-3.5-turbo[^1]| 0.42 |
+[^1]: gpt-3.5-turbo-0613
+**Japanese Vicuna QA Benchmark**
+We excluded the categories that require calculation and coding, and used the remaining 60 questions for evaluation.
+GitHub: https://github.com/ku-nlp/ja-vicuna-qa-benchmark
+| model | average score |
+|:---:|:---:|
+|stockmark-100b-instruct| 5.97 |
+|tokyotech-llm/Swallow-70b-instruct-hf| 5.59 |
+|GPT-3.5 (text-davinci-003)| 5.08 |
+**Inference speed**
+| model | time [s] for generating 100 characters in Japanese |
+|:---:|:---:|
+|stockmark-100b-instruct| 1.86 |
+| gpt-3.5-turbo | 2.15 |
+| gpt-4-turbo | 5.48 |
+|tokyotech-llm/Swallow-70b-instruct-hf| 2.22 |
+For local LLMs, we measured the inference time using AWS Inferentia2.
 ## License
 [MIT](https://opensource.org/licenses/MIT)
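
The inference-speed table added in this commit reports seconds per 100 generated Japanese characters, which is awkward to compare at a glance. The following is a minimal sketch, not part of the README, that converts those figures into characters per second and into a ratio against a chosen baseline. The timings are copied from the table; the names `SECONDS_PER_100_CHARS`, `chars_per_second`, and `slowdown_vs` are hypothetical helpers introduced only for illustration.

```python
# Seconds needed to generate 100 Japanese characters, copied from the
# "Inference speed" table in the diff above.
SECONDS_PER_100_CHARS = {
    "stockmark-100b-instruct": 1.86,
    "gpt-3.5-turbo": 2.15,
    "gpt-4-turbo": 5.48,
    "tokyotech-llm/Swallow-70b-instruct-hf": 2.22,
}


def chars_per_second(seconds_per_100_chars: float) -> float:
    """Convert 'seconds per 100 characters' into characters per second."""
    return 100.0 / seconds_per_100_chars


def slowdown_vs(baseline: str, other: str) -> float:
    """Return how many times longer `other` takes than `baseline`."""
    return SECONDS_PER_100_CHARS[other] / SECONDS_PER_100_CHARS[baseline]


if __name__ == "__main__":
    # Print models from fastest to slowest.
    for model, secs in sorted(SECONDS_PER_100_CHARS.items(), key=lambda kv: kv[1]):
        print(f"{model}: {chars_per_second(secs):.1f} chars/s")
```

These ratios only restate the table. As the README notes, the local models were measured on AWS Inferentia2, so the comparison with the hosted GPT models is not like-for-like on hardware.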