hkiyomaru committed on
Commit
37309df
1 Parent(s): 1710951

Update README.md

  1. README.md +3 -0
README.md CHANGED
@@ -136,6 +136,9 @@ The models have been fine-tuned on the following datasets.
 
  You can view the evaluation results of several LLMs on this [leaderboard](http://wandb.me/llm-jp-leaderboard). We used [llm-jp-eval](https://github.com/llm-jp/llm-jp-eval) (v1.3.0) for the evaluation.
 
+ Besides, we used LLM-as-a-judge frameworks, [Japanese Vicuna QA Benchmark](https://github.com/ku-nlp/ja-vicuna-qa-benchmark/) and [Japanese MT Bench](https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge), for evaluation.
+ For details, please refer to [our technical blog](https://llm-jp.nii.ac.jp/blog/2024/04/30/v2.0-release.html) (in Japanese).
+
  ## Risks and Limitations
 
  The models released here are still in the early stages of our research and development and have not been tuned to ensure outputs align with human intent and safety considerations.