shinnosukeono committed on
Commit 21c1e36 · verified · 1 Parent(s): fffa5ce

Update README.md

Files changed (1)
  1. README.md +8 -12
README.md CHANGED
@@ -75,18 +75,14 @@ Compared to Meditron3-Qwen2.5-7B and Llama3.1-Swallow-8B-Instruct-v0.3, JPharmat
  **BibTeX:**
 
  ```
- @misc{sukeda_japanese_2025,
- title = {A {Japanese} {Language} {Model} and {Three} {New} {Evaluation} {Benchmarks} for {Pharmaceutical} {NLP}},
- url = {http://arxiv.org/abs/2505.16661},
- doi = {10.48550/arXiv.2505.16661},
- abstract = {We present a Japanese domain-specific language model for the pharmaceutical field, developed through continual pretraining on 2 billion Japanese pharmaceutical tokens and 8 billion English biomedical tokens. To enable rigorous evaluation, we introduce three new benchmarks: YakugakuQA, based on national pharmacist licensing exams; NayoseQA, which tests cross-lingual synonym and terminology normalization; and SogoCheck, a novel task designed to assess consistency reasoning between paired statements. We evaluate our model against both open-source medical LLMs and commercial models, including GPT-4o. Results show that our domain-specific model outperforms existing open models and achieves competitive performance with commercial ones, particularly on terminology-heavy and knowledge-based tasks. Interestingly, even GPT-4o performs poorly on SogoCheck, suggesting that cross-sentence consistency reasoning remains an open challenge. Our benchmark suite offers a broader diagnostic lens for pharmaceutical NLP, covering factual recall, lexical variation, and logical consistency. This work demonstrates the feasibility of building practical, secure, and cost-effective language models for Japanese domain-specific applications, and provides reusable evaluation resources for future research in pharmaceutical and healthcare NLP. Our model, codes, and datasets are released at https://github.com/EQUES-Inc/pharma-LLM-eval.},
- urldate = {2025-05-30},
- publisher = {arXiv},
- author = {Sukeda, Issey and Fujii, Takuro and Buma, Kosei and Sasaki, Shunsuke and Ono, Shinnosuke},
- month = may,
- year = {2025},
- note = {arXiv:2505.16661 [cs]},
- annote = {Comment: 15 pages, 9 tables, 5 figures}
+ @misc{ono2025japaneselanguagemodelnew,
+ title={A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP},
+ author={Shinnosuke Ono and Issey Sukeda and Takuro Fujii and Kosei Buma and Shunsuke Sasaki},
+ year={2025},
+ eprint={2505.16661},
+ archivePrefix={arXiv},
+ primaryClass={cs.CL},
+ url={https://arxiv.org/abs/2505.16661},
  }
 
  ```