File size: 2,381 Bytes
6f63316 305d238 2378497 6f63316 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
---
license: bsd
---
Welcome to Qwen2-72B-Instruct-math model, which is used for solving Math Problem.
<div align="center">
<h1>Welcome to LLM Math Solver</h1>
<h4 align="center">
<a href="https://percent4.github.io/llm_math_solver/"><img src="https://img.shields.io/badge/📄-docs-000000?style=for-the-badge&colorA=09c&colorB=555" height='35px' alt="Docs"></a>
</h4>
<p>LLM Math Solver: using LLM to solve MATH problems.
</p>
<h1></h1>
</div>
## 评估结果
不同模型经过微调的数学能力测评表如下:
| 基座模型 | GSM8K | MATH | 样本数 |
|---------------------|--------|--------|------|
| QWen1.5-32B | 79.68% | 43.58% | 2402 |
| Yi-1.5-34B | 83.47% | 52.76% | 3480 |
| Yi-1.5-34B-Chat | 85.67% | 57.22% | 3479 |
| QWen-2-72B-Instruct | **93.03%** | **68.54%** | 3469 |
其它模型:
|模型|GSM8K | MATH|
|---|---|---|
|GPT-4o-0513|95.8%|76.6%|
|Claude-3.5-Sonnet|96.4%|71.1%|
|GEMINI-1.5-PRO(May 2024)|/|67.7%|
|DeepSeek-Coder-V2-Instruct(236B)|94.9%|75.7%|
## 使用方法
## 参考文献
关于该模型使用的训练数据、训练方法和相关文章,可以参考Github上项目: [llm_math_solver](https://github.com/percent4/llm_math_solver).
文章如下:
1. [NLP(九十七)大模型数学解题能力的初步探索](https://mp.weixin.qq.com/s?__biz=MzU2NTYyMDk5MQ==&mid=2247486824&idx=1&sn=fd6b36cf78aead227359606a7270516d&chksm=fcb9b4f8cbce3dee332335092f576c703ccdc55598cf45cb7f483f822ba5c72590019384d12a&token=321761101&lang=zh_CN#rd)
2. [NLP(九十九)大模型的数学能力微调及测评](https://mp.weixin.qq.com/s?__biz=MzU2NTYyMDk5MQ==&mid=2247486889&idx=1&sn=27c1a40d3af462f43a80a1ed401843f6&chksm=fcb9b439cbce3d2fd73e753618e0b32027314648eb13dc8b48bb9e713ad5313777c1ef27ce46&token=390124673&lang=zh_CN#rd)
3. [NLP(一百)大模型数学能力测评](https://mp.weixin.qq.com/s?__biz=MzU2NTYyMDk5MQ==&mid=2247486909&idx=1&sn=31b01bd4155b2c9ca15e2a7ae9f4de15&chksm=fcb9b42dcbce3d3bb473cf138f0f0f9a71addeff934900d155b6b90fb2a5857c1926b8aa0e9d&token=584142844&lang=zh_CN#rd)
4. [Open WebUI的Pipelines学习之使用大模型解数学题](https://mp.weixin.qq.com/s?__biz=MzU2NTYyMDk5MQ==&mid=2247487013&idx=1&sn=6a6786ba8c8c7cfdbc02ef558adefe71&chksm=fcb9b7b5cbce3ea37f8fb61e743d0ea0a7d4f5d6b8e8b2c7a80171a5c8c217524d8f307c0146&token=120899150&lang=zh_CN#rd) |