---
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
language:
- en
tags:
- KALE-LM
- science
- chemistry
pipeline_tag: text-generation
---
# Llama3-KALE-LM-Chem-8B

## Introduction

We are thrilled to present Llama3-KALE-LM-Chem-8B, our first open-source KALE-LM, which specializes in chemistry.

## Training Details

Starting from Meta-Llama-3-8B-Instruct, we continued pre-training on a large amount of data and then post-trained the model with supervised fine-tuning (SFT).

## Benchmarks

### Open Benchmarks
| Models | ChemBench | MMLU | MMLU-Chem | SciQ | IE(Acc) | IE(LS) |
| ---- | ---- | ---- | ---- | ---- | ---- | ---- |
| GPT-3.5 | 47.15 | 69.75 | 53.32 | 89.60 | 52.98 | 68.28 |
| GPT-4 | 53.72 | 78.67 | 63.70 | 94.10 | 54.20 | 69.74 |
| Llama3-8B-Instruct | 46.02 | 68.30 | 51.10 | 93.30 | 45.83 | 61.22 |
| LlaSMol | 28.47 | 54.47 | 33.24 | 72.30 | 2.16 | 3.23 |
| ChemDFM | 44.44 | 58.11 | 45.60 | 86.70 | 7.61 | 11.49 |
| ChemLLM-7B-Chat | 34.16 | 61.79 | 48.39 | 94.00 | 29.66 | 39.17 |
| ChemLLM-7B-Chat-1.5-SFT | 42.44 | 63.56 | 49.63 | **95.10** | 14.96 | 19.61 |
| **Llama3-KALE-LM-Chem-8B** | **52.41** | **68.74** | **53.83** | 91.50 | **67.50** | **78.37** |

Bold marks the best score among the open-source models; GPT-3.5 and GPT-4 are listed for reference.

#### ChemBench Details (Evaluated By OpenCompass)

| Models | NC | PP (property) | M2C | C2M | PP (product) | RS | YP | TP | SP | Average |
| ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ | ------ |
| GPT-3.5 | 46.93 | 56.98 | 85.28 | 38.25 | 43.67 | 42.33 | 30.33 | 42.57 | 38.00 | 47.15 |
| GPT-4 | 54.82 | 65.02 | 92.64 | 52.88 | 62.67 | 52.67 | 42.33 | 24.75 | 35.67 | 53.72 |
| Llama3-8B-Instruct | 51.31 | 27.79 | 90.30 | 40.88 | 34.00 | 30.00 | 45.33 | 60.89 | 33.67 | 46.02 |
| LlaSMol | 27.78 | 29.34 | 31.44 | 23.38 | 25.67 | 24.00 | 37.33 | 34.65 | 22.67 | 28.47 |
| ChemDFM | 36.92 | 55.57 | 83.95 | 42.00 | 40.00 | 37.33 | 39.00 | 33.17 | 32.00 | 44.44 |
| ChemLLM-7B-Chat | 41.05 | 29.76 | 85.28 | 26.12 | 26.00 | 24.00 | 20.00 | 24.26 | 31.00 | 34.16 |
| ChemLLM-7B-Chat-1.5-SFT | 50.06 | 49.51 | 85.28 | 38.75 | 38.00 | 26.67 | 28.33 | 31.68 | 33.67 | 42.44 |
| Llama3-KALE-LM-Chem-8B | 63.58 | 58.39 | 92.98 | 44.50 | 48.67 | 38.33 | 46.33 | 44.55 | 34.33 | 52.41 |
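
The Average column is consistent with an unweighted mean of the nine sub-task scores. The short check below is a sketch under that assumption, using the Llama3-KALE-LM-Chem-8B row from the table above; the variable names are illustrative.

```python
# Sub-task scores for Llama3-KALE-LM-Chem-8B, copied from the table above.
scores = [63.58, 58.39, 92.98, 44.50, 48.67, 38.33, 46.33, 44.55, 34.33]

# Assumption: the Average column is the unweighted mean, rounded to two decimals.
average = round(sum(scores) / len(scores), 2)
print(average)  # 52.41
```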

## Quick Start 

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "USTC-KnowledgeComputingLab/Llama3-KALE-LM-Chem-8B"

# Load the weights in their saved dtype and let device_map place them on the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template and append the assistant header.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Move the inputs to the device that device_map chose for the model.
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Unpack the inputs so the attention mask is passed along with the input IDs.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)
# Drop the prompt tokens so that only the newly generated tokens remain.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
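
Since the model specializes in chemistry, a domain question may be a more representative prompt than the generic one used above. The following is a minimal sketch and not part of the original card: `build_messages` is a hypothetical helper, and the question is an arbitrary example. The list it returns has the same shape as `messages` in the Quick Start and could be passed to `tokenizer.apply_chat_template` in its place.

```python
def build_messages(question, system_prompt="You are a helpful assistant."):
    """Return a single-turn conversation in the chat-message format used above."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


# Illustrative chemistry question (an assumption, not taken from the model card).
question = "What is the IUPAC name of the molecule with SMILES CC(=O)O?"
messages = build_messages(question)

for message in messages:
    print(f"{message['role']}: {message['content']}")
```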

## Cite This Work

```
@article{dai2024kale,
  title={KALE-LM: Unleash The Power Of AI For Science Via Knowledge And Logic Enhanced Large Model},
  author={Dai, Weichen and Chen, Yezeng and Dai, Zijie and Huang, Zhijie and Liu, Yubo and Pan, Yixuan and Song, Baiyang and Zhong, Chengli and Li, Xinhe and Wang, Zeyu and others},
  journal={arXiv preprint arXiv:2409.18695},
  year={2024}
}
```