ChemGPT2-QA-72B is fine-tuned from the Qwen2-72B-Instruct model. Our training data, ChemGPT-2.0-Data, has been open-sourced and is available at https://huggingface.co/datasets/ALmonster/ChemGPT-2.0-Data. We evaluated the model on the three chemistry tasks of C-Eval and compared it against GPT-3.5 and GPT-4. The results are as follows:
C-Eval
| Models | college_chemistry | high_school_chemistry | middle_school_chemistry | AVG |
|---|---|---|---|---|
| GPT-3.5 | 0.397 | 0.529 | 0.714 | 0.547 |
| GPT-4 | 0.594 | 0.558 | 0.811 | 0.654 |
| ChemGPT2-QA-72B | 0.710 | 0.936 | 0.995 | 0.880 |
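For reference, the training data mentioned above can be loaded with the Hugging Face datasets library. A minimal sketch (the "train" split name is an assumption; check the dataset card for the actual schema):

from datasets import load_dataset

# Load the open-sourced training data from the Hugging Face Hub.
ds = load_dataset("ALmonster/ChemGPT-2.0-Data")
print(ds)  # shows the available splits and columns

# "train" is an assumption; use whichever split print(ds) reports.
print(ds["train"][0])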
Quickstart
The following code snippet shows how to load the tokenizer and model and generate content with apply_chat_template.
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "ALmonster/ChemGPT2-QA-72B",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("ALmonster/ChemGPT2-QA-72B")

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
# Render the chat as a single prompt string using the model's chat template.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
# Strip the prompt tokens so only the newly generated text is decoded.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
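To carry on a multi-turn conversation, append the assistant's reply to messages and repeat the same steps. A minimal sketch reusing the objects above (the follow-up question is only an illustration):

# Append the previous reply and a follow-up question, then generate again.
messages.append({"role": "assistant", "content": response})
messages.append({"role": "user", "content": "What are typical applications of such models in chemistry?"})

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(device)
generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])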
vLLM
We recommend deploying the model on 4 A100 GPUs. You can start the vLLM OpenAI-compatible server with the following command in a terminal:
python -m vllm.entrypoints.openai.api_server --served-model-name chemgpt --model path/to/chemgpt --gpu-memory-utilization 0.98 --tensor-parallel-size 4 --port 6000
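Since this entrypoint exposes an OpenAI-compatible API, you can also query the server with the official openai Python client. A minimal non-streaming sketch (the api_key value is a placeholder; vLLM does not verify it by default):

from openai import OpenAI

# Point the client at the local vLLM server started above.
client = OpenAI(base_url="http://localhost:6000/v1", api_key="EMPTY")

completion = client.chat.completions.create(
    model="chemgpt",  # must match --served-model-name
    messages=[{"role": "user", "content": "Give me an introduction to NaOH."}],
)
print(completion.choices[0].message.content)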
For streaming output, you can use the following client code:
import requests
import json

def general_chemgpt_stream(inputs, history):
    url = 'http://localhost:6000/v1/chat/completions'
    history += [{"role": "user", "content": inputs}]
    headers = {"User-Agent": "vLLM Client"}
    pload = {
        "model": "chemgpt",
        "stream": True,
        "messages": history
    }
    assistant_reply = ''
    response = requests.post(url,
                             headers=headers,
                             json=pload,
                             stream=True)
    # The server streams server-sent events, one "data: ..." line per chunk.
    for chunk in response.iter_lines(chunk_size=1,
                                     decode_unicode=False,
                                     delimiter=b"\n"):
        if chunk:
            string_data = chunk.decode("utf-8")
            try:
                # Strip the leading "data: " prefix before parsing the JSON payload.
                json_data = json.loads(string_data[6:])
                delta_content = json_data["choices"][0]["delta"]["content"]
                assistant_reply += delta_content
                yield delta_content
            except KeyError:
                # The first chunk carries the role rather than content.
                delta_content = json_data["choices"][0]["delta"]["role"]
            except json.JSONDecodeError:
                # The stream ends with a "data: [DONE]" sentinel, which is not
                # valid JSON; record the assembled reply in the history.
                history += [{
                    "role": "assistant",
                    "content": assistant_reply,
                    "tool_calls": []
                }]
                assert '[DONE]' == chunk.decode("utf-8")[6:]

inputs = 'Give me an introduction to NaOH.'
history_chem = []
for response_text in general_chemgpt_stream(inputs, history_chem):
    print(response_text, end='')
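Two details of this client are worth noting. Each streamed chunk arrives as a server-sent event prefixed with "data: ", which is why the code strips the first six characters before parsing the JSON, and the stream terminates with a "data: [DONE]" sentinel that is handled in the JSONDecodeError branch. The function also mutates history in place, so the caller's list accumulates both the user turn and the assistant reply, making multi-turn chat as simple as calling general_chemgpt_stream again with the same list.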