Model Card for climategpt/climategpt-70b

This model is the 70B-parameter variant of the ClimateGPT model release.
Starting from the Llama2 70B weights, the model undergoes continued pretraining and instruction finetuning on climate data.
The model can answer questions and follow instructions, and is tailored to the climate domain.

Overview

Developed by: AppTek, Eqtylab, Erasmus AI
Model type: decoder-only Transformer
Language(s) (NLP): natively supported: English; supported via cascaded MT on web interface: Arabic, Bangla, Chinese (simplified), Dutch, Finno-Ugric, French, Germanic, Greek, Hebrew, Indonesian, Japanese, Korean, Lithuanian, Pashto, Persian, Portuguese, Russian, Spanish, Thai, Turkish, Vietnamese
License: TO BE ADDED
Finetuned from model: Llama2 70B
Repository: https://huggingface.co/climategpt/climategpt-70b
Paper: TO BE ADDED
Demo: TO BE ADDED

Uses

This model is intended for direct use as a question-answering model specialized in the climate domain.
It is aimed at providing useful feedback to decision makers, scientists, and journalists involved in climate discussions.
Interested developers can also use it as a starting point for further finetuning.
The model is NOT intended to be a general-purpose chatbot (although it has chat capabilities).
For the full system, including cascaded MT, RAG, etc., we recommend that users visit our demo website: TO BE ADDED.
For hands-on finetuning, deployment, and inference, we recommend using the Huggingface helpers directly.
For in-depth model conversion and finetuning, we recommend using https://github.com/epfLLM/Megatron-LLM/.
Despite the development team's efforts to eliminate them, this model, like any other chat-capable LLM, may generate biased, offensive, or inaccurate responses.

How to Get Started with the Model

After downloading the HF-formatted model, the HF helpers should work out of the box. For example, the model can be evaluated with https://github.com/EleutherAI/lm-evaluation-harness by passing the model identifier as --model_args pretrained=climategpt/climategpt-70b.
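The two paths mentioned above can be sketched roughly as follows. This is a minimal illustration, not the authors' setup: it assumes the `transformers`, `torch`, and `accelerate` packages are installed and that enough GPU memory is available for a 70B model. The helper names `load_model` and `lm_eval_args` are hypothetical, the half-precision setting is an assumption, and the `lm_eval` entry point with `--model hf` follows recent lm-evaluation-harness releases and may differ in older ones.

```python
import shlex

MODEL_ID = "climategpt/climategpt-70b"  # identifier taken from this model card


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model with the standard Hugging Face helpers.

    Imports are deferred so this file can be imported on a machine
    that does not have torch/transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # assumption: half precision to save memory
        device_map="auto",          # needs `accelerate`; spreads layers over GPUs
    )
    return tokenizer, model


def lm_eval_args(tasks, model_id: str = MODEL_ID, batch_size: int = 1):
    """Build an argument list for the lm-evaluation-harness command line."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_id}",
        "--tasks", ",".join(tasks),
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    # Print a shell command; "hellaswag" is only an example task name.
    print(shlex.join(lm_eval_args(["hellaswag"])))
```

The argument list can be passed to `subprocess.run` or printed and pasted into a terminal. The task list is left to the caller because this card does not yet name the tasks used for evaluation.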

Training

For the Llama2 training data, we refer the user to https://huggingface.co/meta-llama/Llama-2-70b-chat-hf.
For continued pretraining, 4.2B climate domain tokens (tokenized by the Llama tokenizer) are used.
For instruction finetuning, about 579K instruction-completion pairs (from both the climate domain and the general domain) are used.

Evaluation

Automatic evaluation is done via https://github.com/EleutherAI/lm-evaluation-harness, in which we also implemented custom evaluation tasks. TO BE ADDED
We also perform human evaluation with experts in the climate domain. TO BE ADDED

Environmental Impact

Hardware Type: H100
Hours used: 2300
Cloud Provider: TO BE ADDED
Compute Region: TO BE ADDED
Carbon Emitted: TO BE ADDED
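
Once the missing fields are known, the carbon figure could be estimated from energy use and grid carbon intensity. The sketch below is illustrative only. It assumes the hours listed above are GPU-hours (the card does not say), and the function name `estimate_emissions_kg` is hypothetical. In the example call, only the hour count comes from this card; the power draw, PUE, and grid intensity are placeholder values, not reported figures.

```python
def estimate_emissions_kg(gpu_hours, gpu_power_kw, pue, grid_kg_co2_per_kwh):
    """Estimate CO2-equivalent emissions in kilograms.

    energy (kWh) = GPU-hours x average power per GPU (kW) x data-center PUE
    emissions    = energy x grid carbon intensity (kg CO2eq per kWh)
    """
    if min(gpu_hours, gpu_power_kw, grid_kg_co2_per_kwh) < 0 or pue < 1:
        raise ValueError("inputs must be non-negative and PUE must be >= 1")
    energy_kwh = gpu_hours * gpu_power_kw * pue
    return energy_kwh * grid_kg_co2_per_kwh


# Only gpu_hours comes from this card; the other three inputs are
# HYPOTHETICAL placeholders chosen to show the arithmetic.
example_kg = estimate_emissions_kg(
    gpu_hours=2300,
    gpu_power_kw=0.7,
    pue=1.1,
    grid_kg_co2_per_kwh=0.4,
)
print(f"{example_kg:.1f} kg CO2eq (illustrative only)")
```

The result scales linearly with each input, so substituting the actual cloud provider's PUE and the compute region's grid intensity changes the estimate proportionally.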

Citation

BibTeX: TO BE ADDED