	---
	license: mit
	pipeline_tag: text-generation
	tags:
	- ocean
	- text-generation-inference
	- oceangpt
	language:
	- en
	- zh
	datasets:
	- zjunlp/OceanInstruct
	---


	<div align="center">
	<img src="logo.jpg" width="300px">

	OceanGPT: A Large Language Model for Ocean Science Tasks

	<p align="center">
	<a href="https://github.com/zjunlp/OceanGPT">Project</a> •
	<a href="https://arxiv.org/abs/2310.02031">Paper</a> •
	<a href="https://huggingface.co/collections/zjunlp/oceangpt-664cc106358fdd9f09aa5157">Models</a> •
	<a href="http://oceangpt.zjukg.cn/">Web</a> •
	<a href="#quickstart">Quickstart</a> •
	<a href="#citation">Citation</a>
	</p>


	</div>

	OceanGPT-2B-v0.1 is based on MiniCPM-2B and was trained on a bilingual (Chinese and English) ocean-domain dataset.


	## ⏩Quickstart
	### Download the model

	Download the model: [OceanGPT-2B-v0.1](https://huggingface.co/zjunlp/OceanGPT-2B-v0.1)

	```shell
	git lfs install
	git clone https://huggingface.co/zjunlp/OceanGPT-2B-v0.1
	```
	or
	```shell
	huggingface-cli download --resume-download zjunlp/OceanGPT-2B-v0.1 --local-dir OceanGPT-2B-v0.1 --local-dir-use-symlinks False
	```
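
The same download can also be scripted from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `snapshot_download` function. The wrapper name `download_oceangpt` is hypothetical, and its default folder simply mirrors the `--local-dir` value in the command above.

```python
def download_oceangpt(local_dir: str = "OceanGPT-2B-v0.1") -> str:
    """Download the model repository and return the local folder path."""
    # Imported inside the function so that defining it does not require the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="zjunlp/OceanGPT-2B-v0.1",
        local_dir=local_dir,
    )
```

Calling `download_oceangpt()` fetches the files over the network, and the returned path can then be used as `path` in the inference code.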
	### Inference

	```python
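	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	path = 'YOUR-MODEL-PATH'

	# device_map="auto" places the weights on the available device(s).
	# If loading fails because the checkpoint relies on custom model code,
	# passing trust_remote_code=True may be needed (an assumption, not confirmed here).
	model = AutoModelForCausalLM.from_pretrained(
	    path,
	    torch_dtype=torch.bfloat16,
	    device_map="auto"
	)
	tokenizer = AutoTokenizer.from_pretrained(path)

	prompt = "Which is the largest ocean in the world?"
	messages = [
	    {"role": "system", "content": "You are a helpful assistant."},
	    {"role": "user", "content": prompt}
	]
	# Render the conversation as a single prompt string using the model's chat template.
	text = tokenizer.apply_chat_template(
	    messages,
	    tokenize=False,
	    add_generation_prompt=True
	)
	# Move the inputs to the device the model was actually loaded onto.
	model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

	generated_ids = model.generate(
	    model_inputs.input_ids,
	    attention_mask=model_inputs.attention_mask,
	    max_new_tokens=512
	)
	# Drop the prompt tokens so that only the newly generated tokens are decoded.
	generated_ids = [
	    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
	]

	response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
	print(response)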
	```
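
The slicing step near the end of the snippet relies on `generate` returning, for a decoder-only model, the prompt tokens followed by the newly generated ones. A minimal sketch of that step on plain Python lists follows. `strip_prompt` is a hypothetical helper name, not part of transformers, and the token IDs are made up for illustration.

```python
def strip_prompt(input_ids, output_ids):
    """Drop the echoed prompt from each generated sequence."""
    return [
        out[len(inp):]  # keep only the tokens produced after the prompt
        for inp, out in zip(input_ids, output_ids)
    ]


# Made-up token IDs: a 3-token prompt followed by 2 generated tokens.
prompt_batch = [[101, 7, 42]]
output_batch = [[101, 7, 42, 900, 901]]
new_tokens = strip_prompt(prompt_batch, output_batch)
```

Without this step, the decoded response would begin with the full prompt text, including the chat template markup.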

	## 📌Models

	| Model Name | HuggingFace | WiseModel | ModelScope |
	|------------|-------------|-----------|------------|
	| OceanGPT-14B-v0.1 (based on Qwen) | <a href="https://huggingface.co/zjunlp/OceanGPT-14B-v0.1" target="_blank">14B</a> | <a href="https://wisemodel.cn/models/zjunlp/OceanGPT-14B-v0.1" target="_blank">14B</a> | <a href="https://modelscope.cn/models/ZJUNLP/OceanGPT-14B-v0.1" target="_blank">14B</a> |
	| OceanGPT-7B-v0.2 (based on Qwen) | <a href="https://huggingface.co/zjunlp/OceanGPT-7b-v0.2" target="_blank">7B</a> | <a href="https://wisemodel.cn/models/zjunlp/OceanGPT-7b-v0.2" target="_blank">7B</a> | <a href="https://modelscope.cn/models/ZJUNLP/OceanGPT-7b-v0.2" target="_blank">7B</a> |
	| OceanGPT-2B-v0.1 (based on MiniCPM) | <a href="https://huggingface.co/zjunlp/OceanGPT-2B-v0.1" target="_blank">2B</a> | <a href="https://wisemodel.cn/models/zjunlp/OceanGPT-2b-v0.1" target="_blank">2B</a> | <a href="https://modelscope.cn/models/ZJUNLP/OceanGPT-2B-v0.1" target="_blank">2B</a> |


	## 🌻Acknowledgement

	OceanGPT is built on open-source large language models, including [Qwen](https://huggingface.co/Qwen), [MiniCPM](https://huggingface.co/collections/openbmb/minicpm-2b-65d48bf958302b9fd25b698f), and [LLaMA](https://huggingface.co/meta-llama). Thanks for their great contributions!

	## Limitations

	- The model may hallucinate, producing fluent but factually incorrect statements.

	- We did not tune the model's self-identification, so it may describe itself as a Qwen, MiniCPM, LLaMA, or GPT series model.

	- The model's output is sensitive to the prompt tokens, so results may be inconsistent across multiple attempts.
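
If part of the run-to-run variation comes from sampling during decoding (an assumption, since the card does not say which decoding settings are used), greedy decoding is one way to reduce it. The sketch below builds keyword arguments for `model.generate`. The helper name `build_generation_kwargs` is hypothetical, and the sampling values are illustrative rather than recommended settings.

```python
def build_generation_kwargs(deterministic: bool = True, max_new_tokens: int = 512) -> dict:
    """Return keyword arguments for `model.generate` (hypothetical helper)."""
    kwargs = {"max_new_tokens": max_new_tokens}
    if deterministic:
        # Greedy decoding: always pick the most likely next token,
        # so repeated runs on the same prompt should agree.
        kwargs["do_sample"] = False
    else:
        # Sampling: outputs vary between runs; these values are illustrative only.
        kwargs.update(do_sample=True, temperature=0.7, top_p=0.8)
    return kwargs
```

With the Quickstart code, this could be used as `model.generate(model_inputs.input_ids, **build_generation_kwargs())`. Greedy decoding does not remove sensitivity to how the prompt is worded.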


	## 🚩Citation

	Please cite the following paper if you use OceanGPT in your work.

	```bibtex
	@article{bi2023oceangpt,
	title={OceanGPT: A Large Language Model for Ocean Science Tasks},
	author={Bi, Zhen and Zhang, Ningyu and Xue, Yida and Ou, Yixin and Ji, Daxiong and Zheng, Guozhou and Chen, Huajun},
	journal={arXiv preprint arXiv:2310.02031},
	year={2023}
	}
	```