---
license: mit
pipeline_tag: text-generation
tags:
- ocean
- text-generation-inference
- oceangpt
language:
- en
- zh
datasets:
- zjunlp/OceanInstruct
---
OceanGPT-2B-v0.1 is based on MiniCPM-2B and has been trained on a bilingual ocean-domain dataset covering both Chinese and English.

## ⏩Quickstart

### Download the model

Download the model: [OceanGPT-2B-v0.1](https://huggingface.co/zjunlp/OceanGPT-2B-v0.1)

```shell
git lfs install
git clone https://huggingface.co/zjunlp/OceanGPT-2B-v0.1
```

or

```shell
huggingface-cli download --resume-download zjunlp/OceanGPT-2B-v0.1 --local-dir OceanGPT-2B-v0.1 --local-dir-use-symlinks False
```

### Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

device = "cuda"  # the device to load the model onto
path = 'YOUR-MODEL-PATH'

model = AutoModelForCausalLM.from_pretrained(
    path,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(path)

prompt = "Which is the largest ocean in the world?"
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
```

## 📌Models

| Model Name                          | HuggingFace | WiseModel | ModelScope |
|-------------------------------------|-------------|-----------|------------|
| OceanGPT-14B-v0.1 (based on Qwen)   | 14B         | 14B       | 14B        |
| OceanGPT-7B-v0.2 (based on Qwen)    | 7B          | 7B        | 7B         |
| OceanGPT-2B-v0.1 (based on MiniCPM) | 2B          | 2B        | 2B         |

## 🌻Acknowledgement

OceanGPT is trained on top of open-source large language models, including [Qwen](https://huggingface.co/Qwen), [MiniCPM](https://huggingface.co/collections/openbmb/minicpm-2b-65d48bf958302b9fd25b698f), and [LLaMA](https://huggingface.co/meta-llama). Thanks for their great contributions!

## Limitations

- The model may have hallucination issues.
- We did not optimize the model's identity, so it may generate identity information similar to that of the Qwen/MiniCPM/LLaMA/GPT series of models.
- The model's output is influenced by prompt tokens, which may result in inconsistent results across multiple attempts (a short decoding sketch at the end of this card shows one way to make generation reproducible).

### 🚩Citation

Please cite the following paper if you use OceanGPT in your work.

```bibtex
@article{bi2023oceangpt,
  title={OceanGPT: A Large Language Model for Ocean Science Tasks},
  author={Bi, Zhen and Zhang, Ningyu and Xue, Yida and Ou, Yixin and Ji, Daxiong and Zheng, Guozhou and Chen, Huajun},
  journal={arXiv preprint arXiv:2310.02031},
  year={2023}
}
```
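The run-to-run variability mentioned in the Limitations section comes from sampling during decoding. Below is a minimal, hedged sketch of one way to make generation reproducible; it reuses the `model`, `tokenizer`, and `model_inputs` names from the Quickstart example above and is not part of the official OceanGPT pipeline.

```python
import torch

# Reproducibility sketch (assumes `model`, `tokenizer`, and `model_inputs`
# were created as in the Quickstart example above).

torch.manual_seed(0)  # fix the random seed in case sampling stays enabled

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512,
    do_sample=False,  # greedy decoding: deterministic for the same prompt
)
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

Greedy decoding trades response diversity for determinism; keeping sampling enabled with a fixed seed is a softer alternative.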