---
license: apache-2.0
---

# NEO

[🤗 Neo-Models](https://huggingface.co/collections/m-a-p/neo-models-66395a5c9662bb58d5d70f04) | [🤗 Neo-Datasets](https://huggingface.co/collections/m-a-p/neo-datasets-66395dc55cbebc0a7767bbd5) | [Github](https://github.com/multimodal-art-projection/MAP-NEO)

NEO is a fully open-source large language model: the code, all model weights, the datasets used for training, and the training details are publicly released.

## Model

| Model | Description | Download |
|---|---|---|
| neo_7b | Base model of neo_7b. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_7b) |
| neo_7b_sft_v0.1 | Supervised fine-tuned (SFT) version of the neo_7b model. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_7b_sft_v0.1) |
| neo_7b_instruct_v0.1 | Instruction-tuned version of the neo_7b model. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_7b_instruct_v0.1) |
| neo_7b_intermediate | Intermediate checkpoints from the standard pre-training phase; a total of 3.7T tokens were learned in this phase. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_7b_intermediate) |
| neo_7b_decay | Intermediate checkpoints from the decay phase; a total of 720B tokens were learned in this phase. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_7b_decay) |
| neo_scalinglaw_980M | Checkpoints from the 980M-parameter scaling-law experiments. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_scalinglaw_980M) |
| neo_scalinglaw_460M | Checkpoints from the 460M-parameter scaling-law experiments. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_scalinglaw_460M) |
| neo_scalinglaw_250M | Checkpoints from the 250M-parameter scaling-law experiments. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_scalinglaw_250M) |
| neo_2b_general | Checkpoints of the 2B model trained on common-domain data. | • [🤗 Hugging Face](https://huggingface.co/m-a-p/neo_2b_general) |
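
All repositories can also be fetched programmatically with `huggingface_hub`. The snippet below is a minimal sketch; the commented `allow_patterns` filter is only illustrative, since the exact folder layout inside the multi-checkpoint repos (e.g. `neo_7b_intermediate`) may differ.

```python
from huggingface_hub import snapshot_download

# Download an entire model repository to the local Hugging Face cache and
# return the local path.
local_dir = snapshot_download(repo_id="m-a-p/neo_7b")
print(local_dir)

# The intermediate/decay repos bundle many checkpoints; glob filters can limit
# what gets fetched. This pattern is an illustrative assumption, not the
# repo's actual layout.
# snapshot_download(repo_id="m-a-p/neo_7b_intermediate", allow_patterns=["*.json"])
```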

### Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = '<your-hf-model-path-with-tokenizer>'

# Load the tokenizer and model; device_map="auto" and torch_dtype="auto" pick a
# sensible device placement and precision for the available hardware.
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype="auto"
).eval()

input_text = "A long, long time ago,"

# Tokenize the prompt and move it to the model's device before generation.
inputs = tokenizer(input_text, return_tensors='pt').to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=20)
response = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(response)
```
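
For the instruction-tuned variant (`neo_7b_instruct_v0.1`), a chat-style prompt is usually more appropriate than raw text completion. The sketch below assumes the repository ships a chat template usable via `apply_chat_template`; if it does not, format the conversation manually according to the model's documentation.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "m-a-p/neo_7b_instruct_v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype="auto"
).eval()

messages = [{"role": "user", "content": "Briefly introduce the MAP-NEO project."}]

# Build the prompt from the chat template (assumption: the repo provides one)
# and append the assistant-turn marker before generation.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```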

### Citation

```bibtex
@article{zhang2024mapneo,
  title   = {MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series},
  author  = {Ge Zhang and Scott Qu and Jiaheng Liu and Chenchen Zhang and Chenghua Lin and Chou Leuang Yu and Danny Pan and Esther Cheng and Jie Liu and Qunshu Lin and Raven Yuan and Tuney Zheng and Wei Pang and Xinrun Du and Yiming Liang and Yinghao Ma and Yizhi Li and Ziyang Ma and Bill Lin and Emmanouil Benetos and Huan Yang and Junting Zhou and Kaijing Ma and Minghao Liu and Morry Niu and Noah Wang and Quehry Que and Ruibo Liu and Sine Liu and Shawn Guo and Soren Gao and Wangchunshu Zhou and Xinyue Zhang and Yizhi Zhou and Yubo Wang and Yuelin Bai and Yuhan Zhang and Yuxiang Zhang and Zenith Wang and Zhenzhu Yang and Zijian Zhao and Jiajun Zhang and Wanli Ouyang and Wenhao Huang and Wenhu Chen},
  year    = {2024},
  journal = {arXiv preprint arXiv:2405.19327}
}
```
|