---
license: mit
language:
  - ja
library_name: transformers
pipeline_tag: text-generation
tags:
  - gpt_neox
  - gpt-neox
  - japanese
inference:
  parameters:
    max_new_tokens: 32
    do_sample: false
    repetition_penalty: 1.1
---

# stockmark/gpt-neox-japanese-1.4b

This repository provides a GPT-NeoX based model with 1.4B parameters, pre-trained on a Japanese corpus of about 20B tokens. The model was developed by [Stockmark Inc.](https://stockmark.co.jp/)

## How to use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Use torch.bfloat16 on A100 GPUs and torch.float16 on older-generation GPUs
torch_dtype = torch.bfloat16 if torch.cuda.is_available() and hasattr(torch.cuda, "is_bf16_supported") and torch.cuda.is_bf16_supported() else torch.float16

model = AutoModelForCausalLM.from_pretrained("stockmark/gpt-neox-japanese-1.4b", device_map="auto", torch_dtype=torch_dtype)
tokenizer = AutoTokenizer.from_pretrained("stockmark/gpt-neox-japanese-1.4b")

inputs = tokenizer("自然言語処理は", return_tensors="pt").to(model.device)
with torch.no_grad():
    tokens = model.generate(
        **inputs,
        max_new_tokens=128,
        repetition_penalty=1.1
    )

output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
```

## Example

- LoRA tuning: https://huggingface.co/stockmark/gpt-neox-japanese-1.4b/blob/main/notebooks/LoRA.ipynb (a minimal standalone sketch also appears at the end of this card)

## Training dataset

- Japanese Web Corpus (ja): 8.6B tokens (this dataset will not be released)
- Wikipedia (ja): 0.88B tokens
- CC100 (ja): 10.5B tokens

## Training setting

- Trained with the Hugging Face Trainer and DeepSpeed (ZeRO-2)
- 8 A100 GPUs (40GB) at ABCI
- Mixed precision (BF16)

## License

[The MIT license](https://opensource.org/licenses/MIT)

## Developed by

[Stockmark Inc.](https://stockmark.co.jp/)

## Author

[Takahiro Omi](https://huggingface.co/omitakahiro)
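
## LoRA tuning (sketch)

As a complement to the notebook linked in the Example section, here is a minimal sketch of how LoRA adapters could be attached to this model with the `peft` library. The rank, alpha, dropout, and `target_modules` values are illustrative assumptions and may differ from the settings used in the official notebook.

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "stockmark/gpt-neox-japanese-1.4b"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# Attach LoRA adapters. "query_key_value" is the fused attention projection
# in GPT-NeoX blocks; the hyperparameters below are illustrative assumptions,
# not the values from the official notebook.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small LoRA matrices are trainable

# The wrapped model can then be fine-tuned as usual, e.g. with transformers.Trainer.
```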