---
language:
  - ja
tags:
  - causal-lm
  - not-for-all-audiences
  - nsfw
pipeline_tag: text-generation
---
# Hameln Japanese Mistral 7B

## Model Description
This is a 7B-parameter decoder-only Japanese language model fine-tuned on novel datasets, built on top of the base model Japanese Stable LM Base Gamma 7B.
## Usage
Ensure you are using Transformers 4.34.0 or newer.
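If you want to verify the installed version programmatically, a quick sketch along these lines should work (`packaging` is already a dependency of `transformers`; 4.34.0 is the release that added Mistral architecture support):

```python
from packaging import version
import transformers

# Mistral-architecture models such as this one load only on Transformers 4.34.0+
assert version.parse(transformers.__version__) >= version.parse("4.34.0")
```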
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Elizezen/Hameln-japanese-mistral-7B")
model = AutoModelForCausalLM.from_pretrained(
    "Elizezen/Hameln-japanese-mistral-7B",
    torch_dtype="auto",
)
model.eval()

# Move the model to GPU if one is available
if torch.cuda.is_available():
    model = model.to("cuda")

# Prompt: "Once upon a time, in a certain place, an old man and an old woman
# lived together. The old man went to the mountain to gather firewood,"
input_ids = tokenizer.encode(
    "むかしむかし、あるところに、おじいさんとおばあさんが住んでいました。 おじいさんは山へ柴刈りに、",
    add_special_tokens=True,
    return_tensors="pt",
)

# Sample a continuation of up to 512 new tokens
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=512,
    temperature=1,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the echoed prompt
out = tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True).strip()
print(out)
"""
output example:
ใใใใใใใใใใจใใใซใใใใใใใจใใฐใใใใไฝใใงใใพใใใ ใใใใใใฏๅฑฑใธๆดๅใใซใใใฐใใใใฏ็ฐใใผใฎ็จฒใฎๆไผใใใใใชใฉใไบไบบใงๅใๅใใใฆๆฅฝใใๆฎใใใฆใใพใใใ
ใใๆฅใฎใใจใใใฎๅฐๆนไธๅธฏใซๅคงใใชๅฐ้ขจใใใฃใฆๆฅใพใใใๅผท้ขจใซ้ฃใฐใใใๆจใใๅฎถๅฑใชใฉใๆฌกใ
ใจๅใใไธญใๅนธใใซใใใใใใใจใใฐใใใใฎไฝใใงใใๆใฏ็กไบใงใใใ
ใใใใ่ฟ้ฃใฎๅฐใใชๆใงใฏ่ขซๅฎณใๅบใฆใใพใใใๅฎถๅฑใฏๅ
จๅฃใ่พฒไฝ็ฉใฏ่ใใใใไฝใใๅคใใฎๅฝใๅคฑใใใฆใใพใใใ
ใๅฏๅๆณใซโฆโฆใ
ใใฐใใใใฏๅฟใ็ใใ็ฅๆงใซ็ฅใใๆงใ็ถใใพใใใ
ใๅคฉไธใฎ็ฅๆง๏ผใฉใใใ็ง้ไบบ้ใๅฎใฃใฆไธใใ๏ผใ
ใใฐใใใใฎ็ฅใใ้ใใใฎใใๅฐ้ขจใฏๆฅ้ใซๅขๅใ่ฝใจใใ่ขซๅฎณใฏๆๅฐ้ใฎๅ
ใซๆฒปใพใใพใใใ
"""
## Datasets

- less than 1 GB of web novels (non-PG)
- 70 GB of web novels (PG)
## Intended Use
The primary purpose of this language model is to assist in generating novels. While it can handle a variety of prompts, it may not excel at providing instruction-based responses. Note that the model's outputs are not censored, and sensitive content may occasionally be generated.
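Because the model is tuned to continue prose rather than to follow instructions, prompts generally work best as the opening lines of a story. A minimal sketch reusing the `model` and `tokenizer` from the Usage section (the opening line is an illustrative placeholder, not from the card):

```python
# Give the model narrative prose to continue, not an instruction.
opening = "夜の森は静かだった。"  # "The night forest was quiet." (illustrative)

input_ids = tokenizer.encode(opening, add_special_tokens=True, return_tensors="pt")
tokens = model.generate(
    input_ids.to(model.device),
    max_new_tokens=256,
    do_sample=True,
    temperature=1.0,
    top_p=0.95,
)
print(tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True))
```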