---
library_name: transformers
license: mit
language:
  - ja
  - en
---

# stockmark/stockmark-100b-instruct-v0.1

Stockmark-100b-instruct-v0.1 is an instruction-tuned version of stockmark-100b, a 100-billion-parameter LLM developed by Stockmark Inc.

## How to use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Prompt format used for instruction tuning:
# "### 指示:" = instruction, "### 応答:" = response
prompt_template = """### 指示:
{instruction}

### 応答:
"""

tokenizer = AutoTokenizer.from_pretrained("stockmark/stockmark-100b-instruct-v0.1")
model = AutoModelForCausalLM.from_pretrained(
    "stockmark/stockmark-100b-instruct-v0.1",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

instruction = "生成AIとは?"  # "What is generative AI?"
prompt = prompt_template.format(instruction=instruction)
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
with torch.inference_mode():
    tokens = model.generate(
        input_ids,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        top_p=0.95,
        repetition_penalty=1.08,
    )

output = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(output)
```
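
`tokens[0]` contains the prompt followed by the generation, so the decode above prints both. If you want only the model's response, slice off the prompt tokens before decoding:

```python
# Decode only the newly generated tokens (everything after the prompt)
response = tokenizer.decode(tokens[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```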

## Dataset (fine-tuning)

## Performance

### Stockmark Business Questions

Dataset: https://huggingface.co/datasets/stockmark/business-questions

| model | accuracy |
| --- | --- |
| stockmark-100b-instruct | 0.90 |
| stockmark-13b-instruct | 0.80 |
| GPT-3.5-turbo^1 | 0.42 |
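
As a minimal sketch of how an accuracy evaluation on this dataset could be scripted (reusing `tokenizer`, `model`, and `prompt_template` from the "How to use" section): the split name, the `question`/`answer` field names, and the containment-based scoring below are assumptions for illustration, not the methodology used for the table, so check the dataset card before running.

```python
from datasets import load_dataset

# Assumed split and field names: verify against the dataset card.
ds = load_dataset("stockmark/business-questions", split="train")

correct = 0
for example in ds:
    prompt = prompt_template.format(instruction=example["question"])
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    with torch.inference_mode():
        tokens = model.generate(input_ids, max_new_tokens=64)
    response = tokenizer.decode(tokens[0][input_ids.shape[-1]:], skip_special_tokens=True)
    correct += int(example["answer"] in response)  # crude containment check

print(f"accuracy: {correct / len(ds):.2f}")
```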

### Japanese Vicuna QA Benchmark

We excluded the categories that require calculation and coding, and used the remaining 60 questions for evaluation.

GitHub: https://github.com/ku-nlp/ja-vicuna-qa-benchmark

| model | average score |
| --- | --- |
| stockmark-100b-instruct | 5.97 |
| tokyotech-llm/Swallow-70b-instruct-hf | 5.59 |
| GPT-3.5 (text-davinci-003) | 5.08 |

### Inference speed

| model | time [s] to generate 100 characters in Japanese |
| --- | --- |
| stockmark-100b-instruct | 1.86 |
| gpt-3.5-turbo | 2.15 |
| gpt-4-turbo | 5.48 |
| tokyotech-llm/Swallow-70b-instruct-hf | 2.22 |

For local LLMs, we measured the inference time using AWS Inferentia2.
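
As a rough illustration only, the sketch below times generation with plain `transformers` on whatever device the model is loaded on. This is not the AWS Inferentia2 setup used for the table (which typically runs through a dedicated runtime such as optimum-neuron), so the resulting numbers will not be comparable.

```python
import time

prompt = prompt_template.format(instruction="生成AIとは?")  # "What is generative AI?"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

start = time.perf_counter()
with torch.inference_mode():
    tokens = model.generate(input_ids, max_new_tokens=256, do_sample=False)
elapsed = time.perf_counter() - start

response = tokenizer.decode(tokens[0][input_ids.shape[-1]:], skip_special_tokens=True)
# Normalize to seconds per 100 generated characters
print(f"{elapsed / max(len(response), 1) * 100:.2f} s per 100 characters")
```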

## License

MIT

## Developed by

Stockmark Inc.