
Oumuamua-7b-base

This is a merge of pre-trained language models created using mergekit.

Output example

Input text

日本で最も高い山の名前は ("The name of the highest mountain in Japan is")

Output text

日本で最も高い山の名前は、富士山。
その標高は3776メートル。
世界でも20位以内に入る高さを誇る。
その富士山の麓にあるのが、静岡県富士市。
富士市は、富士山の麓にあるため、観光地としても有名である。
富士山の麓にあることから、富士市は観光地としても有名である。
富士山を眺めることができるスポットが多く、特に富士市の中心部から見る富士山は、その美しさから「日本一の眺望」と言われている。

English translation: "The name of the highest mountain in Japan is Mt. Fuji. Its elevation is 3,776 meters, a height that ranks within the top 20 in the world. At the foot of Mt. Fuji lies Fuji City in Shizuoka Prefecture. Fuji City, being at the foot of Mt. Fuji, is also famous as a tourist destination. Because it is at the foot of Mt. Fuji, Fuji City is also famous as a tourist destination. There are many spots with a view of Mt. Fuji, and the view of Mt. Fuji from central Fuji City in particular is said to be 'the finest view in Japan' for its beauty."

Test environment

This model was tested using text-generation-webui, with the min_p preset and the Null preset at temperature=0.3 for generation. (For approximating the min_p preset with plain transformers, see the sketch after the usage example below.)

Usage

Use the base model

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "nitky/Oumuamua-7b-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load in bfloat16 and spread the weights across available devices
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")

prompt = "日本で最も高い山の名前は"  # "The name of the highest mountain in Japan is"
input_ids = tokenizer.encode(
    prompt,
    add_special_tokens=False,
    return_tensors="pt"
)
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=256,
    do_sample=True,
    temperature=0.3
)

out = tokenizer.decode(tokens[0], skip_special_tokens=True)
print(out)
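
To approximate the min_p preset from the test environment with plain transformers, recent versions (v4.39 and later) accept a min_p argument in generate. A minimal sketch reusing model, tokenizer, and input_ids from the example above; the threshold min_p=0.05 is an assumption matching text-generation-webui's default for that preset, not a value stated for this model:

# Sketch: min_p sampling via transformers (v4.39+).
# min_p=0.05 is an assumed value (text-generation-webui's preset default).
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=256,
    do_sample=True,
    temperature=0.3,
    min_p=0.05,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))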

Merge Details

Merge Method

This model was merged using the Model Stock merge method, with Mistral-7B-v0.1-VE-Swallow-MS (mistralai/Mistral-7B-v0.1 with its embeddings swapped for those of tokyotech-llm/Swallow-MS-7b-v0.1; see the first stage of the configuration below) as the base.
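
Model Stock (Jang et al., 2024) merges by interpolating between the average of the fine-tuned models and their shared base, choosing the interpolation ratio from the angle between the fine-tuned models' weight deltas. A rough per-tensor numpy sketch of the idea follows; it illustrates the method and is not mergekit's actual implementation:

import numpy as np

def model_stock_layer(w_base, w_models):
    # Deltas of each fine-tuned model from the shared base.
    deltas = [w - w_base for w in w_models]
    k = len(deltas)
    # Mean pairwise cosine similarity between deltas: high similarity means
    # the fine-tuned models agree on a direction away from the base.
    cos = np.mean([
        float(np.dot(a.ravel(), b.ravel()) / (np.linalg.norm(a) * np.linalg.norm(b)))
        for i, a in enumerate(deltas)
        for b in deltas[i + 1:]
    ])
    # Interpolation ratio from the Model Stock paper: t = k*cos / (1 + (k-1)*cos)
    t = k * cos / (1 + (k - 1) * cos)
    w_avg = sum(w_models) / k
    # Pull toward the fine-tuned average when the models agree (cos near 1),
    # and toward the base when they diverge.
    return t * w_avg + (1 - t) * w_base

For two fine-tuned checkpoints, t = 2·cosθ / (1 + cosθ), so orthogonal deltas (cosθ = 0) fall back entirely to the base weights.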

Models Merged

The following source models were included in the merge, per the configuration below:

- tokyotech-llm/Swallow-MS-7b-v0.1
- nitky/Flavor-7b (private model)
- stabilityai/japanese-stablelm-base-gamma-7b
- mistralai/Mistral-7B-v0.1

Configuration

The following YAML configuration was used to produce this model. Each document (separated by ---) is a separate mergekit run, and the name assigned at each stage is referenced by later stages: the first three runs replace each source model's embed_tokens with those of tokyotech-llm/Swallow-MS-7b-v0.1 (the "VE", vocabulary-expanded, intermediates), the fourth combines the intermediates with task arithmetic, and the final run applies Model Stock:

merge_method: task_arithmetic
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
      - filter: embed_tokens
        value: 1.0
      - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: Mistral-7B-v0.1-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: nitky/Flavor-7b # private model
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
      - filter: embed_tokens
        value: 1.0
      - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: Flavor-7b-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: stabilityai/japanese-stablelm-base-gamma-7b
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
      - filter: embed_tokens
        value: 1.0
      - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: japanese-stablelm-base-gamma-7b-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight: 1.0
  - model: Flavor-7b-VE-Swallow-MS
    parameters:
      weight: 0.5
  - model: japanese-stablelm-base-gamma-7b-VE-Swallow-MS
    parameters:
      weight: -0.5
dtype: bfloat16
name: Oumuamua-7b-base-preset
---
merge_method: model_stock
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
  - model: Oumuamua-7b-base-preset
dtype: bfloat16
name: Oumuamua-7b-base
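
To reproduce a stage, save the corresponding YAML document to its own file and run it with mergekit, either via the mergekit-yaml command-line tool or its Python API. A minimal sketch of the latter; the file name and option values here are placeholders, and the exact API surface may vary between mergekit versions:

import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load one stage of the configuration above (hypothetical file name).
with open("stage1-mistral-ve.yml", "r", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Write the merged model where later stages can reference it by path.
run_merge(
    config,
    out_path="./Mistral-7B-v0.1-VE-Swallow-MS",
    options=MergeOptions(cuda=False, copy_tokenizer=True, lazy_unpickle=True),
)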