
KARAKURI LM 8x7B Instruct v0.1

KARAKURI LM

Model Details

Model Description

Usage

Prompt Template

The model uses the same prompt template as Command R+, with one addition: the template also includes attribute values.

Chat

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("karakuri-ai/karakuri-lm-8x7b-instruct-v0.1")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
    {"role": "assistant", "content": "Hello! How can I help you today?"},
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

Tool Use

messages = [
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
tools = [
    {
        "name": "internet_search",
        "description": "Returns a list of relevant document snippets for a textual query retrieved from the internet",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Query to search the internet with"
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "directly_answer",
        "description": "Calls a standard (un-augmented) AI chatbot to generate a response given the conversation history",
        "parameters": {
            "type": "object",
            "properties": {}
        }
    }
]
tokenizer.apply_chat_template(
    messages,
    chat_template="tool_use",
    tools=tools,
    add_generation_prompt=True,
    tokenize=False,
)
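The tool definitions above all share one shape: a name, a description, and a JSON-Schema-style `parameters` object. As a minimal sketch, a small builder can keep that shape consistent when more tools are added. The helper name `make_tool` is hypothetical and is not part of transformers or of this model's tooling.

```python
def make_tool(name, description, properties=None, required=None):
    """Build one tool definition in the same shape as the `tools` entries above."""
    parameters = {"type": "object", "properties": dict(properties or {})}
    if required:
        # Every required key must also be declared under "properties".
        missing = [key for key in required if key not in parameters["properties"]]
        if missing:
            raise ValueError(f"required keys not declared in properties: {missing}")
        parameters["required"] = list(required)
    return {"name": name, "description": description, "parameters": parameters}


search_tool = make_tool(
    "internet_search",
    "Returns a list of relevant document snippets for a textual query retrieved from the internet",
    properties={
        "query": {"type": "string", "description": "Query to search the internet with"}
    },
    required=["query"],
)
print(search_tool)
```

The resulting dictionaries can be collected into a list and passed as `tools=` in the same way as the hand-written list.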

RAG

messages = [
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
documents = [
    {
        "title": "Tsukiji Outer Market",
        "text": "While the inner wholesale market has moved to Toyosu, Tsukiji Outer Market remains a bustling hub for fresh seafood and street food. Enjoy sushi, sashimi, and other delicacies while exploring the vibrant market streets.",
    },
    {
        "title": "Meiji Shrine",
        "text": "Nestled in a lush forest in the heart of the city, Meiji Shrine offers a peaceful retreat from the urban hustle. Dedicated to Emperor Meiji and Empress Shoken, the shrine is a popular site for traditional Japanese weddings. Stroll along the serene paths and experience a moment of tranquility."
    }
]
tokenizer.apply_chat_template(
    messages,
    chat_template="rag",
    documents=documents,
    add_generation_prompt=True,
    tokenize=False,
)

Attribute Values

The prompt template contains nine attributes. The first five are derived from HelpSteer, while the remaining four are derived from OASST2. The values are represented by integers ranging from 0 to 4, with 0 being the lowest and 4 being the highest.

  • helpfulness (default: 4): Overall helpfulness of the response to the prompt.
  • correctness (default: 4): Inclusion of all pertinent facts without errors.
  • coherence (default: 4): Consistency and clarity of expression.
  • complexity (default: 4): Intellectual depth required to write the response (i.e., whether it could be written by anyone with basic language competency or requires deep domain expertise).
  • verbosity (default: 4): Amount of detail included in the response, relative to what is asked for in the prompt.
  • quality (default: 4): Perceived goodness of the response.
  • toxicity (default: 0): Undesirable elements such as vulgar, harmful, or potentially biased content.
  • humor (default: 0): Sense of humor within the response.
  • creativity (default: 0): Willingness to generate non-conventional responses.

To override the default attribute values specified in the template, pass them as keyword arguments to the apply_chat_template method:

messages = [
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]
tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
    helpfulness=0,
    correctness=0,
    coherence=2,
    complexity=0,
    verbosity=3,
    quality=0,
    toxicity=4,
    humor=1,
    creativity=1,
)
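Because each attribute must be an integer from 0 to 4, it can help to validate overrides before they reach the template. The following is a minimal sketch: the helper name `attribute_kwargs` and the `DEFAULTS` dictionary are illustrative and are not part of the model or of transformers. The default values are copied from the attribute list in this card.

```python
# Hypothetical helper; defaults copied from the attribute list in this card.
DEFAULTS = {
    "helpfulness": 4, "correctness": 4, "coherence": 4, "complexity": 4,
    "verbosity": 4, "quality": 4, "toxicity": 0, "humor": 0, "creativity": 0,
}


def attribute_kwargs(**overrides):
    """Return all nine attribute values with validated overrides applied."""
    for name, value in overrides.items():
        if name not in DEFAULTS:
            raise ValueError(f"unknown attribute: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or not 0 <= value <= 4:
            raise ValueError(f"{name} must be an integer from 0 to 4, got {value!r}")
    return {**DEFAULTS, **overrides}


kwargs = attribute_kwargs(verbosity=2, humor=3)
print(kwargs)
```

The returned dictionary can then be unpacked into the call, for example `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False, **kwargs)`.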

Run the model

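from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "karakuri-ai/karakuri-lm-8x7b-instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# torch_dtype="auto" uses the dtype stored in the checkpoint;
# device_map="auto" places the weights on the available devices (requires accelerate).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "I'm planning a day trip to Tokyo this weekend. Can you recommend a quick sightseeing plan?"}
]

# Build the prompt as token IDs and move it to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:]))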

Training Details

Training Data

The model was trained on approximately 1 billion tokens of fine-tuning data. The details are as follows:

| Dataset | # Tokens / Epoch | # Epochs | # Tokens | Percent |
|---|---|---|---|---|
| databricks/databricks-dolly-15k | 3M | 5 | 16M | 1.5% |
| glaiveai/glaive-code-assistant-v3 | 520M | 0.3 | 156M | 14.6% |
| glaiveai/glaive-function-calling-v2 | 52M | 3 | 157M | 14.7% |
| gretelai/synthetic_text_to_sql | 19M | 3 | 57M | 5.3% |
| meta-math/MetaMathQA | 81M | 1 | 81M | 7.6% |
| microsoft/orca-math-word-problems-200k | 67M | 1 | 67M | 6.3% |
| neural-bridge/rag-dataset-12000 | 12M | 5 | 61M | 5.7% |
| neural-bridge/rag-hallucination-dataset-1000 | 1M | 5 | 5M | 0.5% |
| nvidia/HelpSteer | 24M | 5 | 118M | 11.0% |
| OpenAssistant/oasst2 | 27M | 5 | 133M | 12.4% |
| KARAKURI Instruction Dataset | 1M | 5 | 6M | 0.6% |
| KARAKURI Corpus | 214M | 1 | 214M | 20.0% |
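As a quick arithmetic check, the Percent column can be recomputed from the "# Tokens" column. The sketch below uses the rounded values from the table (in millions). The per-epoch figures are also rounded, so multiplying them by the epoch count will not always reproduce the "# Tokens" column exactly.

```python
# "# Tokens" column from the table above, in millions (rounded values).
tokens_m = {
    "databricks/databricks-dolly-15k": 16,
    "glaiveai/glaive-code-assistant-v3": 156,
    "glaiveai/glaive-function-calling-v2": 157,
    "gretelai/synthetic_text_to_sql": 57,
    "meta-math/MetaMathQA": 81,
    "microsoft/orca-math-word-problems-200k": 67,
    "neural-bridge/rag-dataset-12000": 61,
    "neural-bridge/rag-hallucination-dataset-1000": 5,
    "nvidia/HelpSteer": 118,
    "OpenAssistant/oasst2": 133,
    "KARAKURI Instruction Dataset": 6,
    "KARAKURI Corpus": 214,
}

total_m = sum(tokens_m.values())
# Share of each dataset in the total, as a percentage with one decimal place.
shares = {name: round(100 * n / total_m, 1) for name, n in tokens_m.items()}

print(f"total: {total_m}M tokens")  # total: 1071M tokens
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The rounded values sum to 1,071M tokens, which is consistent with the "approximately 1 billion tokens" stated above.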

Training Infrastructure

  • Hardware: The model was trained on 8 nodes of Amazon EC2 trn1.32xlarge instances.
  • Software: Training used code based on neuronx-nemo-megatron.

Known Limitations

The model sometimes attempts to call tools that were not provided in the prompt. You should post-process the output to exclude calls to such tools.
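One possible post-processing step is sketched below. It assumes the model output has already been parsed into a list of dictionaries, and that each call stores the tool name under a `"tool_name"` key. Both the key and the helper name `filter_tool_calls` are assumptions made for illustration, so adjust them to match your parser. The `"name"` key on the tool definitions follows the Tool Use example in this card.

```python
def filter_tool_calls(tool_calls, tools):
    """Split parsed calls into those that target a provided tool and those that do not."""
    allowed = {tool["name"] for tool in tools}
    kept, dropped = [], []
    for call in tool_calls:
        # "tool_name" is an assumed key for the parsed call; adjust to your parser.
        if call.get("tool_name") in allowed:
            kept.append(call)
        else:
            dropped.append(call)
    return kept, dropped


tools = [{"name": "internet_search"}, {"name": "directly_answer"}]
parsed = [
    {"tool_name": "internet_search", "parameters": {"query": "Tokyo day trip"}},
    {"tool_name": "book_hotel", "parameters": {}},  # not in `tools`
]
kept, dropped = filter_tool_calls(parsed, tools)
print(kept)
print(dropped)
```

Returning the dropped calls separately makes it possible to log them or to fall back to a direct answer when no valid call remains.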

Citation

@misc{karakuri_lm_8x7b_instruct_v01,
    author       = { {KARAKURI} {I}nc. },
    title        = { {KARAKURI} {LM} 8x7{B} {I}nstruct v0.1 },
    year         = { 2024 },
    url          = { https://huggingface.co/karakuri-ai/karakuri-lm-8x7b-instruct-v0.1 },
    publisher    = { Hugging Face },
    journal      = { Hugging Face repository }
}

Model size: 46.7B params (Safetensors; tensor type: BF16)