|
---
license: mit
---
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0c845a04a514ba62bcd1a/RFpsPxlc_3cK0kmWj-tYR.png) |
|
|
|
# **Introduction** |
|
We introduce Motif, a new family of language models from [**Moreh**](https://moreh.io/), specialized in Korean and English.
|
Motif-102B-Instruct is a chat model tuned from the base model [Motif-102B](https://huggingface.co/moreh/Motif-102B). |
|
|
|
## Training Platform |
|
- Motif-102B was trained on the [**MoAI platform**](https://moreh.io/product) using AMD MI250 GPUs.
- The MoAI platform simplifies scalable, cost-efficient training of large-scale models across multiple nodes.
- It also provides optimized, automated parallelization without complex manual work.
- More information on the MoAI platform is available at https://moreh.io/product, or you can contact us directly at [contact@moreh.io](mailto:contact@moreh.io).
|
|
|
## Quick Usage |
|
You can chat with Motif directly through our [Model hub](https://model-hub.moreh.io/).
|
|
|
## Details |
|
More details will be provided in the upcoming technical report. |
|
|
|
### Release Date |
|
2024.09.30 |
|
|
|
### Benchmark Results |
|
|
|
| Model | KMMLU |
|------------------------------|-------|
| GPT-4-base-0613 \*\* | 57.62 |
| Llama3.1-70B-instruct \* | 52.1 |
| **Motif-102B** \*\* + | 58.25 |
| Motif-102B-Instruct \*\* + | 57.98 |

\* : Community reported

\*\* : Measured by the authors

\+ : Indicates the model is specialized in Korean
|
|
|
|
|
## How to Use
|
|
|
### Use with vLLM |
|
- Minimum requirements: 4x A100 80GB GPUs
- To install vLLM, refer to the [vLLM repository](https://github.com/vllm-project/vllm) (typically `pip install vllm`).
|
```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

# At minimum, we recommend 4x A100 80GB GPUs for inference with vLLM.
# If you have more GPUs, set tensor_parallel_size to the number of GPUs you can afford.
model = LLM("moreh/Motif-102B-Instruct", tensor_parallel_size=4)
tokenizer = AutoTokenizer.from_pretrained("moreh/Motif-102B-Instruct")
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Explain the concept of the Big Bang theory to a kindergartener"},
]

# Render the conversation into a single prompt string with the chat template.
messages_batch = [tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)]

# vLLM does not read the Hugging Face generation_config, so set sampling parameters explicitly.
sampling_params = SamplingParams(max_tokens=512, temperature=0, repetition_penalty=1.0, stop_token_ids=[tokenizer.eos_token_id])
responses = model.generate(messages_batch, sampling_params=sampling_params)

print(responses[0].outputs[0].text)
```
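vLLM processes a list of prompts in one batched call, so several conversations can be generated together. Below is a minimal sketch reusing the `model`, `tokenizer`, and `sampling_params` objects from above; the prompts themselves are illustrative:

```python
# Render each conversation into a prompt string, then generate for all of them at once.
conversations = [
    [{"role": "user", "content": "Summarize the Big Bang theory in one sentence."}],
    [{"role": "user", "content": "What is the capital of South Korea?"}],
]
prompts = [
    tokenizer.apply_chat_template(conversation=c, add_generation_prompt=True, tokenize=False)
    for c in conversations
]
responses = model.generate(prompts, sampling_params=sampling_params)
for response in responses:
    print(response.outputs[0].text)
```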
|
|
|
### Use with transformers |
|
- Minimum requirements: 4x A100 80GB GPUs or 4x AMD MI250 GPUs
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "moreh/Motif-102B-Instruct"

# All generation defaults are read from the model's generation_config.json.
# A 102B model does not fit on a single GPU, so shard it across all available GPUs.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Explain the concept of the Big Bang theory to a kindergartener"},
]

prompt = tokenizer.apply_chat_template(conversation=messages, add_generation_prompt=True, tokenize=False)
input_ids = tokenizer(prompt, return_tensors='pt')['input_ids'].to(model.device)

outputs = model.generate(input_ids)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
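For interactive testing, it can be convenient to stream tokens to stdout as they are generated. Here is a minimal sketch using transformers' `TextStreamer`, reusing `model`, `tokenizer`, and `input_ids` from the block above:

```python
from transformers import TextStreamer

# Prints decoded text incrementally as tokens are generated; skip_prompt hides the input prompt.
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
_ = model.generate(input_ids, streamer=streamer)
```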