spow12/MK_Nemo_12B

Model Description

This model is a Supervised fine-tuned version of Qwen/Qwen2.5-72B-Instruct with DeepSpeed and trl for korean.

Merge methods.

merge_method: model_stock
name: ChatWaifu_72B_V2.4
models:
    - model: Nexusflow/Athene-V2-Chat
    - model: Nexusflow/Athene-V2-Agent
    - model: Qwen/Qwen2.5-72B-Instruct_instruction_tunned(private)
    - model: anthracite-org/magnum-v4-72b
base_model: Qwen/Qwen2.5-72B-Instruct
dtype: bfloat16
tokenizer_source: base

Trained Data

  • Trained with public, private data (about 500K)

Usage

from transformers import TextStreamer, pipeline, AutoTokenizer, AutoModelForCausalLM

model_id = 'spow12/KoQwen_72B_v5.0'
tokenizer = AutoTokenizer.from_pretrained(model_id)
# %%
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  #Optional
    device_map='auto',
)
model.eval()

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map='auto')

generation_configs = dict(
    max_new_tokens=2048,
    num_return_sequences=1, 
    temperature=0.75,
    # repetition_penalty=1.1,
    do_sample=True,
    top_k=20,
    top_p=0.9,
    min_p=0.1,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,
    streamer = TextStreamer(tokenizer) # Optional, if you want to use streamer, you have to set num_beams=1
)

sys_message = """당신은 μΉœμ ˆν•œ μ±—λ΄‡μœΌλ‘œμ„œ μƒλŒ€λ°©μ˜ μš”μ²­μ— μ΅œλŒ€ν•œ μžμ„Έν•˜κ³  μΉœμ ˆν•˜κ²Œ λ‹΅ν•΄μ•Όν•©λ‹ˆλ‹€. 
μ‚¬μš©μžκ°€ μ œκ³΅ν•˜λŠ” 정보λ₯Ό μ„Έμ‹¬ν•˜κ²Œ λΆ„μ„ν•˜μ—¬ μ‚¬μš©μžμ˜ μ˜λ„λ₯Ό μ‹ μ†ν•˜κ²Œ νŒŒμ•…ν•˜κ³  그에 따라 닡변을 μƒμ„±ν•΄μ•Όν•©λ‹ˆλ‹€.  

항상 맀우 μžμ—°μŠ€λŸ¬μš΄ ν•œκ΅­μ–΄λ‘œ μ‘λ‹΅ν•˜μ„Έμš”."""

message = [
    {
        'role': "system",
        'content': sys_message
    },
    {
        'role': 'user',
        'content': "ν˜„μž¬μ˜ κ²½μ œμƒν™©μ— λŒ€ν•΄ μ–΄λ–»κ²Œ 생각해?."
    }
]
conversation = pipe(message, **generation_configs)
conversation[-1]
Downloads last month
19
Safetensors
Model size
72.7B params
Tensor type
BF16
Β·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for spow12/KoQwen_72B_v5.0