
Quantization by Richard Erkhov.

Github | Discord | Request more models

mptk-1b - bnb 4bits
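
The title refers to bitsandbytes 4-bit quantization. As a minimal sketch of the general bnb 4-bit recipe (the settings below are common defaults, not necessarily the exact ones used for this repo), the original checkpoint can be quantized on load with transformers' BitsAndBytesConfig:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Quantize the original checkpoint to 4-bit on load. bf16 compute matches
# the precision recommendation in the card below.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("team-lucid/mptk-1b")
model = AutoModelForCausalLM.from_pretrained(
    "team-lucid/mptk-1b",
    quantization_config=bnb_config,
    device_map="auto",
)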

Original model description:

license: apache-2.0
language:
- ko

MPTK-1B

MPTK-1B is a decoder-only transformer language model with 1.3B parameters, trained on Korean, English, and code datasets.

์ด ๋ชจ๋ธ์€ ๊ตฌ๊ธ€์˜ TPU Research Cloud(TRC)๋ฅผ ํ†ตํ•ด ์ง€์›๋ฐ›์€ Cloud TPU๋กœ ํ•™์Šต๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

Model Details

Model Description

It is based on MPT, an architecture that modifies the standard decoder-only transformer in several ways.

| Hyperparameter  | Value |
|-----------------|-------|
| n_parameters    | 1.3B  |
| n_layers        | 24    |
| n_heads         | 16    |
| d_model         | 2048  |
| vocab size      | 50432 |
| sequence length | 2048  |
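
For reference, these dimensions can be written out as an MPT configuration. A minimal sketch with transformers' MptConfig, illustrating the table rather than reproducing the repo's actual config file:

from transformers import MptConfig

# Roughly 12 * n_layers * d_model^2 ≈ 1.21B transformer parameters,
# plus vocab_size * d_model ≈ 0.10B embedding parameters ≈ 1.3B total.
config = MptConfig(
    n_layers=24,
    n_heads=16,
    d_model=2048,
    vocab_size=50432,
    max_seq_len=2048,
)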

Uses

How to Get Started with the Model

Running in fp16 can produce NaNs, so we recommend running the model in fp32 or bf16.

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Weights load in fp32 by default, avoiding the fp16 NaN issue noted above.
tokenizer = AutoTokenizer.from_pretrained("team-lucid/mptk-1b")
model = AutoModelForCausalLM.from_pretrained("team-lucid/mptk-1b")

pipe = pipeline('text-generation', model=model, tokenizer=tokenizer, device='cuda:0')

# Generate under bfloat16 autocast, the recommended reduced-precision mode.
with torch.autocast('cuda', dtype=torch.bfloat16):
    print(
        pipe(
            '대한민국의 수도는',  # "The capital of South Korea is"
            max_new_tokens=100,
            do_sample=True,
        )
    )
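
Alternatively (an option not shown in the original card), the weights can be loaded directly in bf16 instead of autocasting:

model = AutoModelForCausalLM.from_pretrained(
    "team-lucid/mptk-1b",
    torch_dtype=torch.bfloat16,  # load weights directly in bf16
)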

Training Details

Training Data

The model was trained on Korean data such as OSCAR, mC4, Wikipedia, and Namuwiki, supplemented with subsets of RefinedWeb and The Stack.

Training Hyperparameters

| Hyperparameter | Value    |
|----------------|----------|
| Precision      | bfloat16 |
| Optimizer      | Lion     |
| Learning rate  | 2e-4     |
| Batch size     | 1024     |
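
For context, Lion (Chen et al., 2023) updates weights with the sign of an interpolation between the momentum buffer and the current gradient. A minimal sketch of one update step; the learning rate matches the table above, while the betas and weight decay are the paper's defaults and are assumptions here:

import torch

def lion_step(param, grad, momentum, lr=2e-4, beta1=0.9, beta2=0.99, wd=0.0):
    # Update direction: sign of an interpolation of momentum and gradient.
    update = (beta1 * momentum + (1 - beta1) * grad).sign()
    # Apply the signed update plus decoupled weight decay.
    param.add_(update + wd * param, alpha=-lr)
    # Keep an exponential moving average of gradients as the momentum buffer.
    momentum.mul_(beta2).add_(grad, alpha=1 - beta2)
    return param, momentum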