Update README.md
README.md
CHANGED
@@ -58,14 +58,12 @@ for o in outputs:
 | Contents                       | Spec                                |
 |--------------------------------|-------------------------------------|
 | Base model                     | Qwen2.5-7B-Instruct                 |
-| Machine                        | A100 SXM 80GB × 2                   |
 | dtype                          | bfloat16                            |
 | PEFT                           | LoRA (r=8, alpha=64)                |
 | Learning Rate                  | 1e-5 (varies by further training)   |
 | LRScheduler                    | Cosine (warm-up: 0.05%)             |
 | Optimizer                      | AdamW                               |
 | Distributed / Efficient Tuning | DeepSpeed v3, Flash Attention       |
-| Global Batch Size              | 128                                 |

 # Dataset Card
 Part of the Reference dataset is provided as links due to copyright restrictions.
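For context, the spec table above describes a LoRA fine-tune of Qwen2.5-7B-Instruct. A minimal sketch of how that configuration might be expressed with the Hugging Face `transformers` and `peft` libraries is shown below; the library choice and the DeepSpeed config path are assumptions for illustration, not part of this commit.

```python
# Hypothetical sketch of the training setup described in the spec table above.
# transformers/peft are assumed libraries; the DeepSpeed JSON path is a placeholder.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments

model_name = "Qwen/Qwen2.5-7B-Instruct"        # Base model

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="bfloat16",                    # dtype: bfloat16
    attn_implementation="flash_attention_2",   # Flash Attention
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# PEFT: LoRA with r=8, alpha=64
peft_config = LoraConfig(r=8, lora_alpha=64, task_type="CAUSAL_LM")
model = get_peft_model(model, peft_config)

training_args = TrainingArguments(
    output_dir="outputs",
    bf16=True,
    learning_rate=1e-5,                # varies by further training
    lr_scheduler_type="cosine",        # Cosine schedule
    warmup_ratio=0.0005,               # warm-up: 0.05%
    optim="adamw_torch",               # AdamW
    deepspeed="ds_config_zero3.json",  # DeepSpeed ("v3" assumed to mean ZeRO stage 3)
)
```

A `Trainer` (or TRL's `SFTTrainer`) would then consume `model`, `tokenizer`, and `training_args` together with the Reference dataset mentioned in the Dataset Card.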