# Distributed training with 🤗 Accelerate[[distributed-training-with-accelerate]]

As models grow larger, parallelism has emerged as a strategy for training bigger models on limited hardware and for speeding up training several-fold. Hugging Face created the [🤗 Accelerate](https://huggingface.co/docs/accelerate) library to help users easily train 🤗 Transformers models on any type of distributed setup, whether that is multiple GPUs on one machine or multiple GPUs across several machines. This tutorial shows how to customize a native PyTorch training loop so that it can run in a distributed environment.

## Setup[[setup]]

Get started by installing 🤗 Accelerate:

```bash
pip install accelerate
```

Then import and create an [`~accelerate.Accelerator`] object. The [`~accelerate.Accelerator`] automatically detects your type of distributed setup and initializes all the components needed for training. You don't need to place your model on a device explicitly.

```py
>>> from accelerate import Accelerator

>>> accelerator = Accelerator()
```

## Prepare to accelerate[[prepare-to-accelerate]]

The next step is to pass all the relevant training objects to the [`~accelerate.Accelerator.prepare`] method. This includes your training and evaluation dataloaders, a model, and an optimizer:

```py
>>> train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
...     train_dataloader, eval_dataloader, model, optimizer
... )
```

## Backward[[backward]]

Finally, replace the usual `loss.backward()` in your training loop with 🤗 Accelerate's [`~accelerate.Accelerator.backward`] method:

```py
>>> for epoch in range(num_epochs):
...     for batch in train_dataloader:
...         outputs = model(**batch)
...         loss = outputs.loss
...         accelerator.backward(loss)

...         optimizer.step()
...         lr_scheduler.step()
...         optimizer.zero_grad()
...         progress_bar.update(1)
```

As the following code shows, you only need to add four lines of code to your training loop to enable distributed training!

```diff
+ from accelerate import Accelerator
  from torch.optim import AdamW
  from transformers import AutoModelForSequenceClassification, get_scheduler

+ accelerator = Accelerator()

  model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
  optimizer = AdamW(model.parameters(), lr=3e-5)

- device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
- model.to(device)

+ train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
+     train_dataloader, eval_dataloader, model, optimizer
+ )

  num_epochs = 3
  num_training_steps = num_epochs * len(train_dataloader)
  lr_scheduler = get_scheduler(
      "linear",
      optimizer=optimizer,
      num_warmup_steps=0,
      num_training_steps=num_training_steps
  )

  progress_bar = tqdm(range(num_training_steps))

  model.train()
  for epoch in range(num_epochs):
      for batch in train_dataloader:
-         batch = {k: v.to(device) for k, v in batch.items()}
          outputs = model(**batch)
          loss = outputs.loss
-         loss.backward()
+         accelerator.backward(loss)

          optimizer.step()
          lr_scheduler.step()
          optimizer.zero_grad()
          progress_bar.update(1)
```

## Train[[train]]

Once you've added the relevant lines of code, launch your training from a script or from a notebook such as Colaboratory.

### Train with a script[[train-with-a-script]]

If you are running your training from a script, run the following command to create and save a configuration file:

```bash
accelerate config
```

Then launch your training with:

```bash
accelerate launch train.py
```

### Train with a notebook[[train-with-a-notebook]]

🤗 Accelerate can also run in a notebook if you plan to use Colaboratory's TPUs.
Wrap all the code responsible for training in a function and pass it to [`~accelerate.notebook_launcher`]:

```py
>>> from accelerate import notebook_launcher

>>> notebook_launcher(training_function)
```

For more information about 🤗 Accelerate and its many features, refer to the [documentation](https://huggingface.co/docs/accelerate).
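
One detail of the training loop above may be worth spelling out: `num_training_steps` is computed from `len(train_dataloader)` after the dataloader has been passed to `accelerator.prepare`. The sketch below illustrates, in plain Python, why that ordering matters. It assumes the prepared dataloader hands each process an even share of the batches, rounding up. That is an assumption about the sharding behavior, not something this tutorial states. The helper names `batches_per_process` and `training_steps` and the batch count of 1000 are hypothetical and not part of 🤗 Accelerate.

```python
import math


def batches_per_process(total_batches: int, num_processes: int) -> int:
    """Approximate per-process dataloader length under even sharding (assumed)."""
    return math.ceil(total_batches / num_processes)


def training_steps(num_epochs: int, total_batches: int, num_processes: int) -> int:
    """Optimizer steps each process takes, mirroring num_epochs * len(train_dataloader)."""
    return num_epochs * batches_per_process(total_batches, num_processes)


if __name__ == "__main__":
    num_epochs = 3
    total_batches = 1000  # hypothetical dataloader length before sharding
    for num_processes in (1, 2, 8):
        steps = training_steps(num_epochs, total_batches, num_processes)
        print(f"{num_processes} process(es): {steps} steps per process")
    # 1 process(es): 3000 steps per process
    # 2 process(es): 1500 steps per process
    # 8 process(es): 375 steps per process
```

Under that assumption, a scheduler sized from the unsharded dataloader would plan for more steps than each process actually takes, so a linear schedule would stop before reaching its final learning rate.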