Quick tour [[quick-tour]]

[[open-in-colab]]

Get up and running with 🤗 Transformers! Written to be easy to follow even if you have never done any development, this quick tour shows you how to use the pipeline for inference, load a pretrained model and preprocessor with an AutoClass, and quickly train a model with PyTorch or TensorFlow. If you would like a gentler introduction to the concepts covered here, especially as a beginner, we recommend the tutorials or the course.

Before you begin, make sure you have all the necessary libraries installed:

!pip install transformers datasets

You also need to install your preferred machine learning framework:

pip install torch
pip install tensorflow

ํŒŒ์ดํ”„๋ผ์ธ [[pipeline]]

pipeline์€ ์‚ฌ์ „ ํ›ˆ๋ จ๋œ ๋ชจ๋ธ๋กœ ์ถ”๋ก ํ•˜๊ธฐ์— ๊ฐ€์žฅ ์‰ฝ๊ณ  ๋น ๋ฅธ ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค. [pipeline]์€ ์—ฌ๋Ÿฌ ๋ชจ๋‹ฌ๋ฆฌํ‹ฐ์—์„œ ๋‹ค์–‘ํ•œ ๊ณผ์—…์„ ์‰ฝ๊ฒŒ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์•„๋ž˜ ํ‘œ์— ํ‘œ์‹œ๋œ ๋ช‡ ๊ฐ€์ง€ ๊ณผ์—…์„ ๊ธฐ๋ณธ์ ์œผ๋กœ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค:

์‚ฌ์šฉ ๊ฐ€๋Šฅํ•œ ์ž‘์—…์˜ ์ „์ฒด ๋ชฉ๋ก์€ Pipelines API ์ฐธ์กฐ๋ฅผ ํ™•์ธํ•˜์„ธ์š”.

ํƒœ์Šคํฌ ์„ค๋ช… ๋ชจ๋‹ฌ๋ฆฌํ‹ฐ ํŒŒ์ดํ”„๋ผ์ธ ID
ํ…์ŠคํŠธ ๋ถ„๋ฅ˜ ํ…์ŠคํŠธ์— ์•Œ๋งž์€ ๋ ˆ์ด๋ธ” ๋ถ™์ด๊ธฐ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP) pipeline(task="sentiment-analysis")
ํ…์ŠคํŠธ ์ƒ์„ฑ ์ฃผ์–ด์ง„ ๋ฌธ์ž์—ด ์ž…๋ ฅ๊ณผ ์ด์–ด์ง€๋Š” ํ…์ŠคํŠธ ์ƒ์„ฑํ•˜๊ธฐ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP) pipeline(task="text-generation")
๊ฐœ์ฒด๋ช… ์ธ์‹ ๋ฌธ์ž์—ด์˜ ๊ฐ ํ† ํฐ๋งˆ๋‹ค ์•Œ๋งž์€ ๋ ˆ์ด๋ธ” ๋ถ™์ด๊ธฐ (์ธ๋ฌผ, ์กฐ์ง, ์žฅ์†Œ ๋“ฑ๋“ฑ) ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP) pipeline(task="ner")
์งˆ์˜์‘๋‹ต ์ฃผ์–ด์ง„ ๋ฌธ๋งฅ๊ณผ ์งˆ๋ฌธ์— ๋”ฐ๋ผ ์˜ฌ๋ฐ”๋ฅธ ๋Œ€๋‹ตํ•˜๊ธฐ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP) pipeline(task="question-answering")
๋นˆ์นธ ์ฑ„์šฐ๊ธฐ ๋ฌธ์ž์—ด์˜ ๋นˆ์นธ์— ์•Œ๋งž์€ ํ† ํฐ ๋งž์ถ”๊ธฐ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP) pipeline(task="fill-mask")
์š”์•ฝ ํ…์ŠคํŠธ๋‚˜ ๋ฌธ์„œ๋ฅผ ์š”์•ฝํ•˜๊ธฐ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP) pipeline(task="summarization")
๋ฒˆ์—ญ ํ…์ŠคํŠธ๋ฅผ ํ•œ ์–ธ์–ด์—์„œ ๋‹ค๋ฅธ ์–ธ์–ด๋กœ ๋ฒˆ์—ญํ•˜๊ธฐ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ(NLP) pipeline(task="translation")
์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜ ์ด๋ฏธ์ง€์— ์•Œ๋งž์€ ๋ ˆ์ด๋ธ” ๋ถ™์ด๊ธฐ ์ปดํ“จํ„ฐ ๋น„์ „(CV) pipeline(task="image-classification")
์ด๋ฏธ์ง€ ๋ถ„ํ•  ์ด๋ฏธ์ง€์˜ ํ”ฝ์…€๋งˆ๋‹ค ๋ ˆ์ด๋ธ” ๋ถ™์ด๊ธฐ(์‹œ๋งจํ‹ฑ, ํŒŒ๋†‰ํ‹ฑ ๋ฐ ์ธ์Šคํ„ด์Šค ๋ถ„ํ•  ํฌํ•จ) ์ปดํ“จํ„ฐ ๋น„์ „(CV) pipeline(task="image-segmentation")
๊ฐ์ฒด ํƒ์ง€ ์ด๋ฏธ์ง€ ์† ๊ฐ์ฒด์˜ ๊ฒฝ๊ณ„ ์ƒ์ž๋ฅผ ๊ทธ๋ฆฌ๊ณ  ํด๋ž˜์Šค๋ฅผ ์˜ˆ์ธกํ•˜๊ธฐ ์ปดํ“จํ„ฐ ๋น„์ „(CV) pipeline(task="object-detection")
์˜ค๋””์˜ค ๋ถ„๋ฅ˜ ์˜ค๋””์˜ค ํŒŒ์ผ์— ์•Œ๋งž์€ ๋ ˆ์ด๋ธ” ๋ถ™์ด๊ธฐ ์˜ค๋””์˜ค pipeline(task="audio-classification")
์ž๋™ ์Œ์„ฑ ์ธ์‹ ์˜ค๋””์˜ค ํŒŒ์ผ ์† ์Œ์„ฑ์„ ํ…์ŠคํŠธ๋กœ ๋ฐ”๊พธ๊ธฐ ์˜ค๋””์˜ค pipeline(task="automatic-speech-recognition")
์‹œ๊ฐ ์งˆ์˜์‘๋‹ต ์ฃผ์–ด์ง„ ์ด๋ฏธ์ง€์™€ ์งˆ๋ฌธ์— ๋Œ€ํ•ด ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ๋Œ€๋‹ตํ•˜๊ธฐ ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ pipeline(task="vqa")
๋ฌธ์„œ ์งˆ์˜์‘๋‹ต ์ฃผ์–ด์ง„ ๋ฌธ์„œ์™€ ์งˆ๋ฌธ์— ๋Œ€ํ•ด ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ๋Œ€๋‹ตํ•˜๊ธฐ ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ pipeline(task="document-question-answering")
์ด๋ฏธ์ง€ ์บก์…˜ ๋‹ฌ๊ธฐ ์ฃผ์–ด์ง„ ์ด๋ฏธ์ง€์˜ ์บก์…˜ ์ƒ์„ฑํ•˜๊ธฐ ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ pipeline(task="image-to-text")

Start by creating an instance of [pipeline] and specifying the task you want to use it for. This guide uses the [pipeline] for sentiment analysis as an example:

>>> from transformers import pipeline

>>> classifier = pipeline("sentiment-analysis")

[pipeline]์€ ๊ฐ์ • ๋ถ„์„์„ ์œ„ํ•œ ์‚ฌ์ „ ํ›ˆ๋ จ๋œ ๋ชจ๋ธ๊ณผ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์ž๋™์œผ๋กœ ๋‹ค์šด๋กœ๋“œํ•˜๊ณ  ์บ์‹œํ•ฉ๋‹ˆ๋‹ค. ์ด์ œ classifier๋ฅผ ๋Œ€์ƒ ํ…์ŠคํŠธ์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:

>>> classifier("We are very happy to show you the 🤗 Transformers library.")
[{'label': 'POSITIVE', 'score': 0.9998}]

If you have more than one input, pass the inputs as a list to the [pipeline] to get back a list of dictionaries containing the pretrained model's outputs:

>>> results = classifier(["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."])
>>> for result in results:
...     print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
label: POSITIVE, with score: 0.9998
label: NEGATIVE, with score: 0.5309
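
Because the pipeline returns plain Python dictionaries, its output can be post-processed with ordinary Python. The sketch below is an illustration only: it hard-codes the two results shown above instead of calling a model, and the `confident_only` helper and its 0.9 threshold are assumptions, not part of 🤗 Transformers.

```python
# Results copied from the output above; no model is run here.
results = [
    {"label": "POSITIVE", "score": 0.9998},
    {"label": "NEGATIVE", "score": 0.5309},
]


def confident_only(predictions, threshold=0.9):
    """Keep predictions whose score is at least `threshold` (hypothetical helper)."""
    return [p for p in predictions if p["score"] >= threshold]


confident = confident_only(results)
print(confident)
```

A filter like this is one way to set aside borderline predictions, such as the second sentence here, for manual review.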

[pipeline]์€ ์ฃผ์–ด์ง„ ๊ณผ์—…์— ๊ด€๊ณ„์—†์ด ๋ฐ์ดํ„ฐ์…‹ ์ „๋ถ€๋ฅผ ์ˆœํšŒํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด ์˜ˆ์ œ์—์„œ๋Š” ์ž๋™ ์Œ์„ฑ ์ธ์‹์„ ๊ณผ์—…์œผ๋กœ ์„ ํƒํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค:

>>> import torch
>>> from transformers import pipeline

>>> speech_recognizer = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")

๋ฐ์ดํ„ฐ์…‹์„ ๋กœ๋“œํ•  ์ฐจ๋ก€์ž…๋‹ˆ๋‹ค. (์ž์„ธํ•œ ๋‚ด์šฉ์€ ๐Ÿค— Datasets ์‹œ์ž‘ํ•˜๊ธฐ์„ ์ฐธ์กฐํ•˜์„ธ์š”) ์—ฌ๊ธฐ์—์„œ๋Š” MInDS-14 ๋ฐ์ดํ„ฐ์…‹์„ ๋กœ๋“œํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค:

>>> from datasets import load_dataset, Audio

>>> dataset = load_dataset("PolyAI/minds14", name="en-US", split="train")  # doctest: +IGNORE_RESULT

๋ฐ์ดํ„ฐ์…‹์˜ ์ƒ˜ํ”Œ๋ง ๋ ˆ์ดํŠธ๊ฐ€ ๊ธฐ์กด ๋ชจ๋ธ์ธ facebook/wav2vec2-base-960h์˜ ํ›ˆ๋ จ ๋‹น์‹œ ์ƒ˜ํ”Œ๋ง ๋ ˆ์ดํŠธ์™€ ์ผ์น˜ํ•˜๋Š”์ง€ ํ™•์ธํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค:

>>> dataset = dataset.cast_column("audio", Audio(sampling_rate=speech_recognizer.feature_extractor.sampling_rate))

The audio files are automatically loaded and resampled when you access the "audio" column. Extract the raw waveform arrays from the first 4 samples and pass them to the pipeline as a list:

>>> result = speech_recognizer(dataset[:4]["audio"])
>>> print([d["text"] for d in result])
['I WOULD LIKE TO SET UP A JOINT ACCOUNT WITH MY PARTNER HOW DO I PROCEED WITH DOING THAT', "FONDERING HOW I'D SET UP A JOIN TO HELL T WITH MY WIFE AND WHERE THE AP MIGHT BE", "I I'D LIKE TOY SET UP A JOINT ACCOUNT WITH MY PARTNER I'M NOT SEEING THE OPTION TO DO IT ON THE APSO I CALLED IN TO GET SOME HELP CAN I JUST DO IT OVER THE PHONE WITH YOU AND GIVE YOU THE INFORMATION OR SHOULD I DO IT IN THE AP AN I'M MISSING SOMETHING UQUETTE HAD PREFERRED TO JUST DO IT OVER THE PHONE OF POSSIBLE THINGS", 'HOW DO I FURN A JOINA COUT']

์Œ์„ฑ์ด๋‚˜ ๋น„์ „๊ณผ ๊ฐ™์ด ์ž…๋ ฅ์ด ํฐ ๋Œ€๊ทœ๋ชจ ๋ฐ์ดํ„ฐ์…‹์˜ ๊ฒฝ์šฐ, ๋ชจ๋“  ์ž…๋ ฅ์„ ๋ฉ”๋ชจ๋ฆฌ์— ๋กœ๋“œํ•˜๋ ค๋ฉด ๋ฆฌ์ŠคํŠธ ๋Œ€์‹  ์ œ๋„ˆ๋ ˆ์ดํ„ฐ ํ˜•ํƒœ๋กœ ์ „๋‹ฌํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ Pipelines API ์ฐธ์กฐ๋ฅผ ํ™•์ธํ•˜์„ธ์š”.

ํŒŒ์ดํ”„๋ผ์ธ์—์„œ ๋‹ค๋ฅธ ๋ชจ๋ธ๊ณผ ํ† ํฌ๋‚˜์ด์ € ์‚ฌ์šฉํ•˜๊ธฐ [[use-another-model-and-tokenizer-in-the-pipeline]]

[pipeline]์€ Hub์˜ ๋ชจ๋“  ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์—, [pipeline]์„ ๋‹ค๋ฅธ ์šฉ๋„์— ๋งž๊ฒŒ ์‰ฝ๊ฒŒ ์ˆ˜์ •ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, ํ”„๋ž‘์Šค์–ด ํ…์ŠคํŠธ๋ฅผ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ๋Š” ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด์„  Hub์˜ ํƒœ๊ทธ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ ์ ˆํ•œ ๋ชจ๋ธ์„ ํ•„ํ„ฐ๋งํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค. ํ•„ํ„ฐ๋ง๋œ ๊ฒฐ๊ณผ์˜ ์ƒ์œ„ ํ•ญ๋ชฉ์œผ๋กœ๋Š” ํ”„๋ž‘์Šค์–ด ํ…์ŠคํŠธ์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋Š” ๋‹ค๊ตญ์–ด BERT ๋ชจ๋ธ์ด ๋ฐ˜ํ™˜๋ฉ๋‹ˆ๋‹ค:

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"

In PyTorch, use [`AutoModelForSequenceClassification`] and [`AutoTokenizer`] to load the pretrained model and its associated tokenizer (more on an [`AutoClass`] in the next section):

>>> from transformers import AutoTokenizer, AutoModelForSequenceClassification

>>> model = AutoModelForSequenceClassification.from_pretrained(model_name)
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)

In TensorFlow, use [`TFAutoModelForSequenceClassification`] and [`AutoTokenizer`] to load the pretrained model and its associated tokenizer (more on a [`TFAutoClass`] in the next section):

>>> from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

>>> model = TFAutoModelForSequenceClassification.from_pretrained(model_name)
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)

[pipeline]์—์„œ ๋ชจ๋ธ๊ณผ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์ง€์ •ํ•˜๋ฉด, ์ด์ œ classifier๋ฅผ ํ”„๋ž‘์Šค์–ด ํ…์ŠคํŠธ์— ์ ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:

>>> classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer)
>>> classifier("Nous sommes très heureux de vous présenter la bibliothèque 🤗 Transformers.")
[{'label': '5 stars', 'score': 0.7273}]

If you can't find a suitable model, you need to fine-tune a pretrained model on your data. See the fine-tuning tutorial to learn how. After you have fine-tuned your pretrained model, please consider sharing it with the community on the Hub to help democratize machine learning! 🤗

AutoClass [[autoclass]]

The [AutoModelForSequenceClassification] and [AutoTokenizer] classes work together to power the [pipeline] you used above. An AutoClass is a shortcut that automatically retrieves the architecture of a pretrained model from its name or path. You only need to select the appropriate AutoClass for your task and its associated preprocessing class.

Let's return to the example from the previous section and see how to use an AutoClass to replicate the results of the [pipeline].
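
Conceptually, an AutoClass behaves like a lookup from a checkpoint's declared model type to a concrete class. The toy sketch below illustrates that idea in plain Python; the registry, the class names, the checkpoint names, and the `from_name` helper are all hypothetical and do not reproduce the library's actual implementation.

```python
class ToyBertClassifier:
    """Hypothetical stand-in for a BERT-style classification model."""


class ToyDistilBertClassifier:
    """Hypothetical stand-in for a DistilBERT-style classification model."""


# Hypothetical registry: model type -> class to instantiate.
REGISTRY = {
    "bert": ToyBertClassifier,
    "distilbert": ToyDistilBertClassifier,
}

# Hypothetical checkpoint metadata, standing in for a stored configuration.
CHECKPOINTS = {
    "toy/bert-sentiment": {"model_type": "bert"},
    "toy/distilbert-sentiment": {"model_type": "distilbert"},
}


def from_name(name):
    """Pick the class from the checkpoint's model type and instantiate it."""
    model_type = CHECKPOINTS[name]["model_type"]
    return REGISTRY[model_type]()


model = from_name("toy/distilbert-sentiment")
print(type(model).__name__)
```

The caller only supplies a name; the choice of architecture follows from the metadata stored with the checkpoint.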

AutoTokenizer [[autotokenizer]]

ํ† ํฌ๋‚˜์ด์ €๋Š” ํ…์ŠคํŠธ๋ฅผ ๋ชจ๋ธ์˜ ์ž…๋ ฅ์œผ๋กœ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด ์ˆซ์ž ๋ฐฐ์—ด ํ˜•ํƒœ๋กœ ์ „์ฒ˜๋ฆฌํ•˜๋Š” ์—ญํ• ์„ ๋‹ด๋‹นํ•ฉ๋‹ˆ๋‹ค. ํ† ํฐํ™” ๊ณผ์ •์—๋Š” ๋‹จ์–ด๋ฅผ ์–ด๋””์—์„œ ๋Š์„์ง€, ์–ด๋Š ์ˆ˜์ค€๊นŒ์ง€ ๋‚˜๋ˆŒ์ง€์™€ ๊ฐ™์€ ์—ฌ๋Ÿฌ ๊ทœ์น™๋“ค์ด ์žˆ์Šต๋‹ˆ๋‹ค (ํ† ํฐํ™”์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ ํ† ํฌ๋‚˜์ด์ € ์š”์•ฝ์„ ์ฐธ์กฐํ•˜์„ธ์š”). ๊ฐ€์žฅ ์ค‘์š”ํ•œ ์ ์€ ๋ชจ๋ธ์ด ์‚ฌ์ „ ํ›ˆ๋ จ๋œ ๋ชจ๋ธ๊ณผ ๋™์ผํ•œ ํ† ํฐํ™” ๊ทœ์น™์„ ์‚ฌ์šฉํ•˜๋„๋ก ๋™์ผํ•œ ๋ชจ๋ธ ์ด๋ฆ„์œผ๋กœ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์ธ์Šคํ„ด์Šคํ™”ํ•ด์•ผ ํ•œ๋‹ค๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.

Load a tokenizer with [AutoTokenizer]:

>>> from transformers import AutoTokenizer

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)

Pass your text to the tokenizer:

>>> encoding = tokenizer("We are very happy to show you the 🤗 Transformers library.")
>>> print(encoding)
{'input_ids': [101, 11312, 10320, 12495, 19308, 10114, 11391, 10855, 10103, 100, 58263, 13299, 119, 102],
 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}

ํ† ํฌ๋‚˜์ด์ €๋Š” ๋‹ค์Œ์„ ํฌํ•จํ•œ ๋”•์…”๋„ˆ๋ฆฌ๋ฅผ ๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค:

  • input_ids: ํ† ํฐ์˜ ์ˆซ์ž ํ‘œํ˜„.
  • attention_mask: ์–ด๋–ค ํ† ํฐ์— ์ฃผ์˜๋ฅผ ๊ธฐ์šธ์—ฌ์•ผ ํ•˜๋Š”์ง€๋ฅผ ๋‚˜ํƒ€๋ƒ…๋‹ˆ๋‹ค.

ํ† ํฌ๋‚˜์ด์ €๋Š” ์ž…๋ ฅ์„ ๋ฆฌ์ŠคํŠธ ํ˜•ํƒœ๋กœ๋„ ๋ฐ›์„ ์ˆ˜ ์žˆ์œผ๋ฉฐ, ํ…์ŠคํŠธ๋ฅผ ํŒจ๋”ฉํ•˜๊ณ  ์ž˜๋ผ๋‚ด์–ด ์ผ์ •ํ•œ ๊ธธ์ด์˜ ๋ฌถ์Œ์„ ๋ฐ˜ํ™˜ํ•  ์ˆ˜๋„ ์žˆ์Šต๋‹ˆ๋‹ค:

>>> pt_batch = tokenizer(
...     ["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."],
...     padding=True,
...     truncation=True,
...     max_length=512,
...     return_tensors="pt",
... )
>>> tf_batch = tokenizer(
...     ["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."],
...     padding=True,
...     truncation=True,
...     max_length=512,
...     return_tensors="tf",
... )
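
What padding and truncation do to a batch can be sketched without the library. The following is a simplified illustration, not the tokenizer's implementation: the token ids are made up, `pad_id=0` is an assumption, and a real tokenizer takes more care with special tokens when truncating.

```python
def pad_and_truncate(batch, max_length, pad_id=0):
    """Truncate each sequence to `max_length`, then pad to the longest one left."""
    truncated = [ids[:max_length] for ids in batch]
    longest = max(len(ids) for ids in truncated)
    input_ids, attention_mask = [], []
    for ids in truncated:
        n_pad = longest - len(ids)
        input_ids.append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding the model should ignore.
        attention_mask.append([1] * len(ids) + [0] * n_pad)
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_and_truncate([[101, 7, 8, 9, 102], [101, 7, 102]], max_length=4)
print(batch)
```

Every row in the result has the same length, which is what allows the batch to be stacked into a single tensor.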

Check out the preprocessing tutorial for more details about tokenization, and for how to use [AutoImageProcessor], [AutoFeatureExtractor], and [AutoProcessor] to preprocess image, audio, and multimodal inputs.

AutoModel [[automodel]]

🤗 Transformers provides a simple and unified way to load pretrained instances. This means you can load an [`AutoModel`] the same way you load an [`AutoTokenizer`]. The only difference is that you need to select the correct [`AutoModel`] for the task. For text (or sequence) classification, load [`AutoModelForSequenceClassification`]:

>>> from transformers import AutoModelForSequenceClassification

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(model_name)

See the task summary for the tasks supported by an [AutoModel] class.

Now pass your preprocessed batch of inputs directly to the model. You only need to unpack the dictionary by prefixing it with **:

>>> pt_outputs = pt_model(**pt_batch)

๋ชจ๋ธ์˜ ์ตœ์ข… ํ™œ์„ฑํ™” ํ•จ์ˆ˜ ์ถœ๋ ฅ์€ logits ์†์„ฑ์— ๋‹ด๊ฒจ์žˆ์Šต๋‹ˆ๋‹ค. logits์— softmax ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•˜์—ฌ ํ™•๋ฅ ์„ ์–ป์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:

>>> from torch import nn

>>> pt_predictions = nn.functional.softmax(pt_outputs.logits, dim=-1)
>>> print(pt_predictions)
tensor([[0.0021, 0.0018, 0.0115, 0.2121, 0.7725],
        [0.2084, 0.1826, 0.1969, 0.1755, 0.2365]], grad_fn=<SoftmaxBackward0>)

In TensorFlow, 🤗 Transformers provides the same simple and unified way to load pretrained instances. This means you can load a [`TFAutoModel`] the same way you load an [`AutoTokenizer`]. The only difference is that you need to select the correct [`TFAutoModel`] for the task. For text (or sequence) classification, load [`TFAutoModelForSequenceClassification`]:

>>> from transformers import TFAutoModelForSequenceClassification

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

See the task summary for the tasks supported by an [AutoModel] class.

Now pass your preprocessed batch of inputs directly to the model. You can pass the tensors as they are:

>>> tf_outputs = tf_model(tf_batch)

๋ชจ๋ธ์˜ ์ตœ์ข… ํ™œ์„ฑํ™” ํ•จ์ˆ˜ ์ถœ๋ ฅ์€ logits ์†์„ฑ์— ๋‹ด๊ฒจ์žˆ์Šต๋‹ˆ๋‹ค. logits์— softmax ํ•จ์ˆ˜๋ฅผ ์ ์šฉํ•˜์—ฌ ํ™•๋ฅ ์„ ์–ป์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค:

>>> import tensorflow as tf

>>> tf_predictions = tf.nn.softmax(tf_outputs.logits, axis=-1)
>>> tf_predictions  # doctest: +IGNORE_RESULT

All 🤗 Transformers models (PyTorch or TensorFlow) output the tensors before the final activation function (like softmax), because the final activation function is often fused with the loss. Model outputs are special dataclasses, so their attributes are autocompleted in an IDE. Model outputs behave like a tuple or a dictionary (you can index them with an integer, a slice, or a string), and attributes that are None are ignored.
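
The reason for returning logits can be made concrete with a small standard-library sketch. It is an illustration under stated assumptions rather than the library's code: the logit values are made up, and the two helper functions are hypothetical. It shows that cross-entropy can be computed directly from logits with the log-sum-exp trick, which folds the softmax into the loss.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, target):
    """Cross-entropy computed from logits via log-sum-exp (softmax fused in)."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(x - shift) for x in logits))
    return log_sum_exp - logits[target]


logits = [2.0, 1.0, 0.1]  # illustrative values, not real model output
probs = softmax(logits)
loss = cross_entropy_from_logits(logits, target=0)
```

Here `loss` agrees with `-log(probs[0])` up to floating-point error, but it is obtained without ever forming the probabilities, which is why a separate softmax at the end of the model would be redundant during training.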

๋ชจ๋ธ ์ €์žฅํ•˜๊ธฐ [[save-a-model]]

๋ฏธ์„ธ์กฐ์ •๋œ ๋ชจ๋ธ์„ ํ† ํฌ๋‚˜์ด์ €์™€ ํ•จ๊ป˜ ์ €์žฅํ•˜๋ ค๋ฉด [`PreTrainedModel.save_pretrained`]๋ฅผ ์‚ฌ์šฉํ•˜์„ธ์š”:
>>> pt_save_directory = "./pt_save_pretrained"
>>> tokenizer.save_pretrained(pt_save_directory)  # doctest: +IGNORE_RESULT
>>> pt_model.save_pretrained(pt_save_directory)

๋ชจ๋ธ์„ ๋‹ค์‹œ ์‚ฌ์šฉํ•˜๋ ค๋ฉด [PreTrainedModel.from_pretrained]๋กœ ๋ชจ๋ธ์„ ๋‹ค์‹œ ๋กœ๋“œํ•˜์„ธ์š”:

>>> pt_model = AutoModelForSequenceClassification.from_pretrained("./pt_save_pretrained")

In TensorFlow, once your model is fine-tuned, you can save it together with its tokenizer using [`TFPreTrainedModel.save_pretrained`]:

>>> tf_save_directory = "./tf_save_pretrained"
>>> tokenizer.save_pretrained(tf_save_directory)  # doctest: +IGNORE_RESULT
>>> tf_model.save_pretrained(tf_save_directory)

๋ชจ๋ธ์„ ๋‹ค์‹œ ์‚ฌ์šฉํ•˜๋ ค๋ฉด [TFPreTrainedModel.from_pretrained]๋กœ ๋ชจ๋ธ์„ ๋‹ค์‹œ ๋กœ๋“œํ•˜์„ธ์š”:

>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained("./tf_save_pretrained")

One particularly useful 🤗 Transformers feature is the ability to save a model as a PyTorch or TensorFlow model and reload it in the other framework. The from_pt or from_tf parameter converts the model from one framework to the other:

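>>> from transformers import AutoModelForSequenceClassification, AutoTokenizer

>>> # PyTorch: load the checkpoint that was saved from TensorFlow
>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)

>>> from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

>>> # TensorFlow: load the checkpoint that was saved from PyTorch
>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)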

Custom model builds [[custom-model-builds]]

๋ชจ๋ธ์˜ ๊ตฌ์„ฑ ํด๋ž˜์Šค๋ฅผ ์ˆ˜์ •ํ•˜์—ฌ ๋ชจ๋ธ์˜ ๊ตฌ์กฐ๋ฅผ ๋ฐ”๊ฟ€ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. (์€๋‹‰์ธต์ด๋‚˜ ์–ดํ…์…˜ ํ—ค๋“œ์˜ ์ˆ˜์™€ ๊ฐ™์€) ๋ชจ๋ธ์˜ ์†์„ฑ์€ ๊ตฌ์„ฑ์—์„œ ์ง€์ •๋˜๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค. ์ปค์Šคํ…€ ๊ตฌ์„ฑ ํด๋ž˜์Šค๋กœ ๋ชจ๋ธ์„ ๋งŒ๋“ค๋ฉด ์ฒ˜์Œ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋ชจ๋ธ ์†์„ฑ์€ ๋ฌด์ž‘์œ„๋กœ ์ดˆ๊ธฐํ™”๋˜๋ฏ€๋กœ ์˜๋ฏธ ์žˆ๋Š” ๊ฒฐ๊ณผ๋ฅผ ์–ป์œผ๋ ค๋ฉด ๋จผ์ € ๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œ์ผœ์•ผ ํ•ฉ๋‹ˆ๋‹ค.

Start by importing [AutoConfig], and then load the pretrained model you want to modify. Within [AutoConfig.from_pretrained], you can specify the attribute you want to change, such as the number of attention heads:

>>> from transformers import AutoConfig

>>> my_config = AutoConfig.from_pretrained("distilbert-base-uncased", n_heads=12)

In PyTorch, create a model from your custom configuration with [`AutoModel.from_config`]:

>>> from transformers import AutoModel

>>> my_model = AutoModel.from_config(my_config)

In TensorFlow, create a model from your custom configuration with [`TFAutoModel.from_config`]:

>>> from transformers import TFAutoModel

>>> my_model = TFAutoModel.from_config(my_config)

See the Create a custom architecture guide for more information about building custom configurations.

Trainer - a PyTorch optimized training loop [[trainer-a-pytorch-optimized-training-loop]]

All models are a standard torch.nn.Module, so you can use them in any typical training loop. While you can write your own training loop, 🤗 Transformers provides a [Trainer] class for PyTorch. It contains the basic training loop and adds functionality for features such as distributed training and mixed precision.

Depending on your task, you typically pass the following parameters to [Trainer]:

  1. Start with a [PreTrainedModel] or a torch.nn.Module:

    >>> from transformers import AutoModelForSequenceClassification
    
    >>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
    
  2. [TrainingArguments] contains the model hyperparameters, such as the learning rate, batch size, and number of epochs to train for. The default values are used if you don't specify any training arguments:

    >>> from transformers import TrainingArguments
    
    >>> training_args = TrainingArguments(
    ...     output_dir="path/to/save/folder/",
    ...     learning_rate=2e-5,
    ...     per_device_train_batch_size=8,
    ...     per_device_eval_batch_size=8,
    ...     num_train_epochs=2,
    ... )
    
  3. ํ† ํฌ๋‚˜์ด์ €, ์ด๋ฏธ์ง€ ํ”„๋กœ์„ธ์„œ, ํŠน์ง• ์ถ”์ถœ๊ธฐ(feature extractor) ๋˜๋Š” ํ”„๋กœ์„ธ์„œ์™€ ์ „์ฒ˜๋ฆฌ ํด๋ž˜์Šค๋ฅผ ๋กœ๋“œํ•˜์„ธ์š”:

    >>> from transformers import AutoTokenizer
    
    >>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    
  4. ๋ฐ์ดํ„ฐ์…‹์„ ๋กœ๋“œํ•˜์„ธ์š”:

    >>> from datasets import load_dataset
    
    >>> dataset = load_dataset("rotten_tomatoes")  # doctest: +IGNORE_RESULT
    
  5. ๋ฐ์ดํ„ฐ์…‹์„ ํ† ํฐํ™”ํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ ์ƒ์„ฑํ•˜์„ธ์š”:

    >>> def tokenize_dataset(dataset):
    ...     return tokenizer(dataset["text"])
    

    Then apply it over the entire dataset with [~datasets.Dataset.map]:

    >>> dataset = dataset.map(tokenize_dataset, batched=True)
    
  6. Use a [DataCollatorWithPadding] to create a batch of examples from your dataset:

    >>> from transformers import DataCollatorWithPadding
    
    >>> data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
    

Now gather all these classes in [Trainer]:

>>> from transformers import Trainer

>>> trainer = Trainer(
...     model=model,
...     args=training_args,
...     train_dataset=dataset["train"],
...     eval_dataset=dataset["test"],
...     tokenizer=tokenizer,
...     data_collator=data_collator,
... )  # doctest: +SKIP

When you're ready, call [~Trainer.train] to start training:

>>> trainer.train()  # doctest: +SKIP

๋ฒˆ์—ญ์ด๋‚˜ ์š”์•ฝ๊ณผ ๊ฐ™์ด ์‹œํ€€์Šค-์‹œํ€€์Šค ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๋Š” ๊ณผ์—…์—๋Š” [Seq2SeqTrainer] ๋ฐ [Seq2SeqTrainingArguments] ํด๋ž˜์Šค๋ฅผ ์‚ฌ์šฉํ•˜์„ธ์š”.

You can also customize the training loop by subclassing [Trainer] and overriding its methods. This lets you change features such as the loss function, optimizer, and scheduler. See the [Trainer] reference for the methods that can be overridden.

The other way to customize the training loop is to use Callbacks. You can use callbacks to integrate with other libraries, and to inspect the training loop in order to report on progress or stop training early. Callbacks do not modify anything in the training loop itself. To change something like the loss function, you need to subclass [Trainer] instead.
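
The division of labour described above, where callbacks observe the loop and can request a stop but never change what the loop computes, can be sketched in plain Python. Everything in this sketch is hypothetical: `ToyCallback`, `StopWhenLossBelow`, and `toy_training_loop` are illustrative names and are not the 🤗 Transformers callback API.

```python
class ToyCallback:
    """Hypothetical callback interface: observes steps, may request a stop."""

    def on_step_end(self, step, loss):
        return False  # returning True asks the loop to stop


class StopWhenLossBelow(ToyCallback):
    """Request a stop as soon as the observed loss drops below a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def on_step_end(self, step, loss):
        return loss < self.threshold


def toy_training_loop(losses, callbacks):
    """Walk through precomputed losses; callbacks observe but never alter them."""
    steps_run = 0
    for step, loss in enumerate(losses):
        steps_run += 1
        if any(cb.on_step_end(step, loss) for cb in callbacks):
            break
    return steps_run


steps = toy_training_loop([0.9, 0.6, 0.3, 0.2, 0.1], [StopWhenLossBelow(0.5)])
print(steps)
```

The callback ends the run early, but it has no way to change how a loss value is produced; that kind of change belongs in the loop itself, which corresponds to subclassing in the text above.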

Train with TensorFlow [[train-with-tensorflow]]

All models are a standard tf.keras.Model, so they can be trained in TensorFlow with the Keras API. 🤗 Transformers provides the [~TFPreTrainedModel.prepare_tf_dataset] method to easily load your dataset as a tf.data.Dataset, so you can start training right away with Keras' compile and fit methods.

  1. Start with a [TFPreTrainedModel] or a tf.keras.Model:

    >>> from transformers import TFAutoModelForSequenceClassification
    
    >>> model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
    
  2. ํ† ํฌ๋‚˜์ด์ €, ์ด๋ฏธ์ง€ ํ”„๋กœ์„ธ์„œ, ํŠน์ง• ์ถ”์ถœ๊ธฐ(feature extractor) ๋˜๋Š” ํ”„๋กœ์„ธ์„œ์™€ ๊ฐ™์€ ์ „์ฒ˜๋ฆฌ ํด๋ž˜์Šค๋ฅผ ๋กœ๋“œํ•˜์„ธ์š”:

    >>> from transformers import AutoTokenizer
    
    >>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    
  3. ๋ฐ์ดํ„ฐ์…‹์„ ํ† ํฐํ™”ํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ ์ƒ์„ฑํ•˜์„ธ์š”:

    >>> def tokenize_dataset(dataset):
    ...     return tokenizer(dataset["text"])  # doctest: +SKIP
    
  4. Apply the tokenization function over the entire dataset with [~datasets.Dataset.map], and then pass the dataset and tokenizer to [~TFPreTrainedModel.prepare_tf_dataset]. You can also change the batch size and shuffle the dataset here:

    >>> dataset = dataset.map(tokenize_dataset)  # doctest: +SKIP
    >>> tf_dataset = model.prepare_tf_dataset(
    ...     dataset["train"], batch_size=16, shuffle=True, tokenizer=tokenizer
    ... )  # doctest: +SKIP
    
  5. ์ค€๋น„๋˜์—ˆ์œผ๋ฉด compile ๋ฐ fit๋ฅผ ํ˜ธ์ถœํ•˜์—ฌ ํ›ˆ๋ จ์„ ์‹œ์ž‘ํ•˜์„ธ์š”. ๐Ÿค— Transformers์˜ ๋ชจ๋“  ๋ชจ๋ธ์€ ๊ณผ์—…๊ณผ ๊ด€๋ จ๋œ ๊ธฐ๋ณธ ์†์‹ค ํ•จ์ˆ˜๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์œผ๋ฏ€๋กœ ๋ช…์‹œ์ ์œผ๋กœ ์ง€์ •ํ•˜์ง€ ์•Š์•„๋„ ๋ฉ๋‹ˆ๋‹ค:

    >>> from tensorflow.keras.optimizers import Adam
    
    >>> model.compile(optimizer=Adam(3e-5))  # No loss argument!
    >>> model.fit(tf_dataset)  # doctest: +SKIP
    

๋‹ค์Œ ๋‹จ๊ณ„๋Š” ๋ฌด์—‡์ธ๊ฐ€์š”? [[whats-next]]

๐Ÿค— Transformers ๋‘˜๋Ÿฌ๋ณด๊ธฐ๋ฅผ ๋ชจ๋‘ ์ฝ์œผ์…จ๋‹ค๋ฉด, ๊ฐ€์ด๋“œ๋ฅผ ์‚ดํŽด๋ณด๊ณ  ๋” ๊ตฌ์ฒด์ ์ธ ๊ฒƒ์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์•Œ์•„๋ณด์„ธ์š”. ์ด๋ฅผํ…Œ๋ฉด ์ปค์Šคํ…€ ๋ชจ๋ธ ๊ตฌ์ถ•ํ•˜๋Š” ๋ฐฉ๋ฒ•, ๊ณผ์—…์— ์•Œ๋งž๊ฒŒ ๋ชจ๋ธ์„ ๋ฏธ์„ธ์กฐ์ •ํ•˜๋Š” ๋ฐฉ๋ฒ•, ์Šคํฌ๋ฆฝํŠธ๋กœ ๋ชจ๋ธ ํ›ˆ๋ จํ•˜๋Š” ๋ฐฉ๋ฒ• ๋“ฑ์ด ์žˆ์Šต๋‹ˆ๋‹ค. ๐Ÿค— Transformers ํ•ต์‹ฌ ๊ฐœ๋…์— ๋Œ€ํ•ด ๋” ์•Œ์•„๋ณด๋ ค๋ฉด ์ปคํ”ผ ํ•œ ์ž” ๋“ค๊ณ  ๊ฐœ๋… ๊ฐ€์ด๋“œ๋ฅผ ์‚ดํŽด๋ณด์„ธ์š”!