# Quick tour[[quick-tour]]

[[open-in-colab]]

Get up and running with 🤗 Transformers! The quick tour is written for developers and everyday users alike. It shows you how to use the [`pipeline`] for inference, how to load a pretrained model and preprocessor with an [AutoClass](./model_doc/auto), and how to quickly train a model with PyTorch or TensorFlow. If you want to learn the basics, we recommend checking out the tutorials or the [course](https://huggingface.co/course/chapter1/1) for more in-depth explanations of the concepts introduced here.

Before you begin, make sure you have all the necessary libraries installed:

```bash
!pip install transformers datasets
```

You'll also need to install your preferred machine learning framework:

```bash
pip install torch
```

```bash
pip install tensorflow
```
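Once everything is installed, a quick sanity check can confirm the setup works end to end. This is an optional sketch; the first call downloads a default English model, and the exact score you see may differ:

```py
>>> from transformers import pipeline

>>> # downloads and runs a default sentiment-analysis model on first use
>>> pipeline("sentiment-analysis")("we love you")  # doctest: +SKIP
[{'label': 'POSITIVE', 'score': 0.9998}]
```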
## Pipeline

The [`pipeline`] is the easiest way to use a pretrained model for inference. You can use the [`pipeline`] out-of-the-box for many tasks across different modalities. Some examples of supported tasks are shown in the table below:

| **Task**                     | **Description**                                                                 | **Modality**    | **Pipeline identifier**                       |
|------------------------------|---------------------------------------------------------------------------------|-----------------|-----------------------------------------------|
| Text classification          | assign a label to a given sequence of text                                      | NLP             | pipeline(task="sentiment-analysis")           |
| Text generation              | generate text that follows a given prompt                                       | NLP             | pipeline(task="text-generation")              |
| Named entity recognition     | assign a label to each token in a sequence (people, organizations, locations, etc.) | NLP         | pipeline(task="ner")                          |
| Question answering           | extract the correct answer given some context and a question                    | NLP             | pipeline(task="question-answering")           |
| Fill-mask                    | predict the correct masked token in a sequence                                  | NLP             | pipeline(task="fill-mask")                    |
| Summarization                | generate a summary of a sequence of text or a document                          | NLP             | pipeline(task="summarization")                |
| Translation                  | translate text from one language into another                                   | NLP             | pipeline(task="translation")                  |
| Image classification         | assign a label to an image                                                      | Computer vision | pipeline(task="image-classification")         |
| Image segmentation           | assign a label to each individual pixel of an image (supports semantic, panoptic, and instance segmentation) | Computer vision | pipeline(task="image-segmentation") |
| Object detection             | predict the bounding boxes and classes of objects in an image                   | Computer vision | pipeline(task="object-detection")             |
| Audio classification         | assign a label to an audio file                                                 | Audio           | pipeline(task="audio-classification")         |
| Automatic speech recognition | transcribe speech in an audio file into text                                    | Audio           | pipeline(task="automatic-speech-recognition") |
| Visual question answering    | answer a question about an image, given the image and the question              | Multimodal      | pipeline(task="vqa")                          |

Start by creating an instance of [`pipeline`] and specifying the task you want to use it for. All the tasks above work with the [`pipeline`]; for a complete list of supported tasks, check out the [pipeline API reference](./main_classes/pipelines).

As a simple example, let's apply the [`pipeline`] to sentiment analysis:

```py
>>> from transformers import pipeline

>>> classifier = pipeline("sentiment-analysis")
```

The [`pipeline`] downloads and caches a default [pretrained model (English)](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) and tokenizer for sentiment analysis. Now you can use the `classifier` on your target text:

```py
>>> classifier("We are very happy to show you the 🤗 Transformers library.")
[{'label': 'POSITIVE', 'score': 0.9998}]
```

If you have more than one input, pass them to the [`pipeline`] as a list to get back a list of dictionaries:

```py
>>> results = classifier(["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."])
>>> for result in results:
...     print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
label: POSITIVE, with score: 0.9998
label: NEGATIVE, with score: 0.5309
```

The [`pipeline`] can also iterate over an entire dataset for a given task. Let's try automatic speech recognition:

```py
>>> import torch
>>> from transformers import pipeline

>>> speech_recognizer = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
```

Next, load an audio dataset to iterate over (see the 🤗 Datasets [Quick Start](https://huggingface.co/docs/datasets/quickstart#audio) for more details). How about the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset?

```py
>>> from datasets import load_dataset, Audio

>>> dataset = load_dataset("PolyAI/minds14", name="en-US", split="train")  # doctest: +IGNORE_RESULT
```

Make sure the sampling rate of the dataset matches the sampling rate [`facebook/wav2vec2-base-960h`](https://huggingface.co/facebook/wav2vec2-base-960h) was trained on:

```py
>>> dataset = dataset.cast_column("audio", Audio(sampling_rate=speech_recognizer.feature_extractor.sampling_rate))
```

The audio files are automatically loaded and resampled when the `"audio"` column is accessed. Let's extract the audio from the first 4 samples and pass them to the pipeline as a list:

```py
>>> result = speech_recognizer(dataset[:4]["audio"])
>>> print([d["text"] for d in result])
['I WOULD LIKE TO SET UP A JOINT ACCOUNT WITH MY PARTNER HOW DO I PROCEED WITH DOING THAT', "FODING HOW I'D SET UP A JOIN TO HET WITH MY WIFE AND WHERE THE AP MIGHT BE", "I I'D LIKE TOY SET UP A JOINT ACCOUNT WITH MY PARTNER I'M NOT SEEING THE OPTION TO DO IT ON THE AP SO I CALLED IN TO GET SOME HELP CAN I JUST DO IT OVER THE PHONE WITH YOU AND GIVE YOU THE INFORMATION OR SHOULD I DO IT IN THE AP AND I'M MISSING SOMETHING UQUETTE HAD PREFERRED TO JUST DO IT OVER THE PHONE OF POSSIBLE THINGS", 'HOW DO I THURN A JOIN A COUNT']
```

For larger datasets where the inputs are big (as in speech or vision), you'll want to pass a generator instead of a list so you don't have to hold every input in memory at once. Take a look at the [pipeline API reference](./main_classes/pipelines) for more information.
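As a minimal sketch of that generator pattern (the `data` helper below is a hypothetical name; pipelines consume a generator lazily and yield results one at a time):

```py
>>> def data():
...     # yield one audio sample at a time instead of materializing a list
...     for sample in dataset:
...         yield sample["audio"]

>>> for output in speech_recognizer(data()):  # doctest: +SKIP
...     print(output["text"])
```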
### ํŒŒ์ดํ”„๋ผ์ธ์—์„œ ๋‹ค๋ฅธ ๋ชจ๋ธ์ด๋‚˜ tokenizer ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•[[use-another-model-and-tokenizer-in-the-pipeline]] [`pipeline`]์€ [Hub](https://huggingface.co/models) ์† ๋ชจ๋“  ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์–ด, ์–ผ๋งˆ๋“ ์ง€ [`pipeline`]์„ ์‚ฌ์šฉํ•˜๊ณ  ์‹ถ์€๋Œ€๋กœ ๋ฐ”๊ฟ€ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด ํ”„๋ž‘์Šค์–ด ํ…์ŠคํŠธ๋ฅผ ๋‹ค๋ฃฐ ์ˆ˜ ์žˆ๋Š” ๋ชจ๋ธ์„ ๋งŒ๋“œ๋ ค๋ฉด, Hub์˜ ํƒœ๊ทธ๋กœ ์ ์ ˆํ•œ ๋ชจ๋ธ์„ ์ฐพ์•„๋ณด์„ธ์š”. ์ƒ์œ„ ๊ฒ€์ƒ‰ ๊ฒฐ๊ณผ๋กœ ๋œฌ ๊ฐ์ • ๋ถ„์„์„ ์œ„ํ•ด ํŒŒ์ธํŠœ๋‹๋œ ๋‹ค๊ตญ์–ด [BERT ๋ชจ๋ธ](https://huggingface.co/nlptown/bert-base-multilingual-uncased-sentiment)์ด ํ”„๋ž‘์Šค์–ด๋ฅผ ์ง€์›ํ•˜๋Š”๊ตฐ์š”. ```py >>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment" ``` [`AutoModelForSequenceClassification`]๊ณผ [`AutoTokenizer`]๋กœ ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ๊ณผ ํ•จ๊ป˜ ์—ฐ๊ด€๋œ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ๋ถˆ๋Ÿฌ์˜ต๋‹ˆ๋‹ค. (`AutoClass`์— ๋Œ€ํ•œ ๋‚ด์šฉ์€ ๋‹ค์Œ ์„น์…˜์—์„œ ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค) ```py >>> from transformers import AutoTokenizer, AutoModelForSequenceClassification >>> model = AutoModelForSequenceClassification.from_pretrained(model_name) >>> tokenizer = AutoTokenizer.from_pretrained(model_name) ``` [`TFAutoModelForSequenceClassification`]๊ณผ [`AutoTokenizer`]๋กœ ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ๊ณผ ํ•จ๊ป˜ ์—ฐ๊ด€๋œ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ๋ถˆ๋Ÿฌ์˜ต๋‹ˆ๋‹ค. (`TFAutoClass`์— ๋Œ€ํ•œ ๋‚ด์šฉ์€ ๋‹ค์Œ ์„น์…˜์—์„œ ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค) ```py >>> from transformers import AutoTokenizer, TFAutoModelForSequenceClassification >>> model = TFAutoModelForSequenceClassification.from_pretrained(model_name) >>> tokenizer = AutoTokenizer.from_pretrained(model_name) ``` [`pipeline`]์—์„œ ์‚ฌ์šฉํ•  ๋ชจ๋ธ๊ณผ ํ† ํฌ๋‚˜์ด์ €๋ฅผ ์ž…๋ ฅํ•˜๋ฉด ์ด์ œ (๊ฐ์ • ๋ถ„์„๊ธฐ์ธ) `classifier`๋ฅผ ํ”„๋ž‘์Šค์–ด ํ…์ŠคํŠธ์— ์ ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ```py >>> classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer) >>> classifier("Nous sommes trรจs heureux de vous prรฉsenter la bibliothรจque ๐Ÿค— Transformers.") [{'label': '5 stars', 'score': 0.7273}] ``` ํ•˜๊ณ ์‹ถ์€ ๊ฒƒ์— ์ ์šฉํ•  ๋งˆ๋•…ํ•œ ๋ชจ๋ธ์ด ์—†๋‹ค๋ฉด, ๊ฐ€์ง„ ๋ฐ์ดํ„ฐ๋กœ ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ์„ ํŒŒ์ธํŠœ๋‹ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋ฐฉ๋ฒ•์€ [ํŒŒ์ธํŠœ๋‹ ํŠœํ† ๋ฆฌ์–ผ](./training)์„ ์ฐธ๊ณ ํ•ด์ฃผ์„ธ์š”. ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ์˜ ํŒŒ์ธํŠœ๋‹์„ ๋งˆ์น˜์…จ์œผ๋ฉด, ๋ˆ„๊ตฌ๋‚˜ ๋จธ์‹ ๋Ÿฌ๋‹์„ ํ•  ์ˆ˜ ์žˆ๋„๋ก [๊ณต์œ ](./model_sharing)ํ•˜๋Š” ๊ฒƒ์„ ๊ณ ๋ คํ•ด์ฃผ์„ธ์š”. ๐Ÿค— ## AutoClass ๋‚ด๋ถ€์ ์œผ๋กœ ๋“ค์–ด๊ฐ€๋ฉด ์œ„์—์„œ ์‚ฌ์šฉํ–ˆ๋˜ [`pipeline`]์€ [`AutoModelForSequenceClassification`]๊ณผ [`AutoTokenizer`] ํด๋ž˜์Šค๋กœ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค. [AutoClass](./model_doc/auto)๋ž€ ์ด๋ฆ„์ด๋‚˜ ๊ฒฝ๋กœ๋ฅผ ๋ฐ›์œผ๋ฉด ๊ทธ์— ์•Œ๋งž๋Š” ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ์„ ๊ฐ€์ ธ์˜ค๋Š” '๋ฐ”๋กœ๊ฐ€๊ธฐ'๋ผ๊ณ  ๋ณผ ์ˆ˜ ์žˆ๋Š”๋ฐ์š”. ์›ํ•˜๋Š” ํƒœ์Šคํฌ์™€ ์ „์ฒ˜๋ฆฌ์— ์ ํ•ฉํ•œ `AutoClass`๋ฅผ ๊ณ ๋ฅด๊ธฐ๋งŒ ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค. ์ „์— ์‚ฌ์šฉํ–ˆ๋˜ ์˜ˆ์‹œ๋กœ ๋Œ์•„๊ฐ€์„œ `AutoClass`๋กœ [`pipeline`]๊ณผ ๋™์ผํ•œ ๊ฒฐ๊ณผ๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ๋Š” ๋ฐฉ๋ฒ•์„ ์•Œ์•„๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค. ### AutoTokenizer ํ† ํฌ๋‚˜์ด์ €๋Š” ์ „์ฒ˜๋ฆฌ๋ฅผ ๋‹ด๋‹นํ•˜๋ฉฐ, ํ…์ŠคํŠธ๋ฅผ ๋ชจ๋ธ์ด ๋ฐ›์„ ์ˆซ์ž ๋ฐฐ์—ด๋กœ ๋ฐ”๊ฟ‰๋‹ˆ๋‹ค. ํ† ํฐํ™” ๊ณผ์ •์—๋Š” ๋‹จ์–ด๋ฅผ ์–ด๋””์—์„œ ๋Š์„์ง€, ์–ผ๋งŒํผ ๋‚˜๋ˆŒ์ง€ ๋“ฑ์„ ํฌํ•จํ•œ ์—ฌ๋Ÿฌ ๊ทœ์น™์ด ์žˆ์Šต๋‹ˆ๋‹ค. ์ž์„ธํ•œ ๋‚ด์šฉ์€ [ํ† ํฌ๋‚˜์ด์ € ์š”์•ฝ](./tokenizer_summary)๋ฅผ ํ™•์ธํ•ด์ฃผ์„ธ์š”. ์ œ์ผ ์ค‘์š”ํ•œ ์ ์€ ๋ชจ๋ธ์ด ํ›ˆ๋ จ๋์„ ๋•Œ์™€ ๋™์ผํ•œ ํ† ํฐํ™” ๊ทœ์น™์„ ์“ฐ๋„๋ก ๋™์ผํ•œ ๋ชจ๋ธ ์ด๋ฆ„์œผ๋กœ ํ† ํฌ๋‚˜์ด์ € ์ธ์Šคํ„ด์Šค๋ฅผ ๋งŒ๋“ค์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. 
Load a tokenizer with [`AutoTokenizer`]:

```py
>>> from transformers import AutoTokenizer

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)
```

Pass your text to the tokenizer:

```py
>>> encoding = tokenizer("We are very happy to show you the 🤗 Transformers library.")
>>> print(encoding)
{'input_ids': [101, 11312, 10320, 12495, 19308, 10114, 11391, 10855, 10103, 100, 58263, 13299, 119, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
```

The tokenizer returns a dictionary containing:

* [input_ids](./glossary#input-ids): numerical representations of your tokens.
* [attention_mask](./glossary#attention-mask): indicates which tokens should be attended to.

A tokenizer can also accept a list of inputs, and pad and truncate the text to return a batch of uniform length:

```py
>>> pt_batch = tokenizer(
...     ["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."],
...     padding=True,
...     truncation=True,
...     max_length=512,
...     return_tensors="pt",
... )
```

In TensorFlow:

```py
>>> tf_batch = tokenizer(
...     ["We are very happy to show you the 🤗 Transformers library.", "We hope you don't hate it."],
...     padding=True,
...     truncation=True,
...     max_length=512,
...     return_tensors="tf",
... )
```

Check out the [preprocessing](./preprocessing) tutorial for more details about tokenization, and for how to use an [`AutoFeatureExtractor`] and [`AutoProcessor`] to preprocess image, audio, and multimodal inputs.

### AutoModel

🤗 Transformers provides a simple and unified way to load pretrained instances, which means you can load an [`AutoModel`] the same way you would load an [`AutoTokenizer`]. The only difference is selecting the [`AutoModel`] appropriate for the task. For text (or sequence) classification, you should load [`AutoModelForSequenceClassification`]:

```py
>>> from transformers import AutoModelForSequenceClassification

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(model_name)
```

See the [task summary](./task_summary) for the tasks supported by the [`AutoModel`] classes.

Now pass your preprocessed batch of inputs directly to the model. You just have to unpack the dictionary by prepending `**`:

```py
>>> pt_outputs = pt_model(**pt_batch)
```

The model's final activations are in the `logits` attribute. Apply the softmax function to the `logits` to retrieve probabilities:

```py
>>> from torch import nn

>>> pt_predictions = nn.functional.softmax(pt_outputs.logits, dim=-1)
>>> print(pt_predictions)
tensor([[0.0021, 0.0018, 0.0115, 0.2121, 0.7725],
        [0.2084, 0.1826, 0.1969, 0.1755, 0.2365]], grad_fn=<SoftmaxBackward0>)
```
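To read those probabilities as labels, one option is the `id2label` mapping stored on the model config. A brief sketch (the printed labels assume this checkpoint's star-rating label set, consistent with the '5 stars' output seen earlier):

```py
>>> predicted_classes = pt_predictions.argmax(dim=-1)

>>> # map each predicted class index to its human-readable label
>>> print([pt_model.config.id2label[i] for i in predicted_classes.tolist()])
['5 stars', '5 stars']
```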
In TensorFlow, load [`TFAutoModelForSequenceClassification`]:

```py
>>> from transformers import TFAutoModelForSequenceClassification

>>> model_name = "nlptown/bert-base-multilingual-uncased-sentiment"
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(model_name)
```

See the [task summary](./task_summary) for the tasks supported by the [`AutoModel`] classes.

Now pass your preprocessed batch of inputs directly to the model. You can pass the tensors as-is:

```py
>>> tf_outputs = tf_model(tf_batch)
```

The model's final activations are in the `logits` attribute. Apply the softmax function to the `logits` to retrieve probabilities:

```py
>>> import tensorflow as tf

>>> tf_predictions = tf.nn.softmax(tf_outputs.logits, axis=-1)
>>> tf_predictions  # doctest: +IGNORE_RESULT
```

All 🤗 Transformers models (PyTorch or TensorFlow) output the tensors *before* the final activation function (like softmax), because the final activation function is often fused with the loss. Model outputs are special dataclasses, so their attributes are autocompleted in an IDE. Model outputs also behave like a tuple or a dictionary (you can index them with an integer, a slice, or a string), in which case attributes that are `None` are ignored.

### Save a model[[save-a-model]]

Once your model is finetuned, you can save it together with its tokenizer using [`PreTrainedModel.save_pretrained`]:

```py
>>> pt_save_directory = "./pt_save_pretrained"
>>> tokenizer.save_pretrained(pt_save_directory)  # doctest: +IGNORE_RESULT
>>> pt_model.save_pretrained(pt_save_directory)
```

When you're ready to use the model again, reload it with [`PreTrainedModel.from_pretrained`]:

```py
>>> pt_model = AutoModelForSequenceClassification.from_pretrained("./pt_save_pretrained")
```

In TensorFlow, save the finetuned model together with its tokenizer using [`TFPreTrainedModel.save_pretrained`]:

```py
>>> tf_save_directory = "./tf_save_pretrained"
>>> tokenizer.save_pretrained(tf_save_directory)  # doctest: +IGNORE_RESULT
>>> tf_model.save_pretrained(tf_save_directory)
```

When you're ready to use the model again, reload it with [`TFPreTrainedModel.from_pretrained`]:

```py
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained("./tf_save_pretrained")
```

One particularly cool 🤗 Transformers feature is the ability to save a model and reload it as either a PyTorch or a TensorFlow model. The `from_pt` or `from_tf` parameter converts the model from one framework to the other:

```py
>>> from transformers import AutoModel

>>> tokenizer = AutoTokenizer.from_pretrained(tf_save_directory)
>>> pt_model = AutoModelForSequenceClassification.from_pretrained(tf_save_directory, from_tf=True)
```

```py
>>> from transformers import TFAutoModel

>>> tokenizer = AutoTokenizer.from_pretrained(pt_save_directory)
>>> tf_model = TFAutoModelForSequenceClassification.from_pretrained(pt_save_directory, from_pt=True)
```

## Custom model builds[[custom-model-builds]]

You can modify the model's configuration class to change how a model is built. The configuration specifies a model's attributes, such as the number of hidden layers or attention heads. You start from scratch when you build a model from a custom configuration class.
๋ชจ๋ธ ์†์„ฑ์€ ๋žœ๋คํ•˜๊ฒŒ ์ดˆ๊ธฐํ™”๋˜๋ฏ€๋กœ ์˜๋ฏธ ์žˆ๋Š” ๊ฒฐ๊ณผ๋ฅผ ์–ป์œผ๋ ค๋ฉด ๋จผ์ € ๋ชจ๋ธ์„ ํ›ˆ๋ จ์‹œํ‚ฌ ํ•„์š”๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ๋จผ์ € [`AutoConfig`]๋ฅผ ์ž„ํฌํŠธํ•˜๊ณ , ์ˆ˜์ •ํ•˜๊ณ  ์‹ถ์€ ์‚ฌ์ „ํ•™์Šต๋œ ๋ชจ๋ธ์„ ๋ถˆ๋Ÿฌ์˜ค์„ธ์š”. [`AutoConfig.from_pretrained`]์—์„œ ์–ดํ…์…˜ ํ—ค๋“œ ์ˆ˜ ๊ฐ™์€ ์†์„ฑ์„ ๋ณ€๊ฒฝํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ```py >>> from transformers import AutoConfig >>> my_config = AutoConfig.from_pretrained("distilbert-base-uncased", n_heads=12) ``` [`AutoModel.from_config`]๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ปค์Šคํ…€ ๊ตฌ์„ฑ๋Œ€๋กœ ๋ชจ๋ธ์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. ```py >>> from transformers import AutoModel >>> my_model = AutoModel.from_config(my_config) ``` [`TFAutoModel.from_config`]๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ปค์Šคํ…€ ๊ตฌ์„ฑ๋Œ€๋กœ ๋ชจ๋ธ์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค. ```py >>> from transformers import TFAutoModel >>> my_model = TFAutoModel.from_config(my_config) ``` ์ปค์Šคํ…€ ๊ตฌ์„ฑ์„ ์ž‘์„ฑํ•˜๋Š” ๋ฐฉ๋ฒ•์— ๋Œ€ํ•œ ์ž์„ธํ•œ ๋‚ด์šฉ์€ [์ปค์Šคํ…€ ์•„ํ‚คํ…์ฒ˜ ๋งŒ๋“ค๊ธฐ](./create_a_model) ๊ฐ€์ด๋“œ๋ฅผ ์ฐธ๊ณ ํ•˜์„ธ์š”. ## Trainer - PyTorch์— ์ตœ์ ํ™”๋œ ํ›ˆ๋ จ ๋ฐ˜๋ณต ๋ฃจํ”„[[trainer-a-pytorch-optimized-training-loop]] ๋ชจ๋“  ๋ชจ๋ธ์€ [`torch.nn.Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)์ด์–ด์„œ ๋Œ€๋‹ค์ˆ˜์˜ ํ›ˆ๋ จ ๋ฐ˜๋ณต ๋ฃจํ”„์— ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์‚ฌ์šฉ์ž๊ฐ€ ์ง์ ‘ ํ›ˆ๋ จ ๋ฐ˜๋ณต ๋ฃจํ”„๋ฅผ ์ž‘์„ฑํ•ด๋„ ๋˜์ง€๋งŒ, ๐Ÿค— Transformers๋Š” PyTorch์šฉ [`Trainer`] ํด๋ž˜์Šค๋ฅผ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ์ ์ธ ํ›ˆ๋ จ ๋ฐ˜ํญ ๋ฃจํ”„๊ฐ€ ํฌํ•จ๋˜์–ด ์žˆ๊ณ , ๋ถ„์‚ฐ ํ›ˆ๋ จ์ด๋‚˜ ํ˜ผํ•ฉ ์ •๋ฐ€๋„ ๋“ฑ์˜ ์ถ”๊ฐ€ ๊ธฐ๋Šฅ๋„ ์žˆ์Šต๋‹ˆ๋‹ค. ํƒœ์Šคํฌ์— ๋”ฐ๋ผ ๋‹ค๋ฅด์ง€๋งŒ, ์ผ๋ฐ˜์ ์œผ๋กœ ๋‹ค์Œ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ [`Trainer`]์— ์ „๋‹ฌํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค. 1. [`PreTrainedModel`] ๋˜๋Š” [`torch.nn.Module`](https://pytorch.org/docs/stable/nn.html#torch.nn.Module)๋กœ ์‹œ์ž‘ํ•ฉ๋‹ˆ๋‹ค. ```py >>> from transformers import AutoModelForSequenceClassification >>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased") ``` 2. [`TrainingArguments`]๋กœ ํ•™์Šต๋ฅ , ๋ฐฐ์น˜ ํฌ๊ธฐ๋‚˜ ํ›ˆ๋ จํ•  epoch ์ˆ˜์™€ ๊ฐ™์ด ๋ชจ๋ธ์˜ ํ•˜์ดํผํŒŒ๋ผ๋ฏธํ„ฐ๋ฅผ ์กฐ์ •ํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ ํ›ˆ๋ จ ์ธ์ˆ˜๋ฅผ ์ „ํ˜€ ์ง€์ •ํ•˜์ง€ ์•Š์€ ๊ฒฝ์šฐ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. ```py >>> from transformers import TrainingArguments >>> training_args = TrainingArguments( ... output_dir="path/to/save/folder/", ... learning_rate=2e-5, ... per_device_train_batch_size=8, ... per_device_eval_batch_size=8, ... num_train_epochs=2, ... ) ``` 3. ํ† ํฌ๋‚˜์ด์ €, ํŠน์ง•์ถ”์ถœ๊ธฐ(feature extractor), ์ „์ฒ˜๋ฆฌ๊ธฐ(processor) ํด๋ž˜์Šค ๋“ฑ์œผ๋กœ ์ „์ฒ˜๋ฆฌ๋ฅผ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค. ```py >>> from transformers import AutoTokenizer >>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased") ``` 4. ๋ฐ์ดํ„ฐ์…‹๋ฅผ ์ ์žฌํ•ฉ๋‹ˆ๋‹ค. ```py >>> from datasets import load_dataset >>> dataset = load_dataset("rotten_tomatoes") # doctest: +IGNORE_RESULT ``` 5. ๋ฐ์ดํ„ฐ์…‹์„ ํ† ํฐํ™”ํ•˜๋Š” ํ•จ์ˆ˜๋ฅผ ๋งŒ๋“ค๊ณ  [`~datasets.Dataset.map`]์œผ๋กœ ์ „์ฒด ๋ฐ์ดํ„ฐ์…‹์— ์ ์šฉ์‹œํ‚ต๋‹ˆ๋‹ค. ```py >>> def tokenize_dataset(dataset): ... return tokenizer(dataset["text"]) >>> dataset = dataset.map(tokenize_dataset, batched=True) ``` 6. [`DataCollatorWithPadding`]๋กœ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ๋ถ€ํ„ฐ ํ‘œ๋ณธ์œผ๋กœ ์‚ผ์„ ๋ฐฐ์น˜๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค. ```py >>> from transformers import DataCollatorWithPadding >>> data_collator = DataCollatorWithPadding(tokenizer=tokenizer) ``` ์ด์ œ ์œ„์˜ ๋ชจ๋“  ํด๋ž˜์Šค๋ฅผ [`Trainer`]๋กœ ๋ชจ์œผ์„ธ์š”. ```py >>> from transformers import Trainer >>> trainer = Trainer( ... model=model, ... args=training_args, ... 
...     train_dataset=dataset["train"],
...     eval_dataset=dataset["test"],
...     tokenizer=tokenizer,
...     data_collator=data_collator,
... )  # doctest: +SKIP
```

When you're ready, call [`~Trainer.train`] to start training:

```py
>>> trainer.train()  # doctest: +SKIP
```

For tasks that use a sequence-to-sequence model (such as translation or summarization), use the [`Seq2SeqTrainer`] and [`Seq2SeqTrainingArguments`] classes instead.

You can customize the training loop behavior by subclassing the methods inside [`Trainer`]. This lets you customize features such as the loss function, optimizer, and scheduler. Take a look at the [`Trainer`] reference to see which methods can be subclassed.

The other way to customize the training loop is with [Callbacks](./main_classes/callbacks). You can use callbacks to integrate with other libraries and to inspect the training loop to report on progress or stop training early. Callbacks do not modify anything in the training loop itself; to customize something like the loss function, you need to subclass [`Trainer`] instead.
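As an illustration of the callback hook, here is a minimal sketch of a custom callback. `PrintLossCallback` is a hypothetical name; the `on_log` hook signature follows [`TrainerCallback`]:

```py
>>> from transformers import TrainerCallback

>>> class PrintLossCallback(TrainerCallback):
...     def on_log(self, args, state, control, logs=None, **kwargs):
...         # runs whenever the Trainer logs metrics; reports progress
...         # without modifying the training loop itself
...         if logs is not None and "loss" in logs:
...             print(f"step {state.global_step}: loss = {logs['loss']:.4f}")

>>> trainer = Trainer(
...     model=model,
...     args=training_args,
...     train_dataset=dataset["train"],
...     callbacks=[PrintLossCallback()],
... )  # doctest: +SKIP
```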
## Train with TensorFlow[[train-with-tensorflow]]

All models are a standard [`tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model), so they can be trained in TensorFlow with the [Keras](https://keras.io/) API. 🤗 Transformers provides the [`~TFPreTrainedModel.prepare_tf_dataset`] method to easily load your dataset as a `tf.data.Dataset`, so you can start training right away with Keras' [`compile`](https://keras.io/api/models/model_training_apis/#compile-method) and [`fit`](https://www.tensorflow.org/api_docs/python/tf/keras/Model) methods.

1. Start with a [`TFPreTrainedModel`] or a [`tf.keras.Model`](https://www.tensorflow.org/api_docs/python/tf/keras/Model):

   ```py
   >>> from transformers import TFAutoModelForSequenceClassification

   >>> model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
   ```

2. Load a preprocessing class such as a tokenizer, feature extractor, or processor:

   ```py
   >>> from transformers import AutoTokenizer

   >>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
   ```

3. Create a function to tokenize the dataset:

   ```py
   >>> def tokenize_dataset(dataset):
   ...     return tokenizer(dataset["text"])  # doctest: +SKIP
   ```

4. Apply the function over the entire dataset with [`~datasets.Dataset.map`], then pass the dataset and tokenizer to [`~TFPreTrainedModel.prepare_tf_dataset`]. You can also change the batch size and shuffle the dataset here if you'd like:

   ```py
   >>> dataset = dataset.map(tokenize_dataset)  # doctest: +SKIP
   >>> tf_dataset = model.prepare_tf_dataset(
   ...     dataset, batch_size=16, shuffle=True, tokenizer=tokenizer
   ... )  # doctest: +SKIP
   ```

5. When you're ready, call `compile` and `fit` to start training:

   ```py
   >>> from tensorflow.keras.optimizers import Adam

   >>> model.compile(optimizer=Adam(3e-5))
   >>> model.fit(tf_dataset)  # doctest: +SKIP
   ```

## What's next?[[whats-next]]

Now that you've completed the 🤗 Transformers quick tour, check out the guides to learn how to do more specific things, such as writing a custom model, finetuning a model for a task, and training a model with a script. If you're interested in learning more about 🤗 Transformers core concepts, grab a cup of coffee and take a look at the Conceptual guides!