diff --git "a/fine-tune-whisper-non-streaming.ipynb" "b/fine-tune-whisper-non-streaming.ipynb" new file mode 100644--- /dev/null +++ "b/fine-tune-whisper-non-streaming.ipynb" @@ -0,0 +1,4643 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "75b58048-7d14-4fc6-8085-1fc08c81b4a6", + "metadata": { + "id": "75b58048-7d14-4fc6-8085-1fc08c81b4a6" + }, + "source": [ + "# Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers" + ] + }, + { + "cell_type": "markdown", + "id": "fbfa8ad5-4cdc-4512-9058-836cbbf65e1a", + "metadata": { + "id": "fbfa8ad5-4cdc-4512-9058-836cbbf65e1a" + }, + "source": [ + "In this Colab, we present a step-by-step guide on how to fine-tune Whisper \n", + "for any multilingual ASR dataset using Hugging Face 🤗 Transformers. This is a \n", + "more \"hands-on\" version of the accompanying [blog post](https://huggingface.co/blog/fine-tune-whisper). \n", + "For a more in-depth explanation of Whisper, the Common Voice dataset and the theory behind fine-tuning, the reader is advised to refer to the blog post." + ] + }, + { + "cell_type": "markdown", + "id": "afe0d503-ae4e-4aa7-9af4-dbcba52db41e", + "metadata": { + "id": "afe0d503-ae4e-4aa7-9af4-dbcba52db41e" + }, + "source": [ + "## Introduction" + ] + }, + { + "cell_type": "markdown", + "id": "9ae91ed4-9c3e-4ade-938e-f4c2dcfbfdc0", + "metadata": { + "id": "9ae91ed4-9c3e-4ade-938e-f4c2dcfbfdc0" + }, + "source": [ + "Whisper is a pre-trained model for automatic speech recognition (ASR) \n", + "published in [September 2022](https://openai.com/blog/whisper/) by the authors \n", + "Alec Radford et al. from OpenAI. Unlike many of its predecessors, such as \n", + "[Wav2Vec 2.0](https://arxiv.org/abs/2006.11477), which are pre-trained \n", + "on un-labelled audio data, Whisper is pre-trained on a vast quantity of \n", + "**labelled** audio-transcription data, 680,000 hours to be precise. 
\n", + "This is an order of magnitude more data than the un-labelled audio data used \n", + "to train Wav2Vec 2.0 (60,000 hours). What is more, 117,000 hours of this \n", + "pre-training data is multilingual ASR data. This results in checkpoints \n", + "that can be applied to over 96 languages, many of which are considered \n", + "_low-resource_.\n", + "\n", + "When scaled to 680,000 hours of labelled pre-training data, Whisper models \n", + "demonstrate a strong ability to generalise to many datasets and domains.\n", + "The pre-trained checkpoints achieve results competitive with state-of-the-art \n", + "ASR systems, with near 3% word error rate (WER) on the test-clean subset of \n", + "LibriSpeech ASR and a new state-of-the-art on TED-LIUM with 4.7% WER (_cf._ \n", + "Table 8 of the [Whisper paper](https://cdn.openai.com/papers/whisper.pdf)).\n", + "The extensive multilingual ASR knowledge acquired by Whisper during pre-training \n", + "can be leveraged for other low-resource languages; through fine-tuning, the \n", + "pre-trained checkpoints can be adapted for specific datasets and languages \n", + "to further improve upon these results. We'll show just how Whisper can be fine-tuned \n", + "for low-resource languages in this Colab." + ] + }, + { + "cell_type": "markdown", + "id": "e59b91d6-be24-4b5e-bb38-4977ea143a72", + "metadata": { + "id": "e59b91d6-be24-4b5e-bb38-4977ea143a72" + }, + "source": [ + "<figure>
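\n", + "<img src=\"https://raw.githubusercontent.com/sanchit-gandhi/notebooks/main/whisper_architecture.svg\" alt=\"Whisper encoder-decoder architecture\" style=\"width:100%\">\n", + "<figcaption align=\"center\">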
Figure 1: Whisper model. The architecture \n", + "follows the standard Transformer-based encoder-decoder model. A \n", + "log-Mel spectrogram is input to the encoder. The last encoder \n", + "hidden states are input to the decoder via cross-attention mechanisms. The \n", + "decoder autoregressively predicts text tokens, jointly conditional on the \n", + "encoder hidden states and previously predicted tokens. Figure source: \n", + "OpenAI Whisper Blog.
\n", + "</figure>
" + ] + }, + { + "cell_type": "markdown", + "id": "21b6316e-8a55-4549-a154-66d3da2ab74a", + "metadata": { + "id": "21b6316e-8a55-4549-a154-66d3da2ab74a" + }, + "source": [ + "The Whisper checkpoints come in five configurations of varying model sizes.\n", + "The smallest four are trained on either English-only or multilingual data.\n", + "The largest checkpoint is multilingual only. All nine of the pre-trained checkpoints \n", + "are available on the [Hugging Face Hub](https://huggingface.co/models?search=openai/whisper). The \n", + "checkpoints are summarised in the following table with links to the models on the Hub:\n", + "\n", + "| Size | Layers | Width | Heads | Parameters | English-only | Multilingual |\n", + "|--------|--------|-------|-------|------------|------------------------------------------------------|---------------------------------------------------|\n", + "| tiny | 4 | 384 | 6 | 39 M | [✓](https://huggingface.co/openai/whisper-tiny.en) | [✓](https://huggingface.co/openai/whisper-tiny) |\n", + "| base | 6 | 512 | 8 | 74 M | [✓](https://huggingface.co/openai/whisper-base.en) | [✓](https://huggingface.co/openai/whisper-base) |\n", + "| small | 12 | 768 | 12 | 244 M | [✓](https://huggingface.co/openai/whisper-small.en) | [✓](https://huggingface.co/openai/whisper-small) |\n", + "| medium | 24 | 1024 | 16 | 769 M | [✓](https://huggingface.co/openai/whisper-medium.en) | [✓](https://huggingface.co/openai/whisper-medium) |\n", + "| large | 32 | 1280 | 20 | 1550 M | x | [✓](https://huggingface.co/openai/whisper-large) |\n", + "\n", + "For demonstration purposes, we'll fine-tune the multilingual version of the \n", + "[`\"large\"`](https://huggingface.co/openai/whisper-large) checkpoint with 1550M params. \n", + "As for our data, we'll train and evaluate our system on a low-resource language \n", + "taken from the [Common Voice](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0)\n", + "dataset. 
We'll show that with as little as 8 hours of fine-tuning data, we can achieve \n", + "strong performance in this language." + ] + }, + { + "cell_type": "markdown", + "id": "3a680dfc-cbba-4f6c-8a1f-e1a5ff3f123a", + "metadata": { + "id": "3a680dfc-cbba-4f6c-8a1f-e1a5ff3f123a" + }, + "source": [ + "------------------------------------------------------------------------\n", + "\n", + "\\\\({}^1\\\\) The name Whisper follows from the acronym “WSPSR”, which stands for “Web-scale Supervised Pre-training for Speech Recognition”." + ] + }, + { + "cell_type": "markdown", + "id": "b219c9dd-39b6-4a95-b2a1-3f547a1e7bc0", + "metadata": { + "id": "b219c9dd-39b6-4a95-b2a1-3f547a1e7bc0" + }, + "source": [ + "## Load Dataset" + ] + }, + { + "cell_type": "markdown", + "id": "674429c5-0ab4-4adf-975b-621bb69eca38", + "metadata": { + "id": "674429c5-0ab4-4adf-975b-621bb69eca38" + }, + "source": [ + "Using 🤗 Datasets, downloading and preparing data is extremely simple. \n", + "We can download and prepare the Common Voice splits in just one line of code. \n", + "\n", + "First, ensure you have accepted the terms of use on the Hugging Face Hub: [mozilla-foundation/common_voice_11_0](https://huggingface.co/datasets/mozilla-foundation/common_voice_11_0). Once you have accepted the terms, you will have full access to the dataset and be able to download the data locally.\n", + "\n", + "Since Hindi is very low-resource, we'll combine the `train` and `validation` \n", + "splits to give approximately 8 hours of training data. 
We'll use the 4 hours \n", + "of `test` data as our held-out test set:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "a2787582-554f-44ce-9f38-4180a5ed6b44", + "metadata": { + "id": "a2787582-554f-44ce-9f38-4180a5ed6b44" + }, + "outputs": [], + "source": [ + "# from datasets import load_dataset, DatasetDict\n", + "\n", + "# common_voice = DatasetDict()\n", + "\n", + "# common_voice[\"train\"] = load_dataset(\"mozilla-foundation/common_voice_11_0\", \"fi\", split=\"train+validation\", use_auth_token=True)\n", + "# common_voice[\"test\"] = load_dataset(\"mozilla-foundation/common_voice_11_0\", \"fi\", split=\"test\", use_auth_token=True)\n", + "\n", + "# print(common_voice)" + ] + }, + { + "cell_type": "markdown", + "id": "aaeb4d94-56de-4e31-8630-f7e87e1affeb", + "metadata": {}, + "source": [ + "Alternatively, we can load and interleave several Finnish datasets (Common Voice, VoxPopuli and FLEURS) in streaming mode:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "3be04966-dfa7-4667-88cd-3c36081c9e5b", + "metadata": {}, + "outputs": [], + "source": [ + "dataset_names = [\"mozilla-foundation/common_voice_11_0\", \"facebook/voxpopuli\", \"google/fleurs\"]\n", + "dataset_config_names = [\"fi\", \"fi\", \"fi_fi\"]\n", + "text_column_names = [\"sentence\", \"normalized_text\", \"raw_transcription\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "5e93c904-d6f6-42fa-9720-f19c58c45d8e", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/ubuntu/.local/lib/python3.8/site-packages/pandas/core/computation/expressions.py:20: UserWarning: Pandas requires version '2.7.3' or newer of 'numexpr' (version '2.7.1' currently installed).\n", + " from pandas.core.computation.check import NUMEXPR_INSTALLED\n" + ] + } + ], + "source": [ + "from datasets import Audio, interleave_datasets, IterableDataset, load_dataset\n", + "from typing import List, Optional\n", + "\n", + "def load_multiple_streaming_datasets(\n", + "    dataset_names: List,\n", + "    dataset_config_names: 
List,\n", + "    splits: Optional[List] = None,\n", + "    text_column_names: Optional[List] = None,\n", + "    sampling_rate: Optional[int] = 16000,\n", + "    stopping_strategy: Optional[str] = \"all_exhausted\",\n", + "    **kwargs\n", + ") -> IterableDataset:\n", + "\n", + "    if len(dataset_names) != len(dataset_config_names):\n", + "        raise ValueError(\n", + "            f\"Ensure one config is passed for each dataset, got {len(dataset_names)} datasets and\"\n", + "            f\" {len(dataset_config_names)} configs.\"\n", + "        )\n", + "\n", + "    if splits is not None and len(splits) != len(dataset_names):\n", + "        raise ValueError(\n", + "            f\"Ensure one split is passed for each dataset, got {len(dataset_names)} datasets and {len(splits)} splits.\"\n", + "        )\n", + "\n", + "    if text_column_names is not None and len(text_column_names) != len(dataset_names):\n", + "        raise ValueError(\n", + "            f\"Ensure one text column name is passed for each dataset, got {len(dataset_names)} datasets and\"\n", + "            f\" {len(text_column_names)} text column names.\"\n", + "        )\n", + "\n", + "    splits = splits if splits is not None else [\"train\" for i in range(len(dataset_names))]\n", + "    text_column_names = (\n", + "        text_column_names if text_column_names is not None else [\"text\" for i in range(len(dataset_names))]\n", + "    )\n", + "\n", + "    all_datasets = []\n", + "    # iterate over the datasets we want to interleave\n", + "    for i, dataset_name in enumerate(dataset_names):\n", + "        dataset = load_dataset(dataset_name, dataset_config_names[i], split=splits[i], streaming=True, **kwargs)\n", + "        # resample to specified sampling rate\n", + "        dataset = dataset.cast_column(\"audio\", Audio(sampling_rate))\n", + "        # normalise columns to [\"audio\", \"sentence\"]\n", + "        if text_column_names[i] != \"sentence\":\n", + "            dataset = dataset.rename_column(text_column_names[i], \"sentence\")\n", + "        dataset = dataset.remove_columns(set(dataset.features.keys()) - set([\"audio\", \"sentence\"]))\n", + "        all_datasets.append(dataset)\n", + 
"\n", + "    interleaved_dataset = interleave_datasets(all_datasets, stopping_strategy=stopping_strategy)\n", + "    return interleaved_dataset\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "cd0f1d04-5bad-47d5-988b-cbcd1160e645", + "metadata": {}, + "outputs": [], + "source": [ + "from transformers.models.whisper.english_normalizer import BasicTextNormalizer\n", + "\n", + "ds = load_multiple_streaming_datasets(dataset_names, dataset_config_names=dataset_config_names, text_column_names=text_column_names, use_auth_token=True)\n", + "ds_eval = load_multiple_streaming_datasets(dataset_names, dataset_config_names=dataset_config_names, text_column_names=text_column_names, splits=['test', 'test', 'test'], use_auth_token=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "5ca18d20-5c36-42de-8f15-c0bbf8754c5d", + "metadata": {}, + "outputs": [], + "source": [ + "# for i, sample in enumerate(ds):\n", + "#     print(i, sample[\"sentence\"])\n", + "#     if i == 1:\n", + "#         break" + ] + }, + { + "cell_type": "markdown", + "id": "2d63b2d2-f68a-4d74-b7f1-5127f6d16605", + "metadata": { + "id": "2d63b2d2-f68a-4d74-b7f1-5127f6d16605" + }, + "source": [ + "## Prepare Feature Extractor, Tokenizer and Data" + ] + }, + { + "cell_type": "markdown", + "id": "601c3099-1026-439e-93e2-5635b3ba5a73", + "metadata": { + "id": "601c3099-1026-439e-93e2-5635b3ba5a73" + }, + "source": [ + "The ASR pipeline can be decomposed into three stages: \n", + "1) A feature extractor which pre-processes the raw audio inputs\n", + "2) The model which performs the sequence-to-sequence mapping \n", + "3) A tokenizer which post-processes the model outputs to text format\n", + "\n", + "In 🤗 Transformers, the Whisper model has an associated feature extractor and tokenizer, \n", + "called [WhisperFeatureExtractor](https://huggingface.co/docs/transformers/main/model_doc/whisper#transformers.WhisperFeatureExtractor)\n", + "and 
[WhisperTokenizer](https://huggingface.co/docs/transformers/main/model_doc/whisper#transformers.WhisperTokenizer) \n", + "respectively.\n", + "\n", + "We'll go through the details of setting up the feature extractor and tokenizer one-by-one!" + ] + }, + { + "cell_type": "markdown", + "id": "560332eb-3558-41a1-b500-e83a9f695f84", + "metadata": { + "id": "560332eb-3558-41a1-b500-e83a9f695f84" + }, + "source": [ + "### Load WhisperFeatureExtractor" + ] + }, + { + "cell_type": "markdown", + "id": "32ec8068-0bd7-412d-b662-0edb9d1e7365", + "metadata": { + "id": "32ec8068-0bd7-412d-b662-0edb9d1e7365" + }, + "source": [ + "The Whisper feature extractor performs two operations:\n", + "1. Pads / truncates the audio inputs to 30s: any audio inputs shorter than 30s are padded to 30s with silence (zeros), and those longer than 30s are truncated to 30s\n", + "2. Converts the audio inputs to _log-Mel spectrogram_ input features, a visual representation of the audio and the form of the input expected by the Whisper model" + ] + }, + { + "cell_type": "markdown", + "id": "589d9ec1-d12b-4b64-93f7-04c63997da19", + "metadata": { + "id": "589d9ec1-d12b-4b64-93f7-04c63997da19" + }, + "source": [ + "<figure>
\n", + "<img src=\"https://raw.githubusercontent.com/sanchit-gandhi/notebooks/main/spectrogram.jpg\" alt=\"Audio waveform and corresponding log-Mel spectrogram\" style=\"width:100%\">\n", + "<figcaption align=\"center\">
<b>Figure 2:</b> Conversion of sampled audio array to log-Mel spectrogram.\n", + "Left: sampled 1-dimensional audio signal. Right: corresponding log-Mel spectrogram. Figure source:\n", + "<a href=\"https://ai.googleblog.com/2019/04/specaugment-new-data-augmentation.html\">Google SpecAugment Blog</a>.\n", + "</figcaption></figure>
" + ] + }, + { + "cell_type": "markdown", + "id": "b2ef54d5-b946-4c1d-9fdc-adc5d01b46aa", + "metadata": { + "id": "b2ef54d5-b946-4c1d-9fdc-adc5d01b46aa" + }, + "source": [ + "We'll load the feature extractor from the pre-trained checkpoint with the default values:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "bc77d7bb-f9e2-47f5-b663-30f7a4321ce5", + "metadata": { + "id": "bc77d7bb-f9e2-47f5-b663-30f7a4321ce5" + }, + "outputs": [], + "source": [ + "from transformers import WhisperFeatureExtractor\n", + "\n", + "feature_extractor = WhisperFeatureExtractor.from_pretrained(\"openai/whisper-large\")" + ] + }, + { + "cell_type": "markdown", + "id": "93748af7-b917-4ecf-a0c8-7d89077ff9cb", + "metadata": { + "id": "93748af7-b917-4ecf-a0c8-7d89077ff9cb" + }, + "source": [ + "### Load WhisperTokenizer" + ] + }, + { + "cell_type": "markdown", + "id": "2bc82609-a9fb-447a-a2af-99597c864029", + "metadata": { + "id": "2bc82609-a9fb-447a-a2af-99597c864029" + }, + "source": [ + "The Whisper model outputs a sequence of _token ids_. The tokenizer maps each of these token ids to its corresponding text string. For Finnish, we can load the pre-trained tokenizer and use it for fine-tuning without any further modifications. We simply have to \n", + "specify the target language and the task. 
These arguments instruct the \n", + "tokenizer to prefix the language and task tokens to the start of encoded \n", + "label sequences:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "c7b07f9b-ae0e-4f89-98f0-0c50d432eab6", + "metadata": { + "id": "c7b07f9b-ae0e-4f89-98f0-0c50d432eab6", + "outputId": "5c004b44-86e7-4e00-88be-39e0af5eed69" + }, + "outputs": [], + "source": [ + "from transformers import WhisperTokenizer\n", + "\n", + "tokenizer = WhisperTokenizer.from_pretrained(\"openai/whisper-large\", language=\"Finnish\", task=\"transcribe\")" + ] + }, + { + "cell_type": "markdown", + "id": "d2ef23f3-f4a8-483a-a2dc-080a7496cb1b", + "metadata": { + "id": "d2ef23f3-f4a8-483a-a2dc-080a7496cb1b" + }, + "source": [ + "### Combine To Create A WhisperProcessor" + ] + }, + { + "cell_type": "markdown", + "id": "5ff67654-5a29-4bb8-a69d-0228946c6f8d", + "metadata": { + "id": "5ff67654-5a29-4bb8-a69d-0228946c6f8d" + }, + "source": [ + "To simplify using the feature extractor and tokenizer, we can _wrap_ \n", + "both into a single `WhisperProcessor` class. This processor object \n", + "exposes both the `WhisperFeatureExtractor` and the `WhisperTokenizer`, \n", + "and can be used on the audio inputs and model predictions as required. 
\n", + "In doing so, we only need to keep track of two objects during training: \n", + "the `processor` and the `model`:" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "77d9f0c5-8607-4642-a8ac-c3ab2e223ea6", + "metadata": { + "id": "77d9f0c5-8607-4642-a8ac-c3ab2e223ea6" + }, + "outputs": [], + "source": [ + "from transformers import WhisperProcessor\n", + "\n", + "processor = WhisperProcessor.from_pretrained(\"openai/whisper-large\", language=\"Finnish\", task=\"transcribe\")" + ] + }, + { + "cell_type": "markdown", + "id": "381acd09-0b0f-4d04-9eb3-f028ac0e5f2c", + "metadata": { + "id": "381acd09-0b0f-4d04-9eb3-f028ac0e5f2c" + }, + "source": [ + "### Prepare Data" + ] + }, + { + "cell_type": "markdown", + "id": "89e12c2e-2f14-479b-987b-f0c75c881095", + "metadata": {}, + "source": [ + "Now we can write a function to prepare our data ready for the model:\n", + "1. We load and resample the audio data by calling `batch[\"audio\"]`. As explained above, 🤗 Datasets performs any necessary resampling operations on the fly.\n", + "2. We use the feature extractor to compute the log-Mel spectrogram input features from our 1-dimensional audio array.\n", + "3. We perform any optional pre-processing (lower-case or remove punctuation).\n", + "4. We encode the transcriptions to label ids through the use of the tokenizer." 
 + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "c085911c-a10a-41ef-8874-306e0503e9bb", + "metadata": {}, + "outputs": [], + "source": [ + "def preprocess_data(batch, normalizer, lower=False, punctuation=False):\n", + "    audio = batch[\"audio\"]\n", + "    # compute log-Mel input features from input audio array \n", + "    batch[\"input_features\"] = processor.feature_extractor(audio[\"array\"], sampling_rate=audio[\"sampling_rate\"]).input_features[0]\n", + "    # compute input length of audio sample in seconds\n", + "    batch[\"input_length\"] = len(audio[\"array\"]) / audio[\"sampling_rate\"]\n", + "    \n", + "    # optional pre-processing steps\n", + "    transcription = batch[\"sentence\"]\n", + "    if lower:\n", + "        transcription = transcription.lower()\n", + "    if punctuation:\n", + "        transcription = normalizer(transcription).strip()\n", + "    \n", + "    # encode target text to label ids\n", + "    batch[\"labels\"] = processor.tokenizer(transcription).input_ids\n", + "    \n", + "    return batch" + ] + }, + { + "cell_type": "markdown", + "id": "8c960965-9fb6-466f-9dbd-c9d43e71d9d0", + "metadata": { + "id": "70b319fb-2439-4ef6-a70d-a47bf41c4a13" + }, + "source": [ + "We can apply the data preparation function to all of our examples using the dataset's `.map` method. For a regular (non-streaming) dataset, the argument `num_proc` specifies how many CPU cores to use: setting `num_proc` > 1 enables multiprocessing, and if the `.map` method hangs with multiprocessing, set `num_proc=1` to process the dataset sequentially. The datasets here are loaded in streaming mode, so `.map` is applied on the fly as samples are read and `num_proc` is not passed."
 + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "7b73ab39-ffaf-4b9e-86e5-782963c6134b", + "metadata": { + "id": "7b73ab39-ffaf-4b9e-86e5-782963c6134b" + }, + "outputs": [], + "source": [ + "normalizer = BasicTextNormalizer()\n", + "\n", + "ds = ds.map(lambda x: preprocess_data(x, normalizer, lower=True, punctuation=False))\n", + "ds_eval = ds_eval.map(lambda x: preprocess_data(x, normalizer, lower=True, punctuation=False))" + ] + }, + { + "cell_type": "markdown", + "id": "54ce0fdb-7218-4a4d-b175-383980fec0df", + "metadata": {}, + "source": [ + "Finally, we filter out any training samples with audio longer than 30s. These samples would otherwise be truncated by the Whisper feature extractor, which could affect the stability of training. We define a function that returns `True` for samples that are shorter than 30s, and `False` for those that are longer:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "01cb25ef-4bb0-4325-9461-f59198acadf6", + "metadata": {}, + "outputs": [], + "source": [ + "max_input_length = 30.0\n", + "\n", + "def is_audio_in_length_range(length):\n", + "    return length < max_input_length" + ] + }, + { + "cell_type": "markdown", + "id": "30e676a8-7ca8-4850-8c5d-5b2b00d13fba", + "metadata": {}, + "source": [ + "We apply our filter function to all samples of our training dataset through 🤗 Datasets' `.filter` method:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "333f7f6e-6053-4d3b-8924-c733c79b82ac", + "metadata": {}, + "outputs": [], + "source": [ + "ds = ds.filter(\n", + "    is_audio_in_length_range,\n", + "    input_columns=[\"input_length\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "263a5a58-0239-4a25-b0df-c625fc9c5810", + "metadata": { + "id": "263a5a58-0239-4a25-b0df-c625fc9c5810" + }, + "source": [ + "## Training and Evaluation" + ] + }, + { + "cell_type": "markdown", + "id": 
"a693e768-c5a6-453f-89a1-b601dcf7daf7", + "metadata": { + "id": "a693e768-c5a6-453f-89a1-b601dcf7daf7" + }, + "source": [ + "Now that we've prepared our data, we're ready to dive into the training pipeline. \n", + "The [🤗 Trainer](https://huggingface.co/transformers/master/main_classes/trainer.html?highlight=trainer)\n", + "will do much of the heavy lifting for us. All we have to do is:\n", + "\n", + "- Define a data collator: the data collator takes our pre-processed data and prepares PyTorch tensors ready for the model.\n", + "\n", + "- Evaluation metrics: during evaluation, we want to evaluate the model using the [word error rate (WER)](https://huggingface.co/metrics/wer) metric. We need to define a `compute_metrics` function that handles this computation.\n", + "\n", + "- Load a pre-trained checkpoint: we need to load a pre-trained checkpoint and configure it correctly for training.\n", + "\n", + "- Define the training configuration: this will be used by the 🤗 Trainer to define the training schedule.\n", + "\n", + "Once we've fine-tuned the model, we will evaluate it on the test data to verify that we have correctly trained it \n", + "to transcribe speech in Finnish."
+ ] + }, + { + "cell_type": "markdown", + "id": "8d230e6d-624c-400a-bbf5-fa660881df25", + "metadata": { + "id": "8d230e6d-624c-400a-bbf5-fa660881df25" + }, + "source": [ + "### Define a Data Collator" + ] + }, + { + "cell_type": "markdown", + "id": "04def221-0637-4a69-b242-d3f0c1d0ee78", + "metadata": { + "id": "04def221-0637-4a69-b242-d3f0c1d0ee78" + }, + "source": [ + "The data collator for a sequence-to-sequence speech model is unique in the sense that it \n", + "treats the `input_features` and `labels` independently: the `input_features` must be \n", + "handled by the feature extractor and the `labels` by the tokenizer.\n", + "\n", + "The `input_features` are already padded to 30s and converted to a log-Mel spectrogram \n", + "of fixed dimension by action of the feature extractor, so all we have to do is convert the `input_features`\n", + "to batched PyTorch tensors. We do this using the feature extractor's `.pad` method with `return_tensors=pt`.\n", + "\n", + "The `labels` on the other hand are un-padded. We first pad the sequences\n", + "to the maximum length in the batch using the tokenizer's `.pad` method. The padding tokens \n", + "are then replaced by `-100` so that these tokens are **not** taken into account when \n", + "computing the loss. 
We then cut the BOS token from the start of the label sequence as it is \n", + "added back later during training.\n", + "\n", + "We can leverage the `WhisperProcessor` we defined earlier to perform both the \n", + "feature extractor and the tokenizer operations:" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "8326221e-ec13-4731-bb4e-51e5fc1486c5", + "metadata": { + "id": "8326221e-ec13-4731-bb4e-51e5fc1486c5" + }, + "outputs": [], + "source": [ + "import torch\n", + "\n", + "from dataclasses import dataclass\n", + "from typing import Any, Dict, List, Union\n", + "\n", + "@dataclass\n", + "class DataCollatorSpeechSeq2SeqWithPadding:\n", + "    processor: Any\n", + "\n", + "    def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:\n", + "        # split inputs and labels since they have to be of different lengths and need different padding methods\n", + "        # first treat the audio inputs by simply returning torch tensors\n", + "\n", + "        input_features = [{\"input_features\": feature[\"input_features\"]} for feature in features]\n", + "        batch = self.processor.feature_extractor.pad(input_features, return_tensors=\"pt\")\n", + "\n", + "        # get the tokenized label sequences\n", + "        label_features = [{\"input_ids\": feature[\"labels\"]} for feature in features]\n", + "        # pad the labels to max length\n", + "        labels_batch = self.processor.tokenizer.pad(label_features, return_tensors=\"pt\")\n", + "\n", + "        # replace padding with -100 to ignore loss correctly\n", + "        labels = labels_batch[\"input_ids\"].masked_fill(labels_batch.attention_mask.ne(1), -100)\n", + "\n", + "        # if bos token is appended in previous tokenization step,\n", + "        # cut bos token here as it's added later anyway\n", + "        if (labels[:, 0] == self.processor.tokenizer.bos_token_id).all().cpu().item():\n", + "            labels = labels[:, 1:]\n", + "\n", + "        batch[\"labels\"] = labels\n", + "\n", + "        return batch" + ] + }, + { + "cell_type": "markdown", + "id": 
"3cae7dbf-8a50-456e-a3a8-7fd005390f86", + "metadata": { + "id": "3cae7dbf-8a50-456e-a3a8-7fd005390f86" + }, + "source": [ + "Let's initialise the data collator we've just defined:" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "fc834702-c0d3-4a96-b101-7b87be32bf42", + "metadata": { + "id": "fc834702-c0d3-4a96-b101-7b87be32bf42" + }, + "outputs": [], + "source": [ + "data_collator = DataCollatorSpeechSeq2SeqWithPadding(processor=processor)" + ] + }, + { + "cell_type": "markdown", + "id": "d62bb2ab-750a-45e7-82e9-61d6f4805698", + "metadata": { + "id": "d62bb2ab-750a-45e7-82e9-61d6f4805698" + }, + "source": [ + "### Evaluation Metrics" + ] + }, + { + "cell_type": "markdown", + "id": "66fee1a7-a44c-461e-b047-c3917221572e", + "metadata": { + "id": "66fee1a7-a44c-461e-b047-c3917221572e" + }, + "source": [ + "We'll use the word error rate (WER) metric, the 'de-facto' metric for assessing \n", + "ASR systems. For more information, refer to the WER [docs](https://huggingface.co/metrics/wer). We'll load the WER metric from 🤗 Evaluate:" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "b22b4011-f31f-4b57-b684-c52332f92890", + "metadata": { + "id": "b22b4011-f31f-4b57-b684-c52332f92890" + }, + "outputs": [], + "source": [ + "import evaluate\n", + "\n", + "metric = evaluate.load(\"wer\")" + ] + }, + { + "cell_type": "markdown", + "id": "4f32cab6-31f0-4cb9-af4c-40ba0f5fc508", + "metadata": { + "id": "4f32cab6-31f0-4cb9-af4c-40ba0f5fc508" + }, + "source": [ + "We then simply have to define a function that takes our model \n", + "predictions and returns the WER metric. This function, called\n", + "`compute_metrics`, first replaces `-100` with the `pad_token_id`\n", + "in the `label_ids` (undoing the step we applied in the \n", + "data collator to ignore padded tokens correctly in the loss).\n", + "It then decodes the predicted and label ids to strings. Finally,\n", + "it computes the WER between the predictions and reference labels. 
\n", + "Here, we have the option of evaluating with the 'normalised' transcriptions \n", + "and predictions. We recommend setting `do_normalize_eval` to `True` to benefit from the WER \n", + "improvement obtained by normalising the transcriptions." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "23959a70-22d0-4ffe-9fa1-72b61e75bb52", + "metadata": { + "id": "23959a70-22d0-4ffe-9fa1-72b61e75bb52" + }, + "outputs": [], + "source": [ + "# evaluate with the 'normalised' WER\n", + "do_normalize_eval = True\n", + "\n", + "def compute_metrics(pred):\n", + "    pred_ids = pred.predictions\n", + "    label_ids = pred.label_ids\n", + "\n", + "    # replace -100 with the pad_token_id\n", + "    label_ids[label_ids == -100] = processor.tokenizer.pad_token_id\n", + "\n", + "    # we do not want to group tokens when computing the metrics\n", + "    pred_str = processor.tokenizer.batch_decode(pred_ids, skip_special_tokens=True)\n", + "    label_str = processor.tokenizer.batch_decode(label_ids, skip_special_tokens=True)\n", + "\n", + "    if do_normalize_eval:\n", + "        pred_str = [normalizer(pred) for pred in pred_str]\n", + "        label_str = [normalizer(label) for label in label_str]\n", + "\n", + "    wer = 100 * metric.compute(predictions=pred_str, references=label_str)\n", + "\n", + "    return {\"wer\": wer}" + ] + }, + { + "cell_type": "markdown", + "id": "daf2a825-6d9f-4a23-b145-c37c0039075b", + "metadata": { + "id": "daf2a825-6d9f-4a23-b145-c37c0039075b" + }, + "source": [ + "### Load a Pre-Trained Checkpoint" + ] + }, + { + "cell_type": "markdown", + "id": "437a97fa-4864-476b-8abc-f28b8166cfa5", + "metadata": { + "id": "437a97fa-4864-476b-8abc-f28b8166cfa5" + }, + "source": [ + "Now let's load the pre-trained Whisper `large` checkpoint. Again, this \n", + "is trivial through use of 🤗 Transformers!"
+ ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "5a10cc4b-07ec-4ebd-ac1d-7c601023594f", + "metadata": { + "id": "5a10cc4b-07ec-4ebd-ac1d-7c601023594f" + }, + "outputs": [], + "source": [ + "from transformers import WhisperForConditionalGeneration\n", + "\n", + "model = WhisperForConditionalGeneration.from_pretrained(\"openai/whisper-large\")" + ] + }, + { + "cell_type": "markdown", + "id": "a15ead5f-2277-4a39-937b-585c2497b2df", + "metadata": { + "id": "a15ead5f-2277-4a39-937b-585c2497b2df" + }, + "source": [ + "Override generation arguments - no tokens are forced as decoder outputs (see [`forced_decoder_ids`](https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.generation_utils.GenerationMixin.generate.forced_decoder_ids)), no tokens are suppressed during generation (see [`suppress_tokens`](https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.generation_utils.GenerationMixin.generate.suppress_tokens)). Set `use_cache` to False since we're using gradient checkpointing, and the two are incompatible:" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "62038ba3-88ed-4fce-84db-338f50dcd04f", + "metadata": { + "id": "62038ba3-88ed-4fce-84db-338f50dcd04f" + }, + "outputs": [], + "source": [ + "model.config.forced_decoder_ids = None\n", + "model.config.suppress_tokens = []\n", + "model.config.use_cache = False" + ] + }, + { + "cell_type": "markdown", + "id": "2178dea4-80ca-47b6-b6ea-ba1915c90c06", + "metadata": { + "id": "2178dea4-80ca-47b6-b6ea-ba1915c90c06" + }, + "source": [ + "### Define the Training Configuration" + ] + }, + { + "cell_type": "markdown", + "id": "c21af1e9-0188-4134-ac82-defc7bdcc436", + "metadata": { + "id": "c21af1e9-0188-4134-ac82-defc7bdcc436" + }, + "source": [ + "In the final step, we define all the parameters related to training. 
For more detail on the training arguments, refer to the Seq2SeqTrainingArguments [docs](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.Seq2SeqTrainingArguments)." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "0ae3e9af-97b7-4aa0-ae85-20b23b5bcb3a", + "metadata": { + "id": "0ae3e9af-97b7-4aa0-ae85-20b23b5bcb3a" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2022-12-16 13:29:15,889] [INFO] [comm.py:654:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl\n" + ] + } + ], + "source": [ + "from transformers import Seq2SeqTrainingArguments\n", + "import os\n", + "\n", + "# ENABLE DEEPSPEED\n", + "os.environ[\"MASTER_ADDR\"] = \"localhost\"\n", + "os.environ[\"MASTER_PORT\"] = \"9994\" # modify if RuntimeError: Address already in use\n", + "os.environ[\"RANK\"] = \"0\"\n", + "os.environ[\"LOCAL_RANK\"] = \"0\"\n", + "os.environ[\"WORLD_SIZE\"] = \"1\"\n", + "\n", + "training_args = Seq2SeqTrainingArguments(\n", + " deepspeed=\"ds_config.json\",\n", + " output_dir=\"./\",\n", + " per_device_train_batch_size=32,\n", + " gradient_accumulation_steps=2, # increase by 2x for every 2x decrease in batch size\n", + " learning_rate=1e-5,\n", + " warmup_steps=100,\n", + " max_steps=1000,\n", + " gradient_checkpointing=True,\n", + " fp16=True,\n", + " evaluation_strategy=\"steps\",\n", + " per_device_eval_batch_size=8,\n", + " predict_with_generate=True,\n", + " generation_max_length=225,\n", + " save_steps=100,\n", + " eval_steps=100,\n", + " logging_steps=25,\n", + " report_to=[\"tensorboard\"],\n", + " load_best_model_at_end=True,\n", + " metric_for_best_model=\"wer\",\n", + " greater_is_better=False,\n", + " push_to_hub=True,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "b3a944d8-3112-4552-82a0-be25988b3857", + "metadata": { + "id": "b3a944d8-3112-4552-82a0-be25988b3857" + }, + "source": [ + "**Note**: if one does not want to upload the model checkpoints 
to the Hub, \n", + "set `push_to_hub=False`." + ] + }, + { + "cell_type": "markdown", + "id": "bac29114-d226-4f54-97cf-8718c9f94e1e", + "metadata": { + "id": "bac29114-d226-4f54-97cf-8718c9f94e1e" + }, + "source": [ + "We can forward the training arguments to the 🤗 Trainer along with our model,\n", + "dataset, data collator and `compute_metrics` function:" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "d546d7fe-0543-479a-b708-2ebabec19493", + "metadata": { + "id": "d546d7fe-0543-479a-b708-2ebabec19493" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/ubuntu/whisper-large-fi/./ is already a clone of https://huggingface.co/TeemuSo/whisper-large-fi. Make sure you pull the latest changes with `repo.git_pull()`.\n", + "max_steps is given, it will override any value given in num_train_epochs\n", + "Using cuda_amp half precision backend\n" + ] + } + ], + "source": [ + "from transformers import Seq2SeqTrainer\n", + "\n", + "trainer = Seq2SeqTrainer(\n", + " args=training_args,\n", + " model=model,\n", + " train_dataset=ds,\n", + " eval_dataset=ds_eval,\n", + " data_collator=data_collator,\n", + " compute_metrics=compute_metrics,\n", + " tokenizer=processor.feature_extractor,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "uOrRhDGtN5S4", + "metadata": { + "id": "uOrRhDGtN5S4" + }, + "source": [ + "We'll save the processor object once before starting training. 
Since the processor is not trainable, it won't change over the course of training:" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "-2zQwMfEOBJq", + "metadata": { + "id": "-2zQwMfEOBJq" + }, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Feature extractor saved in ./preprocessor_config.json\n", + "tokenizer config file saved in ./tokenizer_config.json\n", + "Special tokens file saved in ./special_tokens_map.json\n", + "added tokens file saved in ./added_tokens.json\n" + ] + } + ], + "source": [ + "processor.save_pretrained(training_args.output_dir)" + ] + }, + { + "cell_type": "markdown", + "id": "7f404cf9-4345-468c-8196-4bd101d9bd51", + "metadata": { + "id": "7f404cf9-4345-468c-8196-4bd101d9bd51" + }, + "source": [ + "### Training" + ] + }, + { + "cell_type": "markdown", + "id": "5e8b8d56-5a70-4f68-bd2e-f0752d0bd112", + "metadata": { + "id": "5e8b8d56-5a70-4f68-bd2e-f0752d0bd112" + }, + "source": [ + "Training will take approximately 5-10 hours depending on your GPU. The peak GPU memory for the given training configuration is approximately 36GB. \n", + "Depending on your GPU, it is possible that you will encounter a CUDA `\"out-of-memory\"` error when you launch training. 
\n", + "In this case, you can reduce the `per_device_train_batch_size` incrementally by factors of 2 \n", + "and employ [`gradient_accumulation_steps`](https://huggingface.co/docs/transformers/main_classes/trainer#transformers.Seq2SeqTrainingArguments.gradient_accumulation_steps)\n", + "to compensate.\n", + "\n", + "To launch training, simply execute:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "ee8b7b8e-1c9a-4d77-9137-1778a629e6de", + "metadata": { + "id": "ee8b7b8e-1c9a-4d77-9137-1778a629e6de" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2022-12-16 13:29:20,270] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed info: version=0.7.7, git-hash=unknown, git-branch=unknown\n", + "[2022-12-16 13:29:21,099] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Flops Profiler Enabled: False\n", + "[2022-12-16 13:29:22,285] [WARNING] [cpu_adam.py:83:__init__] FP16 params for CPUAdam may not work on AMD CPUs\n", + "Installed CUDA version 11.6 does not match the version torch was compiled with 11.7 but since the APIs are compatible, accepting this combination\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Using /home/ubuntu/.cache/torch_extensions/py38_cu117 as PyTorch extensions root...\n", + "Detected CUDA files, patching ldflags\n", + "Emitting ninja build file /home/ubuntu/.cache/torch_extensions/py38_cu117/cpu_adam/build.ninja...\n", + "Building extension module cpu_adam...\n", + "Allowing ninja to set a default number of workers... 
(overridable by setting the environment variable MAX_JOBS=N)\n", + "Loading extension module cpu_adam...\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Time to load cpu_adam op: 2.935727119445801 seconds\n", + "[2022-12-16 13:29:27,577] [INFO] [logging.py:68:log_dist] [Rank 0] Using DeepSpeed Optimizer param name adamw as basic optimizer\n", + "[2022-12-16 13:29:27,898] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Basic Optimizer = DeepSpeedCPUAdam\n", + "[2022-12-16 13:29:27,899] [INFO] [utils.py:52:is_zero_supported_optimizer] Checking ZeRO support for optimizer=DeepSpeedCPUAdam type=\n", + "[2022-12-16 13:29:27,899] [INFO] [logging.py:68:log_dist] [Rank 0] Creating fp16 ZeRO stage 2 optimizer\n", + "[2022-12-16 13:29:27,900] [INFO] [stage_1_and_2.py:140:__init__] Reduce bucket size 200000000\n", + "[2022-12-16 13:29:27,901] [INFO] [stage_1_and_2.py:141:__init__] Allgather bucket size 200000000\n", + "[2022-12-16 13:29:27,901] [INFO] [stage_1_and_2.py:142:__init__] CPU Offload: True\n", + "[2022-12-16 13:29:27,901] [INFO] [stage_1_and_2.py:143:__init__] Round robin gradient partitioning: False\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Using /home/ubuntu/.cache/torch_extensions/py38_cu117 as PyTorch extensions root...\n", + "Emitting ninja build file /home/ubuntu/.cache/torch_extensions/py38_cu117/utils/build.ninja...\n", + "Building extension module utils...\n", + "Allowing ninja to set a default number of workers... 
(overridable by setting the environment variable MAX_JOBS=N)\n", + "Loading extension module utils...\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Time to load utils op: 0.45751523971557617 seconds\n", + "Rank: 0 partition count [1] and sizes[(1543304960, False)] \n", + "[2022-12-16 13:29:31,546] [INFO] [utils.py:827:see_memory_usage] Before initializing optimizer states\n", + "[2022-12-16 13:29:31,548] [INFO] [utils.py:828:see_memory_usage] MA 3.0 GB Max_MA 3.0 GB CA 5.99 GB Max_CA 6 GB \n", + "[2022-12-16 13:29:31,549] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 15.24 GB, percent = 7.8%\n", + "[2022-12-16 13:29:35,444] [INFO] [utils.py:827:see_memory_usage] After initializing optimizer states\n", + "[2022-12-16 13:29:35,445] [INFO] [utils.py:828:see_memory_usage] MA 3.0 GB Max_MA 3.0 GB CA 5.99 GB Max_CA 6 GB \n", + "[2022-12-16 13:29:35,446] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 34.89 GB, percent = 17.7%\n", + "[2022-12-16 13:29:35,447] [INFO] [stage_1_and_2.py:525:__init__] optimizer state initialized\n", + "[2022-12-16 13:29:35,575] [INFO] [utils.py:827:see_memory_usage] After initializing ZeRO optimizer\n", + "[2022-12-16 13:29:35,576] [INFO] [utils.py:828:see_memory_usage] MA 3.0 GB Max_MA 3.0 GB CA 5.99 GB Max_CA 6 GB \n", + "[2022-12-16 13:29:35,577] [INFO] [utils.py:836:see_memory_usage] CPU Virtual Memory: used = 34.89 GB, percent = 17.7%\n", + "[2022-12-16 13:29:35,602] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed Final Optimizer = adamw\n", + "[2022-12-16 13:29:35,602] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed using configured LR scheduler = WarmupLR\n", + "[2022-12-16 13:29:35,603] [INFO] [logging.py:68:log_dist] [Rank 0] DeepSpeed LR Scheduler = \n", + "[2022-12-16 13:29:35,604] [INFO] [logging.py:68:log_dist] [Rank 0] step=0, skipped=0, lr=[1e-05], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:29:35,606] [INFO] [config.py:1020:print] DeepSpeedEngine 
configuration:\n", + "[2022-12-16 13:29:35,607] [INFO] [config.py:1024:print] activation_checkpointing_config {\n", + " \"partition_activations\": false, \n", + " \"contiguous_memory_optimization\": false, \n", + " \"cpu_checkpointing\": false, \n", + " \"number_checkpoints\": null, \n", + " \"synchronize_checkpoint_boundary\": false, \n", + " \"profile\": false\n", + "}\n", + "[2022-12-16 13:29:35,607] [INFO] [config.py:1024:print] aio_config ................... {'block_size': 1048576, 'queue_depth': 8, 'thread_count': 1, 'single_submit': False, 'overlap_events': True}\n", + "[2022-12-16 13:29:35,608] [INFO] [config.py:1024:print] amp_enabled .................. False\n", + "[2022-12-16 13:29:35,608] [INFO] [config.py:1024:print] amp_params ................... False\n", + "[2022-12-16 13:29:35,609] [INFO] [config.py:1024:print] autotuning_config ............ {\n", + " \"enabled\": false, \n", + " \"start_step\": null, \n", + " \"end_step\": null, \n", + " \"metric_path\": null, \n", + " \"arg_mappings\": null, \n", + " \"metric\": \"throughput\", \n", + " \"model_info\": null, \n", + " \"results_dir\": \"autotuning_results\", \n", + " \"exps_dir\": \"autotuning_exps\", \n", + " \"overwrite\": true, \n", + " \"fast\": true, \n", + " \"start_profile_step\": 3, \n", + " \"end_profile_step\": 5, \n", + " \"tuner_type\": \"gridsearch\", \n", + " \"tuner_early_stopping\": 5, \n", + " \"tuner_num_trials\": 50, \n", + " \"model_info_path\": null, \n", + " \"mp_size\": 1, \n", + " \"max_train_batch_size\": null, \n", + " \"min_train_batch_size\": 1, \n", + " \"max_train_micro_batch_size_per_gpu\": 1.024000e+03, \n", + " \"min_train_micro_batch_size_per_gpu\": 1, \n", + " \"num_tuning_micro_batch_sizes\": 3\n", + "}\n", + "[2022-12-16 13:29:35,609] [INFO] [config.py:1024:print] bfloat16_enabled ............. 
False\n", + "[2022-12-16 13:29:35,611] [INFO] [config.py:1024:print] checkpoint_parallel_write_pipeline False\n", + "[2022-12-16 13:29:35,611] [INFO] [config.py:1024:print] checkpoint_tag_validation_enabled True\n", + "[2022-12-16 13:29:35,611] [INFO] [config.py:1024:print] checkpoint_tag_validation_fail False\n", + "[2022-12-16 13:29:35,612] [INFO] [config.py:1024:print] comms_config ................. \n", + "[2022-12-16 13:29:35,612] [INFO] [config.py:1024:print] communication_data_type ...... None\n", + "[2022-12-16 13:29:35,613] [INFO] [config.py:1024:print] compression_config ........... {'weight_quantization': {'shared_parameters': {'enabled': False, 'quantizer_kernel': False, 'schedule_offset': 0, 'quantize_groups': 1, 'quantize_verbose': False, 'quantization_type': 'symmetric', 'quantize_weight_in_forward': False, 'rounding': 'nearest', 'fp16_mixed_quantize': False, 'quantize_change_ratio': 0.001}, 'different_groups': {}}, 'activation_quantization': {'shared_parameters': {'enabled': False, 'quantization_type': 'symmetric', 'range_calibration': 'dynamic', 'schedule_offset': 1000}, 'different_groups': {}}, 'sparse_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'row_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'head_pruning': {'shared_parameters': {'enabled': False, 'method': 'topk', 'schedule_offset': 1000}, 'different_groups': {}}, 'channel_pruning': {'shared_parameters': {'enabled': False, 'method': 'l1', 'schedule_offset': 1000}, 'different_groups': {}}, 'layer_reduction': {'enabled': False}}\n", + "[2022-12-16 13:29:35,613] [INFO] [config.py:1024:print] curriculum_enabled ........... False\n", + "[2022-12-16 13:29:35,613] [INFO] [config.py:1024:print] curriculum_params ............ False\n", + "[2022-12-16 13:29:35,614] [INFO] [config.py:1024:print] dataloader_drop_last ......... 
False\n", + "[2022-12-16 13:29:35,614] [INFO] [config.py:1024:print] disable_allgather ............ False\n", + "[2022-12-16 13:29:35,614] [INFO] [config.py:1024:print] dump_state ................... False\n", + "[2022-12-16 13:29:35,615] [INFO] [config.py:1024:print] dynamic_loss_scale_args ...... {'init_scale': 65536, 'scale_window': 1000, 'delayed_shift': 2, 'min_scale': 1}\n", + "[2022-12-16 13:29:35,615] [INFO] [config.py:1024:print] eigenvalue_enabled ........... False\n", + "[2022-12-16 13:29:35,616] [INFO] [config.py:1024:print] eigenvalue_gas_boundary_resolution 1\n", + "[2022-12-16 13:29:35,616] [INFO] [config.py:1024:print] eigenvalue_layer_name ........ bert.encoder.layer\n", + "[2022-12-16 13:29:35,616] [INFO] [config.py:1024:print] eigenvalue_layer_num ......... 0\n", + "[2022-12-16 13:29:35,619] [INFO] [config.py:1024:print] eigenvalue_max_iter .......... 100\n", + "[2022-12-16 13:29:35,619] [INFO] [config.py:1024:print] eigenvalue_stability ......... 1e-06\n", + "[2022-12-16 13:29:35,619] [INFO] [config.py:1024:print] eigenvalue_tol ............... 0.01\n", + "[2022-12-16 13:29:35,620] [INFO] [config.py:1024:print] eigenvalue_verbose ........... False\n", + "[2022-12-16 13:29:35,621] [INFO] [config.py:1024:print] elasticity_enabled ........... False\n", + "[2022-12-16 13:29:35,621] [INFO] [config.py:1024:print] flops_profiler_config ........ {\n", + " \"enabled\": false, \n", + " \"profile_step\": 1, \n", + " \"module_depth\": -1, \n", + " \"top_modules\": 1, \n", + " \"detailed\": true, \n", + " \"output_file\": null\n", + "}\n", + "[2022-12-16 13:29:35,622] [INFO] [config.py:1024:print] fp16_auto_cast ............... False\n", + "[2022-12-16 13:29:35,622] [INFO] [config.py:1024:print] fp16_enabled ................. True\n", + "[2022-12-16 13:29:35,622] [INFO] [config.py:1024:print] fp16_master_weights_and_gradients False\n", + "[2022-12-16 13:29:35,623] [INFO] [config.py:1024:print] global_rank .................. 
0\n", + "[2022-12-16 13:29:35,623] [INFO] [config.py:1024:print] grad_accum_dtype ............. None\n", + "[2022-12-16 13:29:35,623] [INFO] [config.py:1024:print] gradient_accumulation_steps .. 2\n", + "[2022-12-16 13:29:35,624] [INFO] [config.py:1024:print] gradient_clipping ............ 1.0\n", + "[2022-12-16 13:29:35,624] [INFO] [config.py:1024:print] gradient_predivide_factor .... 1.0\n", + "[2022-12-16 13:29:35,625] [INFO] [config.py:1024:print] initial_dynamic_scale ........ 65536\n", + "[2022-12-16 13:29:35,625] [INFO] [config.py:1024:print] load_universal_checkpoint .... False\n", + "[2022-12-16 13:29:35,625] [INFO] [config.py:1024:print] loss_scale ................... 0\n", + "[2022-12-16 13:29:35,626] [INFO] [config.py:1024:print] memory_breakdown ............. False\n", + "[2022-12-16 13:29:35,626] [INFO] [config.py:1024:print] monitor_config ............... \n", + "[2022-12-16 13:29:35,627] [INFO] [config.py:1024:print] nebula_config ................ {\n", + " \"enabled\": false, \n", + " \"persistent_storage_path\": null, \n", + " \"persistent_time_interval\": 100, \n", + " \"num_of_version_in_retention\": 2, \n", + " \"enable_nebula_load\": true, \n", + " \"load_path\": null\n", + "}\n", + "[2022-12-16 13:29:35,627] [INFO] [config.py:1024:print] optimizer_legacy_fusion ...... False\n", + "[2022-12-16 13:29:35,627] [INFO] [config.py:1024:print] optimizer_name ............... adamw\n", + "[2022-12-16 13:29:35,628] [INFO] [config.py:1024:print] optimizer_params ............. {'lr': 1e-05, 'betas': [0.9, 0.999], 'eps': 1e-08, 'weight_decay': 0.0}\n", + "[2022-12-16 13:29:35,628] [INFO] [config.py:1024:print] pipeline ..................... {'stages': 'auto', 'partition': 'best', 'seed_layers': False, 'activation_checkpoint_interval': 0}\n", + "[2022-12-16 13:29:35,628] [INFO] [config.py:1024:print] pld_enabled .................. False\n", + "[2022-12-16 13:29:35,629] [INFO] [config.py:1024:print] pld_params ................... 
False\n", + "[2022-12-16 13:29:35,629] [INFO] [config.py:1024:print] prescale_gradients ........... False\n", + "[2022-12-16 13:29:35,630] [INFO] [config.py:1024:print] scheduler_name ............... WarmupLR\n", + "[2022-12-16 13:29:35,630] [INFO] [config.py:1024:print] scheduler_params ............. {'warmup_min_lr': 0, 'warmup_max_lr': 1e-05, 'warmup_num_steps': 100}\n", + "[2022-12-16 13:29:35,630] [INFO] [config.py:1024:print] sparse_attention ............. None\n", + "[2022-12-16 13:29:35,631] [INFO] [config.py:1024:print] sparse_gradients_enabled ..... False\n", + "[2022-12-16 13:29:35,631] [INFO] [config.py:1024:print] steps_per_print .............. 10\n", + "[2022-12-16 13:29:35,631] [INFO] [config.py:1024:print] train_batch_size ............. 64\n", + "[2022-12-16 13:29:35,632] [INFO] [config.py:1024:print] train_micro_batch_size_per_gpu 32\n", + "[2022-12-16 13:29:35,632] [INFO] [config.py:1024:print] use_node_local_storage ....... False\n", + "[2022-12-16 13:29:35,633] [INFO] [config.py:1024:print] wall_clock_breakdown ......... False\n", + "[2022-12-16 13:29:35,633] [INFO] [config.py:1024:print] world_size ................... 1\n", + "[2022-12-16 13:29:35,633] [INFO] [config.py:1024:print] zero_allow_untested_optimizer False\n", + "[2022-12-16 13:29:35,634] [INFO] [config.py:1024:print] zero_config .................. 
stage=2 contiguous_gradients=True reduce_scatter=True reduce_bucket_size=200000000 allgather_partitions=True allgather_bucket_size=200000000 overlap_comm=True load_from_fp32_weights=True elastic_checkpoint=False offload_param=None offload_optimizer=DeepSpeedZeroOffloadOptimizerConfig(device='cpu', nvme_path=None, buffer_count=4, pin_memory=True, pipeline=False, pipeline_read=False, pipeline_write=False, fast_init=False) sub_group_size=1,000,000,000 cpu_offload_param=None cpu_offload_use_pin_memory=None cpu_offload=None prefetch_bucket_size=50,000,000 param_persistence_threshold=100,000 model_persistence_threshold=sys.maxsize max_live_parameters=1,000,000,000 max_reuse_distance=1,000,000,000 gather_16bit_weights_on_model_save=False stage3_gather_fp16_weights_on_model_save=False ignore_unused_parameters=True legacy_stage1=False round_robin_gradients=False\n", + "[2022-12-16 13:29:35,634] [INFO] [config.py:1024:print] zero_enabled ................. True\n", + "[2022-12-16 13:29:35,635] [INFO] [config.py:1024:print] zero_optimization_stage ...... 
2\n", + "[2022-12-16 13:29:35,635] [INFO] [config.py:1009:print_user_config] json = {\n", + " \"fp16\": {\n", + " \"enabled\": true, \n", + " \"loss_scale\": 0, \n", + " \"loss_scale_window\": 1000, \n", + " \"initial_scale_power\": 16, \n", + " \"hysteresis\": 2, \n", + " \"min_loss_scale\": 1\n", + " }, \n", + " \"optimizer\": {\n", + " \"type\": \"AdamW\", \n", + " \"params\": {\n", + " \"lr\": 1e-05, \n", + " \"betas\": [0.9, 0.999], \n", + " \"eps\": 1e-08, \n", + " \"weight_decay\": 0.0\n", + " }\n", + " }, \n", + " \"scheduler\": {\n", + " \"type\": \"WarmupLR\", \n", + " \"params\": {\n", + " \"warmup_min_lr\": 0, \n", + " \"warmup_max_lr\": 1e-05, \n", + " \"warmup_num_steps\": 100\n", + " }\n", + " }, \n", + " \"zero_optimization\": {\n", + " \"stage\": 2, \n", + " \"offload_optimizer\": {\n", + " \"device\": \"cpu\", \n", + " \"pin_memory\": true\n", + " }, \n", + " \"allgather_partitions\": true, \n", + " \"allgather_bucket_size\": 2.000000e+08, \n", + " \"overlap_comm\": true, \n", + " \"reduce_scatter\": true, \n", + " \"reduce_bucket_size\": 2.000000e+08, \n", + " \"contiguous_gradients\": true\n", + " }, \n", + " \"gradient_accumulation_steps\": 2, \n", + " \"gradient_clipping\": 1.0, \n", + " \"train_batch_size\": 64, \n", + " \"train_micro_batch_size_per_gpu\": 32\n", + "}\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Using /home/ubuntu/.cache/torch_extensions/py38_cu117 as PyTorch extensions root...\n", + "No modifications detected for re-loaded extension module utils, skipping build step...\n", + "Loading extension module utils...\n", + "***** Running training *****\n", + " Num examples = 64000\n", + " Num Epochs = 9223372036854775807\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Time to load utils op: 0.005568742752075195 seconds\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + " Instantaneous batch size per device = 32\n", + " Total train batch size (w. 
parallel, distributed & accumulation) = 64\n", + " Gradient Accumulation steps = 2\n", + " Total optimization steps = 1000\n", + " Number of trainable parameters = 1543304960\n", + "Reading metadata...: 2165it [00:00, 64424.75it/s]\n", + "The following columns in the training set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: sentence, audio, input_length. If sentence, audio, input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2022-12-16 13:30:12,371] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 65536, reducing to 65536\n" + ] + }, + { + "data": { + "text/html": [ + "\n", + "
\n", + " \n", + " \n", + " [ 101/1000 24:25 < 3:41:52, 0.07 it/s, Epoch 0.10/9223372036854775807]\n", + "
\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
StepTraining LossValidation Loss

" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2022-12-16 13:30:26,874] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 65536, reducing to 32768.0\n", + "[2022-12-16 13:30:26,878] [INFO] [timer.py:197:stop] 0/4, RunningAvgSamplesPerSec=6.359143267427371, CurrSamplesPerSec=6.05562698727593, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:30:41,688] [INFO] [stage_1_and_2.py:1765:step] [deepspeed] OVERFLOW! Rank 0 Skipping step. Attempted loss scale: 32768.0, reducing to 16384.0\n", + "[2022-12-16 13:30:41,692] [INFO] [timer.py:197:stop] 0/6, RunningAvgSamplesPerSec=6.442681205426282, CurrSamplesPerSec=6.185670974618687, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:30:56,690] [INFO] [timer.py:197:stop] 0/8, RunningAvgSamplesPerSec=6.250138482974161, CurrSamplesPerSec=5.239387620689085, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:31:11,390] [INFO] [timer.py:197:stop] 0/10, RunningAvgSamplesPerSec=6.199217766209486, CurrSamplesPerSec=5.469275533253204, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:31:25,760] [INFO] [timer.py:197:stop] 0/12, RunningAvgSamplesPerSec=6.187350055236473, CurrSamplesPerSec=5.465893137251788, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:31:40,116] [INFO] [timer.py:197:stop] 0/14, RunningAvgSamplesPerSec=6.182433423403349, CurrSamplesPerSec=5.578369458479616, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:31:54,845] [INFO] [timer.py:197:stop] 0/16, RunningAvgSamplesPerSec=6.168209488109452, CurrSamplesPerSec=5.481925126875305, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:32:09,837] [INFO] [timer.py:197:stop] 0/18, RunningAvgSamplesPerSec=6.166182771011467, CurrSamplesPerSec=5.576634147245411, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", 
+ "[2022-12-16 13:32:24,514] [INFO] [logging.py:68:log_dist] [Rank 0] step=10, skipped=3, lr=[4.225490200071284e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:32:24,516] [INFO] [timer.py:197:stop] 0/20, RunningAvgSamplesPerSec=6.14923830606697, CurrSamplesPerSec=5.4862895214102165, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:32:39,349] [INFO] [timer.py:197:stop] 0/22, RunningAvgSamplesPerSec=6.1362626639738735, CurrSamplesPerSec=5.400516433628815, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:32:54,284] [INFO] [timer.py:197:stop] 0/24, RunningAvgSamplesPerSec=6.116244939022191, CurrSamplesPerSec=5.331918879729864, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:33:09,152] [INFO] [timer.py:197:stop] 0/26, RunningAvgSamplesPerSec=6.111198466738003, CurrSamplesPerSec=5.46965978645552, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:33:24,156] [INFO] [timer.py:197:stop] 0/28, RunningAvgSamplesPerSec=6.1089592602745295, CurrSamplesPerSec=5.493128255256428, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:33:39,020] [INFO] [timer.py:197:stop] 0/30, RunningAvgSamplesPerSec=6.107566100664851, CurrSamplesPerSec=5.454037094012864, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:33:53,833] [INFO] [timer.py:197:stop] 0/32, RunningAvgSamplesPerSec=6.104934863126791, CurrSamplesPerSec=5.402098409851611, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:34:08,846] [INFO] [timer.py:197:stop] 0/34, RunningAvgSamplesPerSec=6.106360932585499, CurrSamplesPerSec=5.522979183752241, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:34:23,904] [INFO] [timer.py:197:stop] 0/36, RunningAvgSamplesPerSec=6.1056306050274625, CurrSamplesPerSec=5.534223579655121, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:34:38,360] [INFO] [timer.py:197:stop] 0/38, RunningAvgSamplesPerSec=6.107025499795947, CurrSamplesPerSec=5.517999025063001, 
MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:34:52,987] [INFO] [logging.py:68:log_dist] [Rank 0] step=20, skipped=3, lr=[6.15224460689137e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:34:52,989] [INFO] [timer.py:197:stop] 0/40, RunningAvgSamplesPerSec=6.106790407037756, CurrSamplesPerSec=5.482855146086943, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:35:07,678] [INFO] [timer.py:197:stop] 0/42, RunningAvgSamplesPerSec=6.106137329541975, CurrSamplesPerSec=5.49275126263391, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:35:22,387] [INFO] [timer.py:197:stop] 0/44, RunningAvgSamplesPerSec=6.101678041171418, CurrSamplesPerSec=5.378597132969251, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:35:36,795] [INFO] [timer.py:197:stop] 0/46, RunningAvgSamplesPerSec=6.09914148974474, CurrSamplesPerSec=5.4613528889589125, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:35:51,434] [INFO] [timer.py:197:stop] 0/48, RunningAvgSamplesPerSec=6.099051394709131, CurrSamplesPerSec=5.4884021703559025, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:36:05,840] [INFO] [timer.py:197:stop] 0/50, RunningAvgSamplesPerSec=6.100214513297745, CurrSamplesPerSec=5.47020037322054, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:36:20,385] [INFO] [timer.py:197:stop] 0/52, RunningAvgSamplesPerSec=6.098726784398205, CurrSamplesPerSec=5.433712844102512, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:36:35,297] [INFO] [timer.py:197:stop] 0/54, RunningAvgSamplesPerSec=6.0936261097499305, CurrSamplesPerSec=5.489423519383715, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:36:50,352] [INFO] [timer.py:197:stop] 0/56, RunningAvgSamplesPerSec=6.0895209772752175, CurrSamplesPerSec=5.477875509886321, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:37:05,025] [INFO] [timer.py:197:stop] 0/58, 
RunningAvgSamplesPerSec=6.0909453489179874, CurrSamplesPerSec=5.569184849154154, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:37:20,470] [INFO] [logging.py:68:log_dist] [Rank 0] step=30, skipped=3, lr=[7.156818820794936e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:37:20,471] [INFO] [timer.py:197:stop] 0/60, RunningAvgSamplesPerSec=6.091265755090447, CurrSamplesPerSec=5.483664270842526, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:37:35,692] [INFO] [timer.py:197:stop] 0/62, RunningAvgSamplesPerSec=6.090134375608421, CurrSamplesPerSec=5.589692236464059, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:37:50,466] [INFO] [timer.py:197:stop] 0/64, RunningAvgSamplesPerSec=6.087445931775481, CurrSamplesPerSec=5.484206733168804, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:38:05,662] [INFO] [timer.py:197:stop] 0/66, RunningAvgSamplesPerSec=6.085481288319546, CurrSamplesPerSec=5.456354321207382, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:38:20,812] [INFO] [timer.py:197:stop] 0/68, RunningAvgSamplesPerSec=6.0849633855165886, CurrSamplesPerSec=5.560404510325844, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:38:35,454] [INFO] [timer.py:197:stop] 0/70, RunningAvgSamplesPerSec=6.082841357920188, CurrSamplesPerSec=5.466938858736218, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:38:50,590] [INFO] [timer.py:197:stop] 0/72, RunningAvgSamplesPerSec=6.081021514785823, CurrSamplesPerSec=5.512498905449392, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:39:05,062] [INFO] [timer.py:197:stop] 0/74, RunningAvgSamplesPerSec=6.081159270273127, CurrSamplesPerSec=5.590209079941052, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:39:19,759] [INFO] [timer.py:197:stop] 0/76, RunningAvgSamplesPerSec=6.082366611952337, CurrSamplesPerSec=5.557982892618384, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 
13:39:34,541] [INFO] [timer.py:197:stop] 0/78, RunningAvgSamplesPerSec=6.0817232366484815, CurrSamplesPerSec=5.549053124586951, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:39:49,756] [INFO] [logging.py:68:log_dist] [Rank 0] step=40, skipped=3, lr=[7.841008620334974e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:39:49,758] [INFO] [timer.py:197:stop] 0/80, RunningAvgSamplesPerSec=6.077909701875687, CurrSamplesPerSec=5.460640310175479, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:40:04,561] [INFO] [timer.py:197:stop] 0/82, RunningAvgSamplesPerSec=6.0782675124530545, CurrSamplesPerSec=5.539460325595526, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:40:19,768] [INFO] [timer.py:197:stop] 0/84, RunningAvgSamplesPerSec=6.078390170810889, CurrSamplesPerSec=5.471118389391574, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:40:34,291] [INFO] [timer.py:197:stop] 0/86, RunningAvgSamplesPerSec=6.0795061630666805, CurrSamplesPerSec=5.454458221341045, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:40:49,017] [INFO] [timer.py:197:stop] 0/88, RunningAvgSamplesPerSec=6.07951693722567, CurrSamplesPerSec=5.479403593034189, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:41:04,035] [INFO] [timer.py:197:stop] 0/90, RunningAvgSamplesPerSec=6.07713721760037, CurrSamplesPerSec=5.456287998666598, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:41:18,947] [INFO] [timer.py:197:stop] 0/92, RunningAvgSamplesPerSec=6.077188149302228, CurrSamplesPerSec=5.50184879924073, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:41:33,488] [INFO] [timer.py:197:stop] 0/94, RunningAvgSamplesPerSec=6.07680443723328, CurrSamplesPerSec=5.473631408915449, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:41:47,873] [INFO] [timer.py:197:stop] 0/96, RunningAvgSamplesPerSec=6.078162490871368, CurrSamplesPerSec=5.581634406904709, MemAllocated=3.0GB, 
MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:42:02,601] [INFO] [timer.py:197:stop] 0/98, RunningAvgSamplesPerSec=6.07918864648028, CurrSamplesPerSec=5.5189762738392805, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:42:17,393] [INFO] [logging.py:68:log_dist] [Rank 0] step=50, skipped=3, lr=[8.360489289678585e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:42:17,394] [INFO] [timer.py:197:stop] 0/100, RunningAvgSamplesPerSec=6.079721917832034, CurrSamplesPerSec=5.565398772008929, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:42:32,340] [INFO] [timer.py:197:stop] 0/102, RunningAvgSamplesPerSec=6.079079748095393, CurrSamplesPerSec=5.53473523707476, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:42:46,647] [INFO] [timer.py:197:stop] 0/104, RunningAvgSamplesPerSec=6.080121170095116, CurrSamplesPerSec=5.546972622941274, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:43:01,497] [INFO] [timer.py:197:stop] 0/106, RunningAvgSamplesPerSec=6.080171891208223, CurrSamplesPerSec=5.554791525418257, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:43:16,886] [INFO] [timer.py:197:stop] 0/108, RunningAvgSamplesPerSec=6.081835087987497, CurrSamplesPerSec=5.58878961931587, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:43:31,557] [INFO] [timer.py:197:stop] 0/110, RunningAvgSamplesPerSec=6.081642070013767, CurrSamplesPerSec=5.4866510488093745, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:43:46,623] [INFO] [timer.py:197:stop] 0/112, RunningAvgSamplesPerSec=6.081178765969859, CurrSamplesPerSec=5.503815912407227, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:44:01,079] [INFO] [timer.py:197:stop] 0/114, RunningAvgSamplesPerSec=6.081720835190203, CurrSamplesPerSec=5.504593308578239, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:44:16,482] [INFO] [timer.py:197:stop] 0/116, RunningAvgSamplesPerSec=6.080214101573954, 
CurrSamplesPerSec=5.447973428766622, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:44:30,878] [INFO] [timer.py:197:stop] 0/118, RunningAvgSamplesPerSec=6.080946889626902, CurrSamplesPerSec=5.482487848212496, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:44:45,818] [INFO] [logging.py:68:log_dist] [Rank 0] step=60, skipped=3, lr=[8.779374278362457e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:44:45,820] [INFO] [timer.py:197:stop] 0/120, RunningAvgSamplesPerSec=6.079880158785482, CurrSamplesPerSec=5.410680853516901, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:45:00,308] [INFO] [timer.py:197:stop] 0/122, RunningAvgSamplesPerSec=6.0800487826701, CurrSamplesPerSec=5.51626365654903, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:45:14,816] [INFO] [timer.py:197:stop] 0/124, RunningAvgSamplesPerSec=6.079264706998162, CurrSamplesPerSec=5.5087795351327316, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:45:29,834] [INFO] [timer.py:197:stop] 0/126, RunningAvgSamplesPerSec=6.079846025945439, CurrSamplesPerSec=5.4936159271225256, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:45:44,851] [INFO] [timer.py:197:stop] 0/128, RunningAvgSamplesPerSec=6.078175014056017, CurrSamplesPerSec=5.362789672759209, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:45:59,717] [INFO] [timer.py:197:stop] 0/130, RunningAvgSamplesPerSec=6.079084147393918, CurrSamplesPerSec=5.541067573145952, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:46:14,797] [INFO] [timer.py:197:stop] 0/132, RunningAvgSamplesPerSec=6.078861152800443, CurrSamplesPerSec=5.477529668210021, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:46:29,130] [INFO] [timer.py:197:stop] 0/134, RunningAvgSamplesPerSec=6.080307858784291, CurrSamplesPerSec=5.559784456324279, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:46:44,072] [INFO] [timer.py:197:stop] 
0/136, RunningAvgSamplesPerSec=6.07930472994486, CurrSamplesPerSec=5.580544815418963, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:46:58,724] [INFO] [timer.py:197:stop] 0/138, RunningAvgSamplesPerSec=6.078143442074857, CurrSamplesPerSec=5.314022303909574, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:47:13,565] [INFO] [logging.py:68:log_dist] [Rank 0] step=70, skipped=3, lr=[9.130374013504131e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:47:13,567] [INFO] [timer.py:197:stop] 0/140, RunningAvgSamplesPerSec=6.077773018304035, CurrSamplesPerSec=5.36396394293752, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:47:28,585] [INFO] [timer.py:197:stop] 0/142, RunningAvgSamplesPerSec=6.077639901148956, CurrSamplesPerSec=5.461940956484987, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:47:43,370] [INFO] [timer.py:197:stop] 0/144, RunningAvgSamplesPerSec=6.078145299833403, CurrSamplesPerSec=5.518752068230726, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:47:58,138] [INFO] [timer.py:197:stop] 0/146, RunningAvgSamplesPerSec=6.077895578341843, CurrSamplesPerSec=5.5512169228705694, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:48:13,249] [INFO] [timer.py:197:stop] 0/148, RunningAvgSamplesPerSec=6.077814599805705, CurrSamplesPerSec=5.491996533706707, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:48:27,903] [INFO] [timer.py:197:stop] 0/150, RunningAvgSamplesPerSec=6.076902008713945, CurrSamplesPerSec=5.332959519203826, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:48:42,061] [INFO] [timer.py:197:stop] 0/152, RunningAvgSamplesPerSec=6.0783920477415, CurrSamplesPerSec=5.55632004431884, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:48:56,535] [INFO] [timer.py:197:stop] 0/154, RunningAvgSamplesPerSec=6.078732911919996, CurrSamplesPerSec=5.458582053389676, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + 
"[2022-12-16 13:49:11,153] [INFO] [timer.py:197:stop] 0/156, RunningAvgSamplesPerSec=6.079008773246385, CurrSamplesPerSec=5.543700002028015, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:49:25,910] [INFO] [timer.py:197:stop] 0/158, RunningAvgSamplesPerSec=6.077901196166508, CurrSamplesPerSec=5.497338259422809, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:49:41,537] [INFO] [logging.py:68:log_dist] [Rank 0] step=80, skipped=3, lr=[9.432453625862409e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:49:41,538] [INFO] [timer.py:197:stop] 0/160, RunningAvgSamplesPerSec=6.076246622231216, CurrSamplesPerSec=5.371769954520086, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:49:56,401] [INFO] [timer.py:197:stop] 0/162, RunningAvgSamplesPerSec=6.076453900936628, CurrSamplesPerSec=5.506221269741678, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:50:10,825] [INFO] [timer.py:197:stop] 0/164, RunningAvgSamplesPerSec=6.077190405328247, CurrSamplesPerSec=5.540300881158864, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:50:26,051] [INFO] [timer.py:197:stop] 0/166, RunningAvgSamplesPerSec=6.077041030460407, CurrSamplesPerSec=5.509338737628304, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:50:40,658] [INFO] [timer.py:197:stop] 0/168, RunningAvgSamplesPerSec=6.077215911476895, CurrSamplesPerSec=5.465386561501869, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:50:55,458] [INFO] [timer.py:197:stop] 0/170, RunningAvgSamplesPerSec=6.078024759766047, CurrSamplesPerSec=5.50886816782787, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:51:10,316] [INFO] [timer.py:197:stop] 0/172, RunningAvgSamplesPerSec=6.077383165843686, CurrSamplesPerSec=5.523170094873224, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:51:25,012] [INFO] [timer.py:197:stop] 0/174, RunningAvgSamplesPerSec=6.077781559393984, CurrSamplesPerSec=5.523025319397122, 
MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:51:39,852] [INFO] [timer.py:197:stop] 0/176, RunningAvgSamplesPerSec=6.078433625571553, CurrSamplesPerSec=5.568140074904409, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:51:54,462] [INFO] [timer.py:197:stop] 0/178, RunningAvgSamplesPerSec=6.077703528416714, CurrSamplesPerSec=5.485578492081232, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:52:10,224] [INFO] [logging.py:68:log_dist] [Rank 0] step=90, skipped=3, lr=[9.697596263093091e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:52:10,225] [INFO] [timer.py:197:stop] 0/180, RunningAvgSamplesPerSec=6.078332169068046, CurrSamplesPerSec=5.557059195233783, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:52:24,902] [INFO] [timer.py:197:stop] 0/182, RunningAvgSamplesPerSec=6.077463728768749, CurrSamplesPerSec=5.458477050176381, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:52:40,291] [INFO] [timer.py:197:stop] 0/184, RunningAvgSamplesPerSec=6.076290749991055, CurrSamplesPerSec=5.492102156302423, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:52:54,947] [INFO] [timer.py:197:stop] 0/186, RunningAvgSamplesPerSec=6.07565767596194, CurrSamplesPerSec=5.388023243640082, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:53:09,877] [INFO] [timer.py:197:stop] 0/188, RunningAvgSamplesPerSec=6.075693478896997, CurrSamplesPerSec=5.5255764788423285, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:53:24,501] [INFO] [timer.py:197:stop] 0/190, RunningAvgSamplesPerSec=6.0754839341391245, CurrSamplesPerSec=5.539336641858779, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:53:38,944] [INFO] [timer.py:197:stop] 0/192, RunningAvgSamplesPerSec=6.075775207227371, CurrSamplesPerSec=5.469889160531412, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:53:53,465] [INFO] [timer.py:197:stop] 0/194, 
RunningAvgSamplesPerSec=6.075782725725061, CurrSamplesPerSec=5.488814031317932, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:54:08,605] [INFO] [timer.py:197:stop] 0/196, RunningAvgSamplesPerSec=6.075932541941936, CurrSamplesPerSec=5.513335374613696, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:54:23,298] [INFO] [timer.py:197:stop] 0/198, RunningAvgSamplesPerSec=6.076327542642147, CurrSamplesPerSec=5.506319307844521, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n", + "[2022-12-16 13:54:38,322] [INFO] [logging.py:68:log_dist] [Rank 0] step=100, skipped=3, lr=[9.933858671331224e-06], mom=[[0.9, 0.999]]\n", + "[2022-12-16 13:54:38,323] [INFO] [timer.py:197:stop] 0/200, RunningAvgSamplesPerSec=6.07605192544789, CurrSamplesPerSec=5.413820537564129, MemAllocated=3.0GB, MaxMemAllocated=19.53GB\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "***** Running Evaluation *****\n", + " Num examples: Unknown\n", + " Batch size = 8\n", + "Reading metadata...: 1704it [00:00, 13668.60it/s]\n", + "The following columns in the evaluation set don't have a corresponding argument in `WhisperForConditionalGeneration.forward` and have been ignored: sentence, audio, input_length. If sentence, audio, input_length are not expected by `WhisperForConditionalGeneration.forward`, you can safely ignore this message.\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "/home/ubuntu/.local/lib/python3.8/site-packages/transformers/generation/utils.py:1134: UserWarning: You have modified the pretrained model configuration to control generation. 
This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)\n", + " warnings.warn(\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 
220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 
50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 
50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", 
+ " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + 
"Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " 
\"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " 
\"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 
50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " 
\"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + 
" \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 
50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 
50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", 
+ " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + 
"Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " 
\"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " 
\"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n", + "Generate config GenerationConfig {\n", + " \"begin_suppress_tokens\": [\n", + " 220,\n", + " 50257\n", + " ],\n", + " \"bos_token_id\": 
50257,\n", + " \"decoder_start_token_id\": 50258,\n", + " \"eos_token_id\": 50257,\n", + " \"max_length\": 448,\n", + " \"pad_token_id\": 50257,\n", + " \"suppress_tokens\": [],\n", + " \"transformers_version\": \"4.26.0.dev0\",\n", + " \"use_cache\": false\n", + "}\n", + "\n" + ] + } + ], + "source": [ + "trainer.train()" + ] + }, + { + "cell_type": "markdown", + "id": "810ced54-7187-4a06-b2fe-ba6dcca94dc3", + "metadata": { + "id": "810ced54-7187-4a06-b2fe-ba6dcca94dc3" + }, + "source": [ + "We can label our checkpoint with the `whisper-event` tag on push by setting the appropriate key-word arguments
(kwargs):" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c704f91e-241b-48c9-b8e0-f0da396a9663", + "metadata": { + "id": "c704f91e-241b-48c9-b8e0-f0da396a9663" + }, + "outputs": [], + "source": [ + "kwargs = {\n", + " \"dataset_tags\": dataset_names,\n", + " \"dataset\": \"Common Voice 11.0, FB Voxpopuli, Google FLEURS\", # a 'pretty' name for the training dataset\n", + " \"language\": \"fi\",\n", + " \"model_name\": \"Whisper Large Fi - Sormunen Teemu\", # a 'pretty' name for your model\n", + " \"finetuned_from\": \"openai/whisper-large\",\n", + " \"tasks\": \"automatic-speech-recognition\",\n", + " \"tags\": \"whisper-event\",\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "090d676a-f944-4297-a938-a40eda0b2b68", + "metadata": { + "id": "090d676a-f944-4297-a938-a40eda0b2b68" + }, + "source": [ + "The training results can now be uploaded to the Hub. To do so, execute the `push_to_hub` command and save the preprocessor object we created:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d7030622-caf7-4039-939b-6195cdaa2585", + "metadata": { + "id": "d7030622-caf7-4039-939b-6195cdaa2585" + }, + "outputs": [], + "source": [ + "trainer.push_to_hub(**kwargs)" + ] + }, + { + "cell_type": "markdown", + "id": "ca743fbd-602c-48d4-ba8d-a2fe60af64ba", + "metadata": { + "id": "ca743fbd-602c-48d4-ba8d-a2fe60af64ba" + }, + "source": [ + "## Closing Remarks" + ] + }, + { + "cell_type": "markdown", + "id": "7f737783-2870-4e35-aa11-86a42d7d997a", + "metadata": { + "id": "7f737783-2870-4e35-aa11-86a42d7d997a" + }, + "source": [ + "In this Colab, we covered a step-by-step guide on fine-tuning Whisper for multilingual ASR \n", + "using 🤗 Datasets, Transformers and the Hugging Face Hub. For more details on the Whisper model, the Common Voice dataset and the theory behind fine-tuning, refer to the accompanying [blog post](https://huggingface.co/blog/fine-tune-whisper).
If you're interested in fine-tuning other \n", + "Transformers models, both for English and multilingual ASR, be sure to check out the \n", + "example scripts at [examples/pytorch/speech-recognition](https://github.com/huggingface/transformers/tree/main/examples/pytorch/speech-recognition)." + ] + } + ], + "metadata": { + "colab": { + "include_colab_link": true, + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}