{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "accelerator": "GPU", "colab": { "name": "Copy of starter_notebook_reverse_training.ipynb", "provenance": [], "collapsed_sections": [], "toc_visible": true }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.9" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "Igc5itf-xMGj" }, "source": [ "# Masakhane - Reverse Machine Translation for African Languages (Using JoeyNMT)" ] }, { "cell_type": "markdown", "metadata": { "id": "SSu9Tv5Q4Ezc" }, "source": [ "> ## NB\n", ">### - The purpose of this Notebook is to build models that translate African languages(target language) *into* English(source language). This will allow us to in future be able to make translations from one African language to the other. If you'd like to translate *from* English, please use [this](https://github.com/masakhane-io/masakhane-mt/blob/master/starter_notebook.ipynb) starter notebook instead.\n", "\n", ">### - We call this reverse training because normally we build models that make translations from the source language(English) to the target language. But in this case we are doing the reverse; building models that make translations from the target language to the source(English)" ] }, { "cell_type": "markdown", "metadata": { "id": "x4fXCKCf36IK" }, "source": [ "## Note before beginning:\n", "### - The idea is that you should be able to make minimal changes to this in order to get SOME result for your own translation corpus. \n", "\n", "### - The tl;dr: Go to the **\"TODO\"** comments which will tell you what to update to get up and running\n", "\n", "### - If you actually want to have a clue what you're doing, read the text and peek at the links\n", "\n", "### - With 100 epochs, it should take around 7 hours to run in Google Colab\n", "\n", "### - Once you've gotten a result for your language, please attach and email your notebook that generated it to masakhanetranslation@gmail.com\n", "\n", "### - If you care enough and get a chance, doing a brief background on your language would be amazing. See examples in [(Martinus, 2019)](https://arxiv.org/abs/1906.05685)" ] }, { "cell_type": "markdown", "metadata": { "id": "l929HimrxS0a" }, "source": [ "## Retrieve your data & make a parallel corpus\n", "\n", "If you are wanting to use the JW300 data referenced on the Masakhane website or in our GitHub repo, you can use `opus-tools` to convert the data into a convenient format. `opus_read` from that package provides a convenient tool for reading the native aligned XML files and to convert them to TMX format. The tool can also be used to fetch relevant files from OPUS on the fly and to filter the data as necessary. [Read the documentation](https://pypi.org/project/opustools-pkg/) for more details.\n", "\n", "Once you have your corpus files in TMX format (an xml structure which will include the sentences in your target language and your source language in a single file), we recommend reading them into a pandas dataframe. Thankfully, Jade wrote a silly `tmx2dataframe` package which converts your tmx file to a pandas dataframe. 
" ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "oGRmDELn7Az0", "outputId": "f7dfe4ec-fa15-4fa5-9e67-ff9cee5e1aca" }, "source": [ "from google.colab import drive\n", "drive.mount('/content/drive')" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "Mounted at /content/drive\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": "Cn3tgQLzUxwn" }, "source": [ "# TODO: Set your source and target languages. Keep in mind, these traditionally use language codes as found here:\n", "# These will also become the suffix's of all vocab and corpus files used throughout\n", "import os\n", "source_language = \"en\"\n", "target_language = \"sn\" \n", "lc = False # If True, lowercase the data.\n", "seed = 42 # Random seed for shuffling.\n", "tag = \"baseline\" # Give a unique name to your folder - this is to ensure you don't rewrite any models you've already submitted\n", "\n", "os.environ[\"src\"] = source_language # Sets them in bash as well, since we often use bash scripts\n", "os.environ[\"tgt\"] = target_language\n", "os.environ[\"tag\"] = tag\n", "\n", "# This will save it to a folder in our gdrive instead!\n", "!mkdir -p \"/content/drive/My Drive/masakhane/$tgt-$src-$tag\"\n", "os.environ[\"gdrive_path\"] = \"/content/drive/My Drive/masakhane/%s-%s-%s\" % (target_language, source_language, tag)" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "kBSgJHEw7Nvx", "outputId": "91df9fa3-9b77-40a4-dedd-a32f6d99cccf" }, "source": [ "!echo $gdrive_path" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "/content/drive/My Drive/masakhane/sn-en-baseline\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "gA75Fs9ys8Y9", "outputId": "016833bd-a92c-4627-c43f-d790e526c80f" }, "source": [ "# Install opus-tools\n", "! pip install opustools-pkg" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "Collecting opustools-pkg\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/6c/9f/e829a0cceccc603450cd18e1ff80807b6237a88d9a8df2c0bb320796e900/opustools_pkg-0.0.52-py3-none-any.whl (80kB)\n", "\r\u001b[K |████ | 10kB 18.3MB/s eta 0:00:01\r\u001b[K |████████ | 20kB 24.9MB/s eta 0:00:01\r\u001b[K |████████████▏ | 30kB 20.5MB/s eta 0:00:01\r\u001b[K |████████████████▏ | 40kB 18.5MB/s eta 0:00:01\r\u001b[K |████████████████████▎ | 51kB 16.9MB/s eta 0:00:01\r\u001b[K |████████████████████████▎ | 61kB 13.4MB/s eta 0:00:01\r\u001b[K |████████████████████████████▎ | 71kB 13.2MB/s eta 0:00:01\r\u001b[K |████████████████████████████████| 81kB 7.4MB/s \n", "\u001b[?25hInstalling collected packages: opustools-pkg\n", "Successfully installed opustools-pkg-0.0.52\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "xq-tDZVks7ZD", "outputId": "3a1d2630-a935-4ea6-c398-7d915db12e6f" }, "source": [ "# Downloading our corpus\n", "! opus_read -d JW300 -s $src -t $tgt -wm moses -w jw300.$src jw300.$tgt -q\n", "\n", "# extract the corpus file\n", "! gunzip JW300_latest_xml_$src-$tgt.xml.gz" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "\n", "Alignment file /proj/nlpl/data/OPUS/JW300/latest/xml/en-sn.xml.gz not found. 
The following files are available for downloading:\n", "\n", " 8 MB https://object.pouta.csc.fi/OPUS-JW300/v1b/xml/en-sn.xml.gz\n", " 263 MB https://object.pouta.csc.fi/OPUS-JW300/v1b/xml/en.zip\n", " 69 MB https://object.pouta.csc.fi/OPUS-JW300/v1b/xml/sn.zip\n", "\n", " 340 MB Total size\n", "./JW300_latest_xml_en-sn.xml.gz ... 100% of 8 MB\n", "./JW300_latest_xml_en.zip ... 100% of 263 MB\n", "./JW300_latest_xml_sn.zip ... 100% of 69 MB\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "n48GDRnP8y2G", "outputId": "6821182b-48dc-4188-9d5c-c1c0f3abb444" }, "source": [ "# Download the global test set.\n", "! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-any.en\n", " \n", "# And the specific test set for this language pair.\n", "os.environ[\"trg\"] = target_language \n", "os.environ[\"src\"] = source_language \n", "\n", "! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-$trg.en \n", "! mv test.en-$trg.en test.en\n", "! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-$trg.$trg \n", "! mv test.en-$trg.$trg test.$trg" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "--2021-05-08 15:49:00-- https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-any.en\n", "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n", "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 277791 (271K) [text/plain]\n", "Saving to: ‘test.en-any.en’\n", "\n", "\rtest.en-any.en 0%[ ] 0 --.-KB/s \rtest.en-any.en 100%[===================>] 271.28K --.-KB/s in 0.006s \n", "\n", "2021-05-08 15:49:00 (45.6 MB/s) - ‘test.en-any.en’ saved [277791/277791]\n", "\n", "--2021-05-08 15:49:00-- https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-sn.en\n", "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.110.133, ...\n", "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.\n", "HTTP request sent, awaiting response... 200 OK\n", "Length: 206539 (202K) [text/plain]\n", "Saving to: ‘test.en-sn.en’\n", "\n", "test.en-sn.en 100%[===================>] 201.70K --.-KB/s in 0.004s \n", "\n", "2021-05-08 15:49:00 (52.3 MB/s) - ‘test.en-sn.en’ saved [206539/206539]\n", "\n", "--2021-05-08 15:49:00-- https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-sn.sn\n", "Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...\n", "Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.\n", "HTTP request sent, awaiting response... 
200 OK\n", "Length: 215910 (211K) [text/plain]\n", "Saving to: ‘test.en-sn.sn’\n", "\n", "test.en-sn.sn 100%[===================>] 210.85K --.-KB/s in 0.005s \n", "\n", "2021-05-08 15:49:00 (45.4 MB/s) - ‘test.en-sn.sn’ saved [215910/215910]\n", "\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "NqDG-CI28y2L", "outputId": "e2a870b9-589e-467f-885b-b1ef3a3c805c" }, "source": [ "# Read the test data to filter from train and dev splits.\n", "# Store english portion in set for quick filtering checks.\n", "en_test_sents = set()\n", "filter_test_sents = \"test.en-any.en\"\n", "j = 0\n", "with open(filter_test_sents) as f:\n", " for line in f:\n", " en_test_sents.add(line.strip())\n", " j += 1\n", "print('Loaded {} global test sentences to filter from the training/dev data.'.format(j))" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "Loaded 3571 global test sentences to filter from the training/dev data.\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 159 }, "id": "3CNdwLBCfSIl", "outputId": "df76c106-80d6-41ff-fdd0-e483d1784ead" }, "source": [ "import pandas as pd\n", "\n", "# TMX file to dataframe\n", "source_file = 'jw300.' + source_language\n", "target_file = 'jw300.' + target_language\n", "\n", "source = []\n", "target = []\n", "skip_lines = [] # Collect the line numbers of the source portion to skip the same lines for the target portion.\n", "with open(source_file) as f:\n", " for i, line in enumerate(f):\n", " # Skip sentences that are contained in the test set.\n", " if line.strip() not in en_test_sents:\n", " source.append(line.strip())\n", " else:\n", " skip_lines.append(i) \n", "with open(target_file) as f:\n", " for j, line in enumerate(f):\n", " # Only add to corpus if corresponding source was not skipped.\n", " if j not in skip_lines:\n", " target.append(line.strip())\n", " \n", "print('Loaded data and skipped {}/{} lines since contained in test set.'.format(len(skip_lines), i))\n", " \n", "df = pd.DataFrame(zip(source, target), columns=['source_sentence', 'target_sentence'])\n", "# if you get TypeError: data argument can't be an iterator is because of your zip version run this below\n", "#df = pd.DataFrame(list(zip(source, target)), columns=['source_sentence', 'target_sentence'])\n", "df.head(3)" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "Loaded data and skipped 6157/786529 lines since contained in test set.\n" ], "name": "stdout" }, { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
source_sentencetarget_sentence
0Young People Ask . . .Vechiduku Vanobvunza Kuti . . .
1Why Do I Lose My Temper ?Neiko Ndichitsamwa ?
2“ When I’m angry , I’m furious , and you would...“ Apo ndinoshatirwa , ndinotyisa , uye haungad...
\n", "
" ], "text/plain": [ " source_sentence target_sentence\n", "0 Young People Ask . . . Vechiduku Vanobvunza Kuti . . .\n", "1 Why Do I Lose My Temper ? Neiko Ndichitsamwa ?\n", "2 “ When I’m angry , I’m furious , and you would... “ Apo ndinoshatirwa , ndinotyisa , uye haungad..." ] }, "metadata": { "tags": [] }, "execution_count": 8 } ] }, { "cell_type": "markdown", "metadata": { "id": "YkuK3B4p2AkN" }, "source": [ "## Pre-processing and export\n", "\n", "It is generally a good idea to remove duplicate translations and conflicting translations from the corpus. In practice, these public corpora include some number of these that need to be cleaned.\n", "\n", "In addition we will split our data into dev/test/train and export to the filesystem." ] }, { "cell_type": "code", "metadata": { "id": "M_2ouEOH1_1q", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "1e553d14-e07a-4536-ec99-795a23e74aeb" }, "source": [ "# drop duplicate translations\n", "df_pp = df.drop_duplicates()\n", "\n", "# drop conflicting translations\n", "# (this is optional and something that you might want to comment out \n", "# depending on the size of your corpus)\n", "df_pp.drop_duplicates(subset='source_sentence', inplace=True)\n", "df_pp.drop_duplicates(subset='target_sentence', inplace=True)\n", "\n", "# Shuffle the data to remove bias in dev set selection.\n", "df_pp = df_pp.sample(frac=1, random_state=seed).reset_index(drop=True)" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:7: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame\n", "\n", "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", " import sys\n", "/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:8: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame\n", "\n", "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", " \n" ], "name": "stderr" } ] }, { "cell_type": "code", "metadata": { "id": "Z_1BwAApEtMk", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "eca12159-ded0-4ee5-fc23-be9e82e56ccb" }, "source": [ "# Install fuzzy wuzzy to remove \"almost duplicate\" sentences in the\n", "# test and training sets.\n", "! pip install fuzzywuzzy\n", "! pip install python-Levenshtein\n", "import time\n", "from fuzzywuzzy import process\n", "import numpy as np\n", "from os import cpu_count\n", "from functools import partial\n", "from multiprocessing import Pool\n", "\n", "\n", "# reset the index of the training set after previous filtering\n", "df_pp.reset_index(drop=False, inplace=True)\n", "\n", "# Remove samples from the training data set if they \"almost overlap\" with the\n", "# samples in the test set.\n", "\n", "# Filtering function. 
Adjust pad to narrow down the candidate matches to\n", "# within a certain length of characters of the given sample.\n", "def fuzzfilter(sample, candidates, pad):\n", " candidates = [x for x in candidates if len(x) <= len(sample)+pad and len(x) >= len(sample)-pad] \n", " if len(candidates) > 0:\n", " return process.extractOne(sample, candidates)[1]\n", " else:\n", " return np.nan" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "Collecting fuzzywuzzy\n", " Downloading https://files.pythonhosted.org/packages/43/ff/74f23998ad2f93b945c0309f825be92e04e0348e062026998b5eefef4c33/fuzzywuzzy-0.18.0-py2.py3-none-any.whl\n", "Installing collected packages: fuzzywuzzy\n", "Successfully installed fuzzywuzzy-0.18.0\n", "Collecting python-Levenshtein\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/2a/dc/97f2b63ef0fa1fd78dcb7195aca577804f6b2b51e712516cc0e902a9a201/python-Levenshtein-0.12.2.tar.gz (50kB)\n", "\u001b[K |████████████████████████████████| 51kB 7.1MB/s \n", "\u001b[?25hRequirement already satisfied: setuptools in /usr/local/lib/python3.7/dist-packages (from python-Levenshtein) (56.1.0)\n", "Building wheels for collected packages: python-Levenshtein\n", " Building wheel for python-Levenshtein (setup.py) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for python-Levenshtein: filename=python_Levenshtein-0.12.2-cp37-cp37m-linux_x86_64.whl size=149802 sha256=db760f57697227a2b4ab6d1339e325c5c5e88d1c2287a7cfce7d74da9b06efca\n", " Stored in directory: /root/.cache/pip/wheels/b3/26/73/4b48503bac73f01cf18e52cd250947049a7f339e940c5df8fc\n", "Successfully built python-Levenshtein\n", "Installing collected packages: python-Levenshtein\n", "Successfully installed python-Levenshtein-0.12.2\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": "92EsgTaY3B4H" }, "source": [ "# start_time = time.time()\n", "# ### iterating over pandas dataframe rows is not recomended, let use multi processing to apply the function\n", "\n", "# with Pool(cpu_count()-1) as pool:\n", "# scores = pool.map(partial(fuzzfilter, candidates=list(en_test_sents), pad=5), df_pp['source_sentence'])\n", "# hours, rem = divmod(time.time() - start_time, 3600)\n", "# minutes, seconds = divmod(rem, 60)\n", "# print(\"done in {}h:{}min:{}seconds\".format(hours, minutes, seconds))\n", "\n", "# # Filter out \"almost overlapping samples\"\n", "# df_pp = df_pp.assign(scores=scores)\n", "# df_pp = df_pp[df_pp['scores'] < 95]" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "hxxBOCA-xXhy", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "17a9d5d6-1fe3-42c0-f326-1f983658a004" }, "source": [ "# This section does the split between train/dev for the parallel corpora then saves them as separate files\n", "# We use 1000 dev test and the given test set.\n", "import csv\n", "\n", "# Do the split between dev/train and create parallel corpora\n", "num_dev_patterns = 1000\n", "\n", "# Optional: lower case the corpora - this will make it easier to generalize, but without proper casing.\n", "if lc: # Julia: making lowercasing optional\n", " df_pp[\"source_sentence\"] = df_pp[\"source_sentence\"].str.lower()\n", " df_pp[\"target_sentence\"] = df_pp[\"target_sentence\"].str.lower()\n", "\n", "# Julia: test sets are already generated\n", "dev = df_pp.tail(num_dev_patterns) # Herman: Error in original\n", "stripped = df_pp.drop(df_pp.tail(num_dev_patterns).index)\n", "\n", "with open(\"train.\"+source_language, \"w\") as src_file, 
open(\"train.\"+target_language, \"w\") as trg_file:\n", " for index, row in stripped.iterrows():\n", " src_file.write(row[\"source_sentence\"]+\"\\n\")\n", " trg_file.write(row[\"target_sentence\"]+\"\\n\")\n", " \n", "with open(\"dev.\"+source_language, \"w\") as src_file, open(\"dev.\"+target_language, \"w\") as trg_file:\n", " for index, row in dev.iterrows():\n", " src_file.write(row[\"source_sentence\"]+\"\\n\")\n", " trg_file.write(row[\"target_sentence\"]+\"\\n\")\n", "\n", "#stripped[[\"source_sentence\"]].to_csv(\"train.\"+source_language, header=False, index=False) # Herman: Added `header=False` everywhere\n", "#stripped[[\"target_sentence\"]].to_csv(\"train.\"+target_language, header=False, index=False) # Julia: Problematic handling of quotation marks.\n", "\n", "#dev[[\"source_sentence\"]].to_csv(\"dev.\"+source_language, header=False, index=False)\n", "#dev[[\"target_sentence\"]].to_csv(\"dev.\"+target_language, header=False, index=False)\n", "\n", "# Doublecheck the format below. There should be no extra quotation marks or weird characters.\n", "! head train.*\n", "! head dev.*" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "==> train.en <==\n", "◆ Manifesting the fruitage of the spirit\n", "Some may say : ‘ That’s all very well , but what if a child comes along unexpectedly ? ’\n", "Such enthusiasm can be infectious .\n", "Still , I was curious about the Bible and occasionally read the copy Andre kept in his bedroom .\n", "Our marriage has been one of the greatest blessings Jehovah has bestowed upon me .\n", "The God - appointed Head of that Kingdom government is the Prince of Peace , Jesus Christ . — Isaiah 9 : 6 .\n", "At one time , this house was in good condition ​ — but no longer .\n", "His encouragement and counsel help us to deal with these critical times . ​ — 2 Timothy 3 : 1 - 5 .\n", "□ Above all , look to the Scriptures for comfort , praying to Jehovah , even calling aloud his name , during and after the assault .\n", "Yet , as Paul explained , Jesus came that “ he might release by purchase those under law . ”\n", "\n", "==> train.sn <==\n", "◆ Kuratidza chibereko chomudzimu\n", "Vamwe vangati : ‘ Zvose izvozvo zvakanaka zvikuru , asi zvakadiniko kana mwana akauya nenzira isingakarirwi ? ’\n", "Mbavarira yakadaro inogona kutapukira .\n", "Asi ndakanga ndichida kunyatsoziva zvinotaurwa neBhaibheri zvokuti nguva nenguva ndaiverenga Bhaibheri raAndre raaichengeta mubhedhurumu make .\n", "Kuroorana kwatakaita kwave kuri chimwe chezvikomborero zvikuru zvandakapiwa naJehovha .\n", "Mutungamiriri akasarudzwa naMwari wehurumende iyoyo yoUmambo Muchinda Worugare , Jesu Kristu . — Isaya 9 : 6 .\n", "Imba iyi yaimbova yakanaka , asi iye zvino yashata .\n", "Kurudziro yake nezano zvinotibatsira kubata nenguva dzino dzinonetsa . — 2 Timoti 3 : 1 - 5 .\n", "□ Kupfuura zvose , tarira kuMagwaro nokuda kwenyaradzo , uchinyengetera kuna Jehovha , kunyange kudana zita rake zvinonzwika , mukati uye pashure pokudenhwa .\n", "Asi sokutsanangura kwakaita Pauro , Jesu akauya kuti “ asunungure nokutenga vaya vaiva pasi pomutemo . 
”\n", "==> dev.en <==\n", "She did not attempt to take over .\n", "For whom is Jehovah searching , and why ?\n", "All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "Do Not Demand Perfection\n", "AS TOLD BY SAMUEL D .\n", "It is not only the towns , but villages and rural districts too which are infected through contact with this wretched cult . ”\n", "Rural witnessing in Minas Gerais\n", "It is vital that we teach our children to respect others .\n", "Or will you endure , as did Joseph ? ”\n", "He will be great and will be called Son of the Most High . ”\n", "\n", "==> dev.sn <==\n", "Haana kuedza kutora basa racho .\n", "Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "Usatarisira Kuti Vave Vakakwana\n", "YAKATAURWA NASAMUEL D .\n", "Hamusi mumaguta chete , asiwo kumamisha nokumatunhu okumaruwa kwazadzwa nekasangano aka kanosvota . ”\n", "Kupupurira muruwa muMinas Gerais\n", "Zvinokosha kuti tidzidzise vana vedu kuremekedza vamwe .\n", "Kana kuti muchatsungirira , sezvakaita Josefa ? ”\n", "Iye uchava mukuru uye achanzi Mwanakomana woWokumusoro - soro . ”\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "epeCydmCyS8X" }, "source": [ "\n", "\n", "---\n", "\n", "\n", "## Installation of JoeyNMT\n", "\n", "JoeyNMT is a simple, minimalist NMT package which is useful for learning and teaching. Check out the documentation for JoeyNMT [here](https://joeynmt.readthedocs.io) " ] }, { "cell_type": "code", "metadata": { "id": "iBRMm4kMxZ8L", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "d14bf916-36f3-4325-fa93-0221873f2627" }, "source": [ "# Install JoeyNMT\n", "! git clone https://github.com/joeynmt/joeynmt.git\n", "! cd joeynmt; pip3 install .\n", "# Install Pytorch with GPU support v1.7.1.\n", "! 
pip install torch==1.8.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "Cloning into 'joeynmt'...\n", "remote: Enumerating objects: 3089, done.\u001b[K\n", "remote: Counting objects: 100% (138/138), done.\u001b[K\n", "remote: Compressing objects: 100% (102/102), done.\u001b[K\n", "remote: Total 3089 (delta 77), reused 74 (delta 36), pack-reused 2951\u001b[K\n", "Receiving objects: 100% (3089/3089), 8.08 MiB | 7.77 MiB/s, done.\n", "Resolving deltas: 100% (2105/2105), done.\n", "Processing /content/joeynmt\n", "Requirement already satisfied: future in /usr/local/lib/python3.7/dist-packages (from joeynmt==1.3) (0.16.0)\n", "Requirement already satisfied: pillow in /usr/local/lib/python3.7/dist-packages (from joeynmt==1.3) (7.1.2)\n", "Collecting numpy==1.20.1\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/70/8a/064b4077e3d793f877e3b77aa64f56fa49a4d37236a53f78ee28be009a16/numpy-1.20.1-cp37-cp37m-manylinux2010_x86_64.whl (15.3MB)\n", "\u001b[K |████████████████████████████████| 15.3MB 212kB/s \n", "\u001b[?25hRequirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.7/dist-packages (from joeynmt==1.3) (56.1.0)\n", "Collecting torch==1.8.0\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/94/99/5861239a6e1ffe66e120f114a4d67e96e5c4b17c1a785dfc6ca6769585fc/torch-1.8.0-cp37-cp37m-manylinux1_x86_64.whl (735.5MB)\n", "\u001b[K |████████████████████████████████| 735.5MB 24kB/s \n", "\u001b[?25hRequirement already satisfied: tensorboard>=1.15 in /usr/local/lib/python3.7/dist-packages (from joeynmt==1.3) (2.4.1)\n", "Collecting torchtext==0.9.0\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/36/50/84184d6230686e230c464f0dd4ff32eada2756b4a0b9cefec68b88d1d580/torchtext-0.9.0-cp37-cp37m-manylinux1_x86_64.whl (7.1MB)\n", "\u001b[K |████████████████████████████████| 7.1MB 22.3MB/s \n", "\u001b[?25hCollecting sacrebleu>=1.3.6\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/7e/57/0c7ca4e31a126189dab99c19951910bd081dea5bbd25f24b77107750eae7/sacrebleu-1.5.1-py3-none-any.whl (54kB)\n", "\u001b[K |████████████████████████████████| 61kB 10.7MB/s \n", "\u001b[?25hCollecting subword-nmt\n", " Downloading https://files.pythonhosted.org/packages/74/60/6600a7bc09e7ab38bc53a48a20d8cae49b837f93f5842a41fe513a694912/subword_nmt-0.3.7-py2.py3-none-any.whl\n", "Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from joeynmt==1.3) (3.2.2)\n", "Requirement already satisfied: seaborn in /usr/local/lib/python3.7/dist-packages (from joeynmt==1.3) (0.11.1)\n", "Collecting pyyaml>=5.1\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/7a/a5/393c087efdc78091afa2af9f1378762f9821c9c1d7a22c5753fb5ac5f97a/PyYAML-5.4.1-cp37-cp37m-manylinux1_x86_64.whl (636kB)\n", "\u001b[K |████████████████████████████████| 645kB 11.0MB/s \n", "\u001b[?25hCollecting pylint\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/10/f0/9705d6ec002876bc20b6923cbdeeca82569a895fc214211562580e946079/pylint-2.8.2-py3-none-any.whl (357kB)\n", "\u001b[K |████████████████████████████████| 358kB 45.5MB/s \n", "\u001b[?25hCollecting six==1.12\n", " Downloading https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl\n", "Collecting wrapt==1.11.1\n", " Downloading 
https://files.pythonhosted.org/packages/67/b2/0f71ca90b0ade7fad27e3d20327c996c6252a2ffe88f50a95bba7434eda9/wrapt-1.11.1.tar.gz\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch==1.8.0->joeynmt==1.3) (3.7.4.3)\n", "Requirement already satisfied: protobuf>=3.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (3.12.4)\n", "Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (0.12.0)\n", "Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (1.0.1)\n", "Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (3.3.4)\n", "Requirement already satisfied: grpcio>=1.24.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (1.32.0)\n", "Requirement already satisfied: google-auth<2,>=1.6.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (1.28.1)\n", "Requirement already satisfied: wheel>=0.26; python_version >= \"3\" in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (0.36.2)\n", "Requirement already satisfied: requests<3,>=2.21.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (2.23.0)\n", "Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (1.8.0)\n", "Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.7/dist-packages (from tensorboard>=1.15->joeynmt==1.3) (0.4.4)\n", "Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from torchtext==0.9.0->joeynmt==1.3) (4.41.1)\n", "Collecting portalocker==2.0.0\n", " Downloading https://files.pythonhosted.org/packages/89/a6/3814b7107e0788040870e8825eebf214d72166adf656ba7d4bf14759a06a/portalocker-2.0.0-py2.py3-none-any.whl\n", "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->joeynmt==1.3) (1.3.1)\n", "Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->joeynmt==1.3) (2.4.7)\n", "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->joeynmt==1.3) (0.10.0)\n", "Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->joeynmt==1.3) (2.8.1)\n", "Requirement already satisfied: pandas>=0.23 in /usr/local/lib/python3.7/dist-packages (from seaborn->joeynmt==1.3) (1.1.5)\n", "Requirement already satisfied: scipy>=1.0 in /usr/local/lib/python3.7/dist-packages (from seaborn->joeynmt==1.3) (1.4.1)\n", "Collecting isort<6,>=4.2.5\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/d9/47/0ec3ec948b7b3a0ba44e62adede4dca8b5985ba6aaee59998bed0916bd17/isort-5.8.0-py3-none-any.whl (103kB)\n", "\u001b[K |████████████████████████████████| 112kB 58.4MB/s \n", "\u001b[?25hCollecting mccabe<0.7,>=0.6\n", " Downloading https://files.pythonhosted.org/packages/87/89/479dc97e18549e21354893e4ee4ef36db1d237534982482c3681ee6e7b57/mccabe-0.6.1-py2.py3-none-any.whl\n", "Collecting astroid<2.7,>=2.5.6\n", "\u001b[?25l Downloading 
https://files.pythonhosted.org/packages/f8/82/a61df6c2d68f3ae3ad1afa0d2e5ba5cfb7386eb80cffb453def7c5757271/astroid-2.5.6-py3-none-any.whl (219kB)\n", "\u001b[K |████████████████████████████████| 225kB 61.1MB/s \n", "\u001b[?25hRequirement already satisfied: toml>=0.7.1 in /usr/local/lib/python3.7/dist-packages (from pylint->joeynmt==1.3) (0.10.2)\n", "Requirement already satisfied: importlib-metadata; python_version < \"3.8\" in /usr/local/lib/python3.7/dist-packages (from markdown>=2.6.8->tensorboard>=1.15->joeynmt==1.3) (3.10.1)\n", "Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from google-auth<2,>=1.6.3->tensorboard>=1.15->joeynmt==1.3) (0.2.8)\n", "Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from google-auth<2,>=1.6.3->tensorboard>=1.15->joeynmt==1.3) (4.2.1)\n", "Requirement already satisfied: rsa<5,>=3.1.4; python_version >= \"3.6\" in /usr/local/lib/python3.7/dist-packages (from google-auth<2,>=1.6.3->tensorboard>=1.15->joeynmt==1.3) (4.7.2)\n", "Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard>=1.15->joeynmt==1.3) (2.10)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard>=1.15->joeynmt==1.3) (2020.12.5)\n", "Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard>=1.15->joeynmt==1.3) (1.24.3)\n", "Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard>=1.15->joeynmt==1.3) (3.0.4)\n", "Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard>=1.15->joeynmt==1.3) (1.3.0)\n", "Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.7/dist-packages (from pandas>=0.23->seaborn->joeynmt==1.3) (2018.9)\n", "Collecting typed-ast<1.5,>=1.4.0; implementation_name == \"cpython\" and python_version < \"3.8\"\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/65/b3/573d2f1fecbbe8f82a8d08172e938c247f99abe1be3bef3da2efaa3810bf/typed_ast-1.4.3-cp37-cp37m-manylinux1_x86_64.whl (743kB)\n", "\u001b[K |████████████████████████████████| 747kB 51.4MB/s \n", "\u001b[?25hCollecting lazy-object-proxy>=1.4.0\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/6e/b0/f055db25fd68ab4859832a887c8b304274fc12dd5a3f8e83e61250733aeb/lazy_object_proxy-1.6.0-cp37-cp37m-manylinux1_x86_64.whl (55kB)\n", "\u001b[K |████████████████████████████████| 61kB 10.7MB/s \n", "\u001b[?25hRequirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata; python_version < \"3.8\"->markdown>=2.6.8->tensorboard>=1.15->joeynmt==1.3) (3.4.1)\n", "Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/dist-packages (from pyasn1-modules>=0.2.1->google-auth<2,>=1.6.3->tensorboard>=1.15->joeynmt==1.3) (0.4.8)\n", "Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard>=1.15->joeynmt==1.3) (3.1.0)\n", "Building wheels for collected packages: joeynmt, wrapt\n", " Building wheel for joeynmt (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", " Created wheel for joeynmt: filename=joeynmt-1.3-cp37-none-any.whl size=84842 sha256=28de579d131469c8aacc6b5764582f6f8c410c06db05a4dd630878a4c12684b1\n", " Stored in directory: /tmp/pip-ephem-wheel-cache-pgu1_p9b/wheels/db/01/db/751cc9f3e7f6faec127c43644ba250a3ea7ad200594aeda70a\n", " Building wheel for wrapt (setup.py) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for wrapt: filename=wrapt-1.11.1-cp37-cp37m-linux_x86_64.whl size=68384 sha256=51d4efda4a234d3f998aac6456e925c0d37720b5ad4a65cf461f48d8c8cf7467\n", " Stored in directory: /root/.cache/pip/wheels/89/67/41/63cbf0f6ac0a6156588b9587be4db5565f8c6d8ccef98202fc\n", "Successfully built joeynmt wrapt\n", "\u001b[31mERROR: torchvision 0.9.1+cu101 has requirement torch==1.8.1, but you'll have torch 1.8.0 which is incompatible.\u001b[0m\n", "\u001b[31mERROR: tensorflow 2.4.1 has requirement numpy~=1.19.2, but you'll have numpy 1.20.1 which is incompatible.\u001b[0m\n", "\u001b[31mERROR: tensorflow 2.4.1 has requirement six~=1.15.0, but you'll have six 1.12.0 which is incompatible.\u001b[0m\n", "\u001b[31mERROR: tensorflow 2.4.1 has requirement wrapt~=1.12.1, but you'll have wrapt 1.11.1 which is incompatible.\u001b[0m\n", "\u001b[31mERROR: google-colab 1.0.0 has requirement six~=1.15.0, but you'll have six 1.12.0 which is incompatible.\u001b[0m\n", "\u001b[31mERROR: google-api-python-client 1.12.8 has requirement six<2dev,>=1.13.0, but you'll have six 1.12.0 which is incompatible.\u001b[0m\n", "\u001b[31mERROR: google-api-core 1.26.3 has requirement six>=1.13.0, but you'll have six 1.12.0 which is incompatible.\u001b[0m\n", "\u001b[31mERROR: datascience 0.10.6 has requirement folium==0.2.1, but you'll have folium 0.8.3 which is incompatible.\u001b[0m\n", "\u001b[31mERROR: albumentations 0.1.12 has requirement imgaug<0.2.7,>=0.2.5, but you'll have imgaug 0.2.9 which is incompatible.\u001b[0m\n", "Installing collected packages: numpy, torch, torchtext, portalocker, sacrebleu, subword-nmt, pyyaml, isort, mccabe, wrapt, typed-ast, lazy-object-proxy, astroid, pylint, six, joeynmt\n", " Found existing installation: numpy 1.19.5\n", " Uninstalling numpy-1.19.5:\n", " Successfully uninstalled numpy-1.19.5\n", " Found existing installation: torch 1.8.1+cu101\n", " Uninstalling torch-1.8.1+cu101:\n", " Successfully uninstalled torch-1.8.1+cu101\n", " Found existing installation: torchtext 0.9.1\n", " Uninstalling torchtext-0.9.1:\n", " Successfully uninstalled torchtext-0.9.1\n", " Found existing installation: PyYAML 3.13\n", " Uninstalling PyYAML-3.13:\n", " Successfully uninstalled PyYAML-3.13\n", " Found existing installation: wrapt 1.12.1\n", " Uninstalling wrapt-1.12.1:\n", " Successfully uninstalled wrapt-1.12.1\n", " Found existing installation: six 1.15.0\n", " Uninstalling six-1.15.0:\n", " Successfully uninstalled six-1.15.0\n", "Successfully installed astroid-2.5.6 isort-5.8.0 joeynmt-1.3 lazy-object-proxy-1.6.0 mccabe-0.6.1 numpy-1.20.1 portalocker-2.0.0 pylint-2.8.2 pyyaml-5.4.1 sacrebleu-1.5.1 six-1.12.0 subword-nmt-0.3.7 torch-1.8.0 torchtext-0.9.0 typed-ast-1.4.3 wrapt-1.11.1\n", "Looking in links: https://download.pytorch.org/whl/torch_stable.html\n", "Collecting torch==1.8.0+cu101\n", "\u001b[?25l Downloading https://download.pytorch.org/whl/cu101/torch-1.8.0%2Bcu101-cp37-cp37m-linux_x86_64.whl (763.5MB)\n", "\u001b[K |████████████████████████████████| 763.5MB 23kB/s \n", "\u001b[?25hRequirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from torch==1.8.0+cu101) 
(1.20.1)\n", "Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from torch==1.8.0+cu101) (3.7.4.3)\n", "\u001b[31mERROR: torchvision 0.9.1+cu101 has requirement torch==1.8.1, but you'll have torch 1.8.0+cu101 which is incompatible.\u001b[0m\n", "Installing collected packages: torch\n", " Found existing installation: torch 1.8.0\n", " Uninstalling torch-1.8.0:\n", " Successfully uninstalled torch-1.8.0\n", "Successfully installed torch-1.8.0+cu101\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "AaE77Tcppex9" }, "source": [ "# Preprocessing the Data into Subword BPE Tokens\n", "\n", "- One of the most powerful improvements for agglutinative languages (a feature of most Bantu languages) is using BPE tokenization [ (Sennrich, 2015) ](https://arxiv.org/abs/1508.07909).\n", "\n", "- It was also shown that by optimizing the umber of BPE codes we significantly improve results for low-resourced languages [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021) [(Martinus, 2019)](https://arxiv.org/abs/1906.05685)\n", "\n", "- Below we have the scripts for doing BPE tokenization of our data. We use 4000 tokens as recommended by [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021). You do not need to change anything. Simply running the below will be suitable. " ] }, { "cell_type": "code", "metadata": { "id": "H-TyjtmXB1mL", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "03962095-1d32-4b22-b5a5-dcfcb46a6db1" }, "source": [ "# One of the huge boosts in NMT performance was to use a different method of tokenizing. \n", "# Usually, NMT would tokenize by words. However, using a method called BPE gave amazing boosts to performance\n", "\n", "# Do subword NMT\n", "from os import path\n", "os.environ[\"src\"] = source_language # Sets them in bash as well, since we often use bash scripts\n", "os.environ[\"tgt\"] = target_language\n", "\n", "# Learn BPEs on the training data.\n", "os.environ[\"data_path\"] = path.join(\"joeynmt\", \"data\",target_language + source_language ) # Herman! \n", "! subword-nmt learn-joint-bpe-and-vocab --input train.$src train.$tgt -s 4000 -o bpe.codes.4000 --write-vocabulary vocab.$src vocab.$tgt\n", "\n", "# Apply BPE splits to the development and test data.\n", "! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < train.$src > train.bpe.$src\n", "! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < train.$tgt > train.bpe.$tgt\n", "\n", "! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < dev.$src > dev.bpe.$src\n", "! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < dev.$tgt > dev.bpe.$tgt\n", "! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < test.$src > test.bpe.$src\n", "! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < test.$tgt > test.bpe.$tgt\n", "\n", "# Create directory, move everyone we care about to the correct location\n", "! mkdir -p $data_path\n", "! cp train.* $data_path\n", "! cp test.* $data_path\n", "! cp dev.* $data_path\n", "! cp bpe.codes.4000 $data_path\n", "! ls $data_path\n", "\n", "# Also move everything we care about to a mounted location in google drive (relevant if running in colab) at gdrive_path\n", "! cp train.* \"$gdrive_path\"\n", "! cp test.* \"$gdrive_path\"\n", "! cp dev.* \"$gdrive_path\"\n", "! cp bpe.codes.4000 \"$gdrive_path\"\n", "! ls \"$gdrive_path\"\n", "\n", "# Create that vocab using build_vocab\n", "# ! 
sudo chmod 777 joeynmt/scripts/build_vocab.py\n", "# ! joeynmt/scripts/build_vocab.py joeynmt/data/$tgt$src/train.bpe.$src joeynmt/data/$src$tgt/train.bpe.$tgt --output_path joeynmt/data/$src$tgt/vocab.txt\n", "\n", "\n", "! sudo chmod 777 joeynmt/scripts/build_vocab.py\n", "! joeynmt/scripts/build_vocab.py joeynmt/data/$tgt$src/train.bpe.$src joeynmt/data/$tgt$src/train.bpe.$tgt --output_path joeynmt/data/$tgt$src/vocab.txt\n", "# Some output\n", "! echo \"BPE Shona Sentences\"\n", "! tail -n 5 test.bpe.$tgt\n", "! echo \"Combined BPE Vocab\"\n", "! tail -n 10 joeynmt/data/$tgt$src/vocab.txt # Herman" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "bpe.codes.4000\tdev.en\t test.bpe.sn test.sn\t train.en\n", "dev.bpe.en\tdev.sn\t test.en\t train.bpe.en train.sn\n", "dev.bpe.sn\ttest.bpe.en test.en-any.en train.bpe.sn\n", "bpe.codes.4000\tdev.en\t test.bpe.sn test.sn\t train.en\n", "dev.bpe.en\tdev.sn\t test.en\t train.bpe.en train.sn\n", "dev.bpe.sn\ttest.bpe.en test.en-any.en train.bpe.sn\n", "BPE Shona Sentences\n", "N@@ ho@@ o huru yo@@ kutenda ( Ona ndima 12 - 14 )\n", "N@@ go@@ wan@@ i yor@@ up@@ on@@ eso ( Ona ndima 15 - 18 )\n", "Ndaka@@ ona kuti vanhu vano@@ wanz@@ ot@@ eerera kana vaka@@ ona kuti un@@ ony@@ ats@@ ot@@ aura zviri muBhaibheri ne@@ chi@@ do uye kuti uri ku@@ edza zv@@ ese zva@@ unogona kuti u@@ vab@@ ats@@ ire . ”\n", "B@@ akat@@ wa rem@@ weya ( Ona ndima 19 - 20 )\n", "T@@ ich@@ ib@@ atsirwa naJehovha tinogona kum@@ ira t@@ akasimba pa@@ kur@@ wisana naye .\n", "Combined BPE Vocab\n", "☒\n", "̆\n", "Ā@@\n", "×\n", "muchira\n", "ι\n", "▲\n", "̀@@\n", "❍\n", "›\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": "IlMitUHR8Qy-", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "3e254f84-0061-4bab-c177-a33d205ed95c" }, "source": [ "# Also move everything we care about to a mounted location in google drive (relevant if running in colab) at gdrive_path\n", "! cp train.* \"$gdrive_path\"\n", "! cp test.* \"$gdrive_path\"\n", "! cp dev.* \"$gdrive_path\"\n", "! cp bpe.codes.4000 \"$gdrive_path\"\n", "! ls \"$gdrive_path\"" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "bpe.codes.4000\tdev.en\t test.bpe.sn test.sn\t train.en\n", "dev.bpe.en\tdev.sn\t test.en\t train.bpe.en train.sn\n", "dev.bpe.sn\ttest.bpe.en test.en-any.en train.bpe.sn\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "Ixmzi60WsUZ8" }, "source": [ "# Creating the JoeyNMT Config\n", "\n", "JoeyNMT requires a yaml config. We provide a template below. We've also set a number of defaults with it, that you may play with!\n", "\n", "- We used Transformer architecture \n", "- We set our dropout to reasonably high: 0.3 (recommended in [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021))\n", "\n", "Things worth playing with:\n", "- The batch size (also recommended to change for low-resourced languages)\n", "- The number of epochs (we've set it at 30 just so it runs in about an hour, for testing purposes)\n", "- The decoder options (beam_size, alpha)\n", "- Evaluation metrics (BLEU versus Crhf4)" ] }, { "cell_type": "code", "metadata": { "id": "h8TMgv1p3L1z" }, "source": [ "# This creates the config file for our JoeyNMT system. 
It might seem overwhelming so we've provided a couple of useful parameters you'll need to update\n", "# (You can of course play with all the parameters if you'd like!)\n", "\n", "name = '%s%s' % (target_language, source_language)\n", "# gdrive_path = os.environ[\"gdrive_path\"]\n", "\n", "# Create the config\n", "config = \"\"\"\n", "name: \"{target_language}{source_language}_reverse_transformer\"\n", "\n", "data:\n", " src: \"{target_language}\"\n", " trg: \"{source_language}\"\n", " train: \"data/{name}/train.bpe\"\n", " dev: \"data/{name}/dev.bpe\"\n", " test: \"data/{name}/test.bpe\"\n", " level: \"bpe\"\n", " lowercase: False\n", " max_sent_length: 100\n", " src_vocab: \"data/{name}/vocab.txt\"\n", " trg_vocab: \"data/{name}/vocab.txt\"\n", "\n", "testing:\n", " beam_size: 5\n", " alpha: 1.0\n", "\n", "training:\n", " #load_model: \"{gdrive_path}/models/{name}_transformer/1.ckpt\" # if uncommented, load a pre-trained model from this checkpoint\n", " random_seed: 42\n", " optimizer: \"adam\"\n", " normalization: \"tokens\"\n", " adam_betas: [0.9, 0.999] \n", " scheduling: \"plateau\" # TODO: try switching from plateau to Noam scheduling\n", " patience: 5 # For plateau: decrease learning rate by decrease_factor if validation score has not improved for this many validation rounds.\n", " learning_rate_factor: 0.5 # factor for Noam scheduler (used with Transformer)\n", " learning_rate_warmup: 1000 # warmup steps for Noam scheduler (used with Transformer)\n", " decrease_factor: 0.7\n", " loss: \"crossentropy\"\n", " learning_rate: 0.0003\n", " learning_rate_min: 0.00000001\n", " weight_decay: 0.0\n", " label_smoothing: 0.1\n", " batch_size: 4096\n", " batch_type: \"token\"\n", " eval_batch_size: 3600\n", " eval_batch_type: \"token\"\n", " batch_multiplier: 1\n", " early_stopping_metric: \"ppl\"\n", " epochs: 5 # TODO: Decrease for when playing around and checking of working. Around 30 is sufficient to check if its working at all\n", " validation_freq: 1000 # TODO: Set to at least once per epoch.\n", " logging_freq: 100\n", " eval_metric: \"bleu\"\n", " model_dir: \"models/{name}_reverse_transformer\"\n", " overwrite: True # TODO: Set to True if you want to overwrite possibly existing models. 
\n", " shuffle: True\n", " use_cuda: True\n", " max_output_length: 100\n", " print_valid_sents: [0, 1, 2, 3]\n", " keep_last_ckpts: 3\n", "\n", "model:\n", " initializer: \"xavier\"\n", " bias_initializer: \"zeros\"\n", " init_gain: 1.0\n", " embed_initializer: \"xavier\"\n", " embed_init_gain: 1.0\n", " tied_embeddings: True\n", " tied_softmax: True\n", " encoder:\n", " type: \"transformer\"\n", " num_layers: 6\n", " num_heads: 4 # TODO: Increase to 8 for larger data.\n", " embeddings:\n", " embedding_dim: 256 # TODO: Increase to 512 for larger data.\n", " scale: True\n", " dropout: 0.2\n", " # typically ff_size = 4 x hidden_size\n", " hidden_size: 256 # TODO: Increase to 512 for larger data.\n", " ff_size: 1024 # TODO: Increase to 2048 for larger data.\n", " dropout: 0.3\n", " decoder:\n", " type: \"transformer\"\n", " num_layers: 6\n", " num_heads: 4 # TODO: Increase to 8 for larger data.\n", " embeddings:\n", " embedding_dim: 256 # TODO: Increase to 512 for larger data.\n", " scale: True\n", " dropout: 0.2\n", " # typically ff_size = 4 x hidden_size\n", " hidden_size: 256 # TODO: Increase to 512 for larger data.\n", " ff_size: 1024 # TODO: Increase to 2048 for larger data.\n", " dropout: 0.3\n", "\"\"\".format(name=name, gdrive_path=os.environ[\"gdrive_path\"], source_language=source_language, target_language=target_language)\n", "with open(\"joeynmt/configs/transformer_reverse_{name}.yaml\".format(name=name),'w') as f:\n", " f.write(config)" ], "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "oEzoJtV2MIpt" }, "source": [ "# Train the Model\n", "\n", "This single line of joeynmt runs the training using the config we made above" ] }, { "cell_type": "code", "metadata": { "id": "WzbNYNdjLgNb", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "6998482d-df56-4b42-e901-6cbce99fc9e2" }, "source": [ "# Train the model\n", "# You can press Ctrl-C to stop. And then run the next cell to save your checkpoints! \n", "!cd joeynmt; python3 -m joeynmt train configs/transformer_reverse_$tgt$src.yaml" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "2021-05-08 16:12:27,880 - INFO - root - Hello! 
This is Joey-NMT (version 1.3).\n", "2021-05-08 16:12:27,920 - INFO - joeynmt.data - Loading training data...\n", "2021-05-08 16:12:41,533 - INFO - joeynmt.data - Building vocabulary...\n", "2021-05-08 16:12:41,808 - INFO - joeynmt.data - Loading dev data...\n", "2021-05-08 16:12:41,863 - INFO - joeynmt.data - Loading test data...\n", "2021-05-08 16:12:41,884 - INFO - joeynmt.data - Data loaded.\n", "2021-05-08 16:12:41,884 - INFO - joeynmt.model - Building an encoder-decoder model...\n", "2021-05-08 16:12:42,238 - INFO - joeynmt.model - Enc-dec model built.\n", "2021-05-08 16:12:42.499600: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0\n", "2021-05-08 16:12:44,193 - INFO - joeynmt.training - Total params: 12196608\n", "2021-05-08 16:12:48,394 - INFO - joeynmt.helpers - cfg.name : snen_reverse_transformer\n", "2021-05-08 16:12:48,394 - INFO - joeynmt.helpers - cfg.data.src : sn\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.trg : en\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.train : data/snen/train.bpe\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.dev : data/snen/dev.bpe\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.test : data/snen/test.bpe\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.level : bpe\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.lowercase : False\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.max_sent_length : 100\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.src_vocab : data/snen/vocab.txt\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.data.trg_vocab : data/snen/vocab.txt\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.testing.beam_size : 5\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.testing.alpha : 1.0\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.training.random_seed : 42\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.training.optimizer : adam\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.training.normalization : tokens\n", "2021-05-08 16:12:48,395 - INFO - joeynmt.helpers - cfg.training.adam_betas : [0.9, 0.999]\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.scheduling : plateau\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.patience : 5\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.learning_rate_factor : 0.5\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.learning_rate_warmup : 1000\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.decrease_factor : 0.7\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.loss : crossentropy\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.learning_rate : 0.0003\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.learning_rate_min : 1e-08\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.weight_decay : 0.0\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.label_smoothing : 0.1\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.batch_size : 4096\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.batch_type : token\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.eval_batch_size : 3600\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.eval_batch_type : token\n", 
"2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.batch_multiplier : 1\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.early_stopping_metric : ppl\n", "2021-05-08 16:12:48,396 - INFO - joeynmt.helpers - cfg.training.epochs : 5\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.validation_freq : 1000\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.logging_freq : 100\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.eval_metric : bleu\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.model_dir : models/snen_reverse_transformer\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.overwrite : True\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.shuffle : True\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.use_cuda : True\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.max_output_length : 100\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.print_valid_sents : [0, 1, 2, 3]\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.training.keep_last_ckpts : 3\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.model.initializer : xavier\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.model.bias_initializer : zeros\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.model.init_gain : 1.0\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.model.embed_initializer : xavier\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.model.embed_init_gain : 1.0\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.model.tied_embeddings : True\n", "2021-05-08 16:12:48,397 - INFO - joeynmt.helpers - cfg.model.tied_softmax : True\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.type : transformer\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.num_layers : 6\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.num_heads : 4\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.embeddings.embedding_dim : 256\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.embeddings.scale : True\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.embeddings.dropout : 0.2\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.hidden_size : 256\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.ff_size : 1024\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.encoder.dropout : 0.3\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.decoder.type : transformer\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.decoder.num_layers : 6\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.decoder.num_heads : 4\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.decoder.embeddings.embedding_dim : 256\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.decoder.embeddings.scale : True\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.decoder.embeddings.dropout : 0.2\n", "2021-05-08 16:12:48,398 - INFO - joeynmt.helpers - cfg.model.decoder.hidden_size : 256\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.helpers - cfg.model.decoder.ff_size : 1024\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.helpers - cfg.model.decoder.dropout : 0.3\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.helpers - Data set sizes: 
\n", "\ttrain 712359,\n", "\tvalid 1000,\n", "\ttest 2723\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.helpers - First training example:\n", "\t[SRC] ◆ Kur@@ atidza chib@@ ereko cho@@ mudzimu\n", "\t[TRG] ◆ M@@ an@@ if@@ es@@ ting the fru@@ it@@ age of the spirit\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.helpers - First 10 words (src): (0) (1) (2) (3) (4) . (5) , (6) the (7) to (8) of (9) “\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.helpers - First 10 words (trg): (0) (1) (2) (3) (4) . (5) , (6) the (7) to (8) of (9) “\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.helpers - Number of Src words (types): 4439\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.helpers - Number of Trg words (types): 4439\n", "2021-05-08 16:12:48,399 - INFO - joeynmt.training - Model(\n", "\tencoder=TransformerEncoder(num_layers=6, num_heads=4),\n", "\tdecoder=TransformerDecoder(num_layers=6, num_heads=4),\n", "\tsrc_embed=Embeddings(embedding_dim=256, vocab_size=4439),\n", "\ttrg_embed=Embeddings(embedding_dim=256, vocab_size=4439))\n", "2021-05-08 16:12:48,403 - INFO - joeynmt.training - Train stats:\n", "\tdevice: cuda\n", "\tn_gpu: 1\n", "\t16-bits training: False\n", "\tgradient accumulation: 1\n", "\tbatch size per device: 4096\n", "\ttotal batch size (w. parallel & accumulation): 4096\n", "2021-05-08 16:12:48,403 - INFO - joeynmt.training - EPOCH 1\n", "2021-05-08 16:13:03,065 - INFO - joeynmt.training - Epoch 1, Step: 100, Batch Loss: 5.627021, Tokens per Sec: 17206, Lr: 0.000300\n", "2021-05-08 16:13:16,523 - INFO - joeynmt.training - Epoch 1, Step: 200, Batch Loss: 5.575126, Tokens per Sec: 19050, Lr: 0.000300\n", "2021-05-08 16:13:29,939 - INFO - joeynmt.training - Epoch 1, Step: 300, Batch Loss: 5.379350, Tokens per Sec: 18801, Lr: 0.000300\n", "2021-05-08 16:13:43,617 - INFO - joeynmt.training - Epoch 1, Step: 400, Batch Loss: 5.183380, Tokens per Sec: 18685, Lr: 0.000300\n", "2021-05-08 16:13:57,284 - INFO - joeynmt.training - Epoch 1, Step: 500, Batch Loss: 5.150495, Tokens per Sec: 18403, Lr: 0.000300\n", "2021-05-08 16:14:10,845 - INFO - joeynmt.training - Epoch 1, Step: 600, Batch Loss: 4.644405, Tokens per Sec: 18215, Lr: 0.000300\n", "2021-05-08 16:14:24,625 - INFO - joeynmt.training - Epoch 1, Step: 700, Batch Loss: 4.926203, Tokens per Sec: 18552, Lr: 0.000300\n", "2021-05-08 16:14:38,483 - INFO - joeynmt.training - Epoch 1, Step: 800, Batch Loss: 4.695370, Tokens per Sec: 18387, Lr: 0.000300\n", "2021-05-08 16:14:52,596 - INFO - joeynmt.training - Epoch 1, Step: 900, Batch Loss: 4.831143, Tokens per Sec: 18413, Lr: 0.000300\n", "2021-05-08 16:15:06,577 - INFO - joeynmt.training - Epoch 1, Step: 1000, Batch Loss: 4.305650, Tokens per Sec: 18166, Lr: 0.000300\n", "2021-05-08 16:15:45,130 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:15:45,130 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:15:45,130 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:15:45,377 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 16:15:45,377 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:15:45,929 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:15:45,930 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:15:45,930 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:15:45,930 - INFO - joeynmt.training - \tHypothesis: The Bible is a Bible is a Bible .\n", "2021-05-08 16:15:45,930 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:15:45,930 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:15:45,930 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:15:45,930 - INFO - joeynmt.training - \tHypothesis: How can not be a way of the Bible ?\n", "2021-05-08 16:15:45,930 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:15:45,931 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:15:45,931 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:15:45,931 - INFO - joeynmt.training - \tHypothesis: The Bible was a Bible of the Bible of the Bible , and the Bible was a Bible , and the Bible , and the Bible is a Bible , and the Bible is a Bible is a Bible .\n", "2021-05-08 16:15:45,931 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:15:45,931 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:15:45,931 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:15:45,931 - INFO - joeynmt.training - \tHypothesis: The Saring\n", "2021-05-08 16:15:45,931 - INFO - joeynmt.training - Validation result (greedy) at epoch 1, step 1000: bleu: 1.28, loss: 128584.3828, ppl: 82.9889, duration: 39.3540s\n", "2021-05-08 16:15:59,833 - INFO - joeynmt.training - Epoch 1, Step: 1100, Batch Loss: 4.281054, Tokens per Sec: 17730, Lr: 0.000300\n", "2021-05-08 16:16:13,898 - INFO - joeynmt.training - Epoch 1, Step: 1200, Batch Loss: 4.360102, Tokens per Sec: 17924, Lr: 0.000300\n", "2021-05-08 16:16:28,122 - INFO - joeynmt.training - Epoch 1, Step: 1300, Batch Loss: 4.400066, Tokens per Sec: 17914, Lr: 0.000300\n", "2021-05-08 16:16:42,060 - INFO - joeynmt.training - Epoch 1, Step: 1400, Batch Loss: 4.372830, Tokens per Sec: 17337, Lr: 0.000300\n", "2021-05-08 16:16:56,173 - INFO - joeynmt.training - Epoch 1, Step: 1500, Batch Loss: 4.166342, Tokens per Sec: 17733, Lr: 0.000300\n", "2021-05-08 16:17:10,224 - INFO - joeynmt.training - Epoch 1, Step: 1600, Batch Loss: 4.178030, Tokens per Sec: 17668, Lr: 0.000300\n", "2021-05-08 16:17:24,525 - INFO - joeynmt.training - Epoch 1, Step: 1700, Batch Loss: 3.803659, Tokens per Sec: 18064, Lr: 0.000300\n", "2021-05-08 16:17:38,819 - INFO - joeynmt.training - Epoch 1, Step: 1800, Batch Loss: 4.162997, Tokens per Sec: 17736, Lr: 0.000300\n", "2021-05-08 16:17:53,055 - INFO - joeynmt.training - Epoch 1, Step: 1900, Batch Loss: 3.989771, Tokens per Sec: 17850, Lr: 0.000300\n", "2021-05-08 16:18:07,342 - INFO - joeynmt.training - Epoch 1, Step: 2000, Batch Loss: 3.531524, Tokens per Sec: 17908, Lr: 0.000300\n", "2021-05-08 
16:18:52,059 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:18:52,059 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:18:52,059 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:18:52,337 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:18:52,337 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:18:52,714 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:18:52,714 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:18:52,714 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tHypothesis: It is not not not not to be a good .\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tHypothesis: How did Jehovah have the way of the world , and what is the way ?\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tHypothesis: The apostle Paul was a few century of the first century of the first century , the other day of the world , and the same , and the same time , and they were not not not not be a person who had been been been been been been been been been been been been been been been been been been been been been been been a .\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:18:52,715 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:18:52,716 - INFO - joeynmt.training - \tHypothesis: The Desvice of the Des’s Felal\n", "2021-05-08 16:18:52,716 - INFO - joeynmt.training - Validation result (greedy) at epoch 1, step 2000: bleu: 1.82, loss: 110739.4844, ppl: 44.9468, duration: 45.3734s\n", "2021-05-08 16:19:07,053 - INFO - joeynmt.training - Epoch 1, Step: 2100, Batch Loss: 3.748961, Tokens per Sec: 17514, Lr: 0.000300\n", "2021-05-08 16:19:21,415 - INFO - joeynmt.training - Epoch 1, Step: 2200, Batch Loss: 3.728178, Tokens per Sec: 17725, Lr: 0.000300\n", "2021-05-08 16:19:35,969 - INFO - joeynmt.training - Epoch 1, Step: 2300, Batch Loss: 3.945595, Tokens per Sec: 17532, Lr: 0.000300\n", "2021-05-08 16:19:50,259 - INFO - joeynmt.training - Epoch 1, Step: 2400, Batch Loss: 3.657610, Tokens per Sec: 17499, Lr: 0.000300\n", "2021-05-08 16:20:04,730 - INFO - joeynmt.training - Epoch 1, Step: 2500, Batch Loss: 3.704163, Tokens per Sec: 17728, Lr: 0.000300\n", "2021-05-08 16:20:19,066 - INFO - 
joeynmt.training - Epoch 1, Step: 2600, Batch Loss: 3.512891, Tokens per Sec: 17416, Lr: 0.000300\n", "2021-05-08 16:20:33,270 - INFO - joeynmt.training - Epoch 1, Step: 2700, Batch Loss: 3.307571, Tokens per Sec: 17585, Lr: 0.000300\n", "2021-05-08 16:20:47,501 - INFO - joeynmt.training - Epoch 1, Step: 2800, Batch Loss: 3.502509, Tokens per Sec: 17504, Lr: 0.000300\n", "2021-05-08 16:21:01,910 - INFO - joeynmt.training - Epoch 1, Step: 2900, Batch Loss: 3.523389, Tokens per Sec: 17838, Lr: 0.000300\n", "2021-05-08 16:21:16,262 - INFO - joeynmt.training - Epoch 1, Step: 3000, Batch Loss: 3.793230, Tokens per Sec: 17707, Lr: 0.000300\n", "2021-05-08 16:21:44,237 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:21:44,237 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:21:44,237 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:21:44,493 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:21:44,494 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:21:44,881 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tHypothesis: It is not not not a full - time time time .\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tHypothesis: How did Jehovah give people people and what ?\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - \tHypothesis: The first - century - year - year - old - old , he was a stater , and the good news of the good news , and he was a good news of the way .\n", "2021-05-08 16:21:44,882 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:21:44,883 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:21:44,883 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:21:44,883 - INFO - joeynmt.training - \tHypothesis: A Spect of Life\n", "2021-05-08 16:21:44,883 - INFO - joeynmt.training - Validation result (greedy) at epoch 1, step 3000: bleu: 4.49, loss: 97636.2266, ppl: 28.6512, duration: 28.6206s\n", "2021-05-08 16:21:59,173 - INFO - joeynmt.training - Epoch 1, Step: 3100, Batch Loss: 3.461280, Tokens per Sec: 17792, Lr: 0.000300\n", "2021-05-08 16:22:13,518 - INFO - joeynmt.training - Epoch 1, Step: 3200, Batch Loss: 3.490087, Tokens per Sec: 17695, Lr: 
0.000300\n", "2021-05-08 16:22:27,797 - INFO - joeynmt.training - Epoch 1, Step: 3300, Batch Loss: 3.312736, Tokens per Sec: 17740, Lr: 0.000300\n", "2021-05-08 16:22:42,107 - INFO - joeynmt.training - Epoch 1, Step: 3400, Batch Loss: 3.146439, Tokens per Sec: 17490, Lr: 0.000300\n", "2021-05-08 16:22:56,407 - INFO - joeynmt.training - Epoch 1, Step: 3500, Batch Loss: 3.330356, Tokens per Sec: 17463, Lr: 0.000300\n", "2021-05-08 16:23:10,955 - INFO - joeynmt.training - Epoch 1, Step: 3600, Batch Loss: 3.877379, Tokens per Sec: 17810, Lr: 0.000300\n", "2021-05-08 16:23:25,241 - INFO - joeynmt.training - Epoch 1, Step: 3700, Batch Loss: 3.400492, Tokens per Sec: 17232, Lr: 0.000300\n", "2021-05-08 16:23:39,673 - INFO - joeynmt.training - Epoch 1, Step: 3800, Batch Loss: 3.133973, Tokens per Sec: 17918, Lr: 0.000300\n", "2021-05-08 16:23:53,891 - INFO - joeynmt.training - Epoch 1, Step: 3900, Batch Loss: 3.439733, Tokens per Sec: 17405, Lr: 0.000300\n", "2021-05-08 16:24:08,126 - INFO - joeynmt.training - Epoch 1, Step: 4000, Batch Loss: 3.416281, Tokens per Sec: 17467, Lr: 0.000300\n", "2021-05-08 16:24:40,040 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:24:40,040 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:24:40,040 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:24:40,293 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:24:40,293 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:24:40,736 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tHypothesis: He is not not to be a joice .\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tHypothesis: How did Jehovah give people people to be the people and why ?\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:24:40,737 - INFO - joeynmt.training - \tHypothesis: The first century of John and the founds of the founded , the angels , and the false leaders were not given to be a type of the same way .\n", "2021-05-08 16:24:40,738 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:24:40,738 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:24:40,738 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:24:40,738 - INFO - joeynmt.training - \tHypothesis: 
Have You Have Fap\n", "2021-05-08 16:24:40,738 - INFO - joeynmt.training - Validation result (greedy) at epoch 1, step 4000: bleu: 7.47, loss: 88936.7031, ppl: 21.2476, duration: 32.6113s\n", "2021-05-08 16:24:54,941 - INFO - joeynmt.training - Epoch 1, Step: 4100, Batch Loss: 3.274220, Tokens per Sec: 17667, Lr: 0.000300\n", "2021-05-08 16:25:09,259 - INFO - joeynmt.training - Epoch 1, Step: 4200, Batch Loss: 3.354711, Tokens per Sec: 17671, Lr: 0.000300\n", "2021-05-08 16:25:23,612 - INFO - joeynmt.training - Epoch 1, Step: 4300, Batch Loss: 3.441336, Tokens per Sec: 17894, Lr: 0.000300\n", "2021-05-08 16:25:37,901 - INFO - joeynmt.training - Epoch 1, Step: 4400, Batch Loss: 2.641688, Tokens per Sec: 17733, Lr: 0.000300\n", "2021-05-08 16:25:52,173 - INFO - joeynmt.training - Epoch 1, Step: 4500, Batch Loss: 2.732913, Tokens per Sec: 17583, Lr: 0.000300\n", "2021-05-08 16:26:06,619 - INFO - joeynmt.training - Epoch 1, Step: 4600, Batch Loss: 3.307412, Tokens per Sec: 17777, Lr: 0.000300\n", "2021-05-08 16:26:20,802 - INFO - joeynmt.training - Epoch 1, Step: 4700, Batch Loss: 2.760661, Tokens per Sec: 17520, Lr: 0.000300\n", "2021-05-08 16:26:35,056 - INFO - joeynmt.training - Epoch 1, Step: 4800, Batch Loss: 2.889900, Tokens per Sec: 17370, Lr: 0.000300\n", "2021-05-08 16:26:49,441 - INFO - joeynmt.training - Epoch 1, Step: 4900, Batch Loss: 2.850283, Tokens per Sec: 17769, Lr: 0.000300\n", "2021-05-08 16:27:03,904 - INFO - joeynmt.training - Epoch 1, Step: 5000, Batch Loss: 2.842918, Tokens per Sec: 17761, Lr: 0.000300\n", "2021-05-08 16:27:27,677 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:27:27,677 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:27:27,677 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:27:27,922 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 16:27:27,922 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:27:28,381 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:27:28,381 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:27:28,381 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:27:28,381 - INFO - joeynmt.training - \tHypothesis: He did not try to be a work .\n", "2021-05-08 16:27:28,381 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:27:28,381 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:27:28,381 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - \tHypothesis: How did Jehovah give people people ?\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - \tHypothesis: The Gospel John and three sister was given to be sick , and the angels of false , and the angels were not given to be a type of the same .\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - \tHypothesis: Prending to Spure\n", "2021-05-08 16:27:28,382 - INFO - joeynmt.training - Validation result (greedy) at epoch 1, step 5000: bleu: 10.78, loss: 82720.5781, ppl: 17.1609, duration: 24.4784s\n", "2021-05-08 16:27:42,558 - INFO - joeynmt.training - Epoch 1, Step: 5100, Batch Loss: 3.025777, Tokens per Sec: 17527, Lr: 0.000300\n", "2021-05-08 16:27:56,917 - INFO - joeynmt.training - Epoch 1, Step: 5200, Batch Loss: 2.934368, Tokens per Sec: 17823, Lr: 0.000300\n", "2021-05-08 16:28:11,112 - INFO - joeynmt.training - Epoch 1, Step: 5300, Batch Loss: 2.961665, Tokens per Sec: 17568, Lr: 0.000300\n", "2021-05-08 16:28:25,361 - INFO - joeynmt.training - Epoch 1, Step: 5400, Batch Loss: 2.750590, Tokens per Sec: 17615, Lr: 0.000300\n", "2021-05-08 16:28:39,665 - INFO - joeynmt.training - Epoch 1, Step: 5500, Batch Loss: 3.219716, Tokens per Sec: 17708, Lr: 0.000300\n", "2021-05-08 16:28:53,901 - INFO - joeynmt.training - Epoch 1, Step: 5600, Batch Loss: 2.895310, Tokens per Sec: 17701, Lr: 0.000300\n", "2021-05-08 16:29:08,091 - INFO - joeynmt.training - Epoch 1, Step: 5700, Batch Loss: 3.315619, Tokens per Sec: 17786, Lr: 0.000300\n", "2021-05-08 16:29:22,452 - INFO - joeynmt.training - Epoch 1, Step: 5800, Batch Loss: 3.092244, Tokens per Sec: 17830, Lr: 0.000300\n", "2021-05-08 16:29:36,567 - INFO - joeynmt.training - Epoch 1, Step: 5900, Batch Loss: 2.927167, Tokens per Sec: 17591, Lr: 0.000300\n", "2021-05-08 16:29:50,784 - INFO - joeynmt.training - Epoch 1, Step: 6000, Batch Loss: 2.778678, Tokens per Sec: 17672, Lr: 0.000300\n", "2021-05-08 16:30:18,306 - 
WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:30:18,306 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:30:18,306 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:30:18,549 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:30:18,549 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - \tHypothesis: He did not try to the work .\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - \tHypothesis: How did Jehovah give people people and why ?\n", "2021-05-08 16:30:18,934 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:30:18,935 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:30:18,935 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:30:18,935 - INFO - joeynmt.training - \tHypothesis: All of John and three - old - old - old - founded , the false schools , and the potential of the background was made .\n", "2021-05-08 16:30:18,935 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:30:18,935 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:30:18,935 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:30:18,935 - INFO - joeynmt.training - \tHypothesis: Prich Faith\n", "2021-05-08 16:30:18,935 - INFO - joeynmt.training - Validation result (greedy) at epoch 1, step 6000: bleu: 13.05, loss: 77637.4844, ppl: 14.4105, duration: 28.1511s\n", "2021-05-08 16:30:33,207 - INFO - joeynmt.training - Epoch 1, Step: 6100, Batch Loss: 3.278849, Tokens per Sec: 17521, Lr: 0.000300\n", "2021-05-08 16:30:47,485 - INFO - joeynmt.training - Epoch 1, Step: 6200, Batch Loss: 2.822727, Tokens per Sec: 17445, Lr: 0.000300\n", "2021-05-08 16:31:01,706 - INFO - joeynmt.training - Epoch 1, Step: 6300, Batch Loss: 2.625848, Tokens per Sec: 17725, Lr: 0.000300\n", "2021-05-08 16:31:16,005 - INFO - joeynmt.training - Epoch 1, Step: 6400, Batch Loss: 3.061276, Tokens per Sec: 17691, Lr: 0.000300\n", "2021-05-08 16:31:30,088 - INFO - joeynmt.training - Epoch 1, Step: 6500, Batch Loss: 3.010986, Tokens per Sec: 17683, Lr: 0.000300\n", "2021-05-08 16:31:44,407 - INFO - joeynmt.training - Epoch 1, Step: 6600, Batch Loss: 2.886325, Tokens per Sec: 17678, Lr: 0.000300\n", "2021-05-08 16:31:58,748 - INFO - joeynmt.training - Epoch 1, Step: 6700, Batch Loss: 2.420646, Tokens per Sec: 17763, Lr: 0.000300\n", "2021-05-08 16:32:13,093 - 
INFO - joeynmt.training - Epoch 1, Step: 6800, Batch Loss: 2.739479, Tokens per Sec: 17852, Lr: 0.000300\n", "2021-05-08 16:32:27,397 - INFO - joeynmt.training - Epoch 1, Step: 6900, Batch Loss: 2.776512, Tokens per Sec: 17750, Lr: 0.000300\n", "2021-05-08 16:32:41,726 - INFO - joeynmt.training - Epoch 1, Step: 7000, Batch Loss: 2.855759, Tokens per Sec: 17655, Lr: 0.000300\n", "2021-05-08 16:33:10,268 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:33:10,268 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:33:10,268 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:33:10,520 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:33:10,521 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:33:10,925 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:33:10,925 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:33:10,925 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:33:10,925 - INFO - joeynmt.training - \tHypothesis: He did not try to try to the work .\n", "2021-05-08 16:33:10,925 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:33:10,925 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:33:10,925 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - \tHypothesis: How does Jehovah not love people and why ?\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - \tHypothesis: All the Gospel John is three wealth and sleep , indicating false , and the baby was given to be sure that the truth is not a sound .\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - \tHypothesis: Happy to Happiness\n", "2021-05-08 16:33:10,926 - INFO - joeynmt.training - Validation result (greedy) at epoch 1, step 7000: bleu: 14.97, loss: 73211.5938, ppl: 12.3773, duration: 29.1996s\n", "2021-05-08 16:33:25,213 - INFO - joeynmt.training - Epoch 1, Step: 7100, Batch Loss: 2.917281, Tokens per Sec: 17815, Lr: 0.000300\n", "2021-05-08 16:33:39,626 - INFO - joeynmt.training - Epoch 1, Step: 7200, Batch Loss: 3.246493, Tokens per Sec: 17775, Lr: 0.000300\n", "2021-05-08 16:33:54,012 - INFO - joeynmt.training - Epoch 1, Step: 7300, Batch Loss: 2.625930, Tokens per Sec: 17404, Lr: 0.000300\n", "2021-05-08 16:34:08,343 - INFO - joeynmt.training - Epoch 1, Step: 7400, Batch Loss: 2.760853, Tokens per Sec: 17728, Lr: 0.000300\n", 
"2021-05-08 16:34:22,657 - INFO - joeynmt.training - Epoch 1, Step: 7500, Batch Loss: 2.537715, Tokens per Sec: 17579, Lr: 0.000300\n", "2021-05-08 16:34:37,019 - INFO - joeynmt.training - Epoch 1, Step: 7600, Batch Loss: 2.529770, Tokens per Sec: 17682, Lr: 0.000300\n", "2021-05-08 16:34:51,391 - INFO - joeynmt.training - Epoch 1, Step: 7700, Batch Loss: 2.468800, Tokens per Sec: 17735, Lr: 0.000300\n", "2021-05-08 16:35:04,914 - INFO - joeynmt.training - Epoch 1: total training loss 27415.95\n", "2021-05-08 16:35:04,914 - INFO - joeynmt.training - EPOCH 2\n", "2021-05-08 16:35:06,774 - INFO - joeynmt.training - Epoch 2, Step: 7800, Batch Loss: 2.246435, Tokens per Sec: 7369, Lr: 0.000300\n", "2021-05-08 16:35:20,953 - INFO - joeynmt.training - Epoch 2, Step: 7900, Batch Loss: 2.687926, Tokens per Sec: 17711, Lr: 0.000300\n", "2021-05-08 16:35:35,312 - INFO - joeynmt.training - Epoch 2, Step: 8000, Batch Loss: 2.419375, Tokens per Sec: 17764, Lr: 0.000300\n", "2021-05-08 16:36:01,834 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:36:01,835 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:36:01,835 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:36:02,079 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:36:02,079 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:36:02,583 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:36:02,583 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:36:02,583 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:36:02,583 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 16:36:02,583 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:36:02,583 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:36:02,583 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - \tHypothesis: How did Jehovah not want people to do and why ?\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - \tHypothesis: All the Gospel John is a three - figurative events that was disappointed by the false , and the threatens was given to be a sign of the truth .\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - \tHypothesis: Discuing to Be Happy\n", "2021-05-08 16:36:02,584 - INFO - joeynmt.training - Validation result (greedy) at epoch 2, step 8000: bleu: 
17.02, loss: 70076.2734, ppl: 11.1130, duration: 27.2721s\n", "2021-05-08 16:36:16,886 - INFO - joeynmt.training - Epoch 2, Step: 8100, Batch Loss: 2.936790, Tokens per Sec: 17956, Lr: 0.000300\n", "2021-05-08 16:36:31,065 - INFO - joeynmt.training - Epoch 2, Step: 8200, Batch Loss: 2.986363, Tokens per Sec: 17589, Lr: 0.000300\n", "2021-05-08 16:36:45,203 - INFO - joeynmt.training - Epoch 2, Step: 8300, Batch Loss: 2.691797, Tokens per Sec: 17359, Lr: 0.000300\n", "2021-05-08 16:36:59,506 - INFO - joeynmt.training - Epoch 2, Step: 8400, Batch Loss: 2.588052, Tokens per Sec: 17605, Lr: 0.000300\n", "2021-05-08 16:37:13,621 - INFO - joeynmt.training - Epoch 2, Step: 8500, Batch Loss: 2.778736, Tokens per Sec: 17605, Lr: 0.000300\n", "2021-05-08 16:37:28,078 - INFO - joeynmt.training - Epoch 2, Step: 8600, Batch Loss: 2.782846, Tokens per Sec: 17972, Lr: 0.000300\n", "2021-05-08 16:37:42,438 - INFO - joeynmt.training - Epoch 2, Step: 8700, Batch Loss: 2.583629, Tokens per Sec: 17890, Lr: 0.000300\n", "2021-05-08 16:37:56,626 - INFO - joeynmt.training - Epoch 2, Step: 8800, Batch Loss: 2.672633, Tokens per Sec: 17841, Lr: 0.000300\n", "2021-05-08 16:38:10,951 - INFO - joeynmt.training - Epoch 2, Step: 8900, Batch Loss: 2.674888, Tokens per Sec: 17615, Lr: 0.000300\n", "2021-05-08 16:38:25,248 - INFO - joeynmt.training - Epoch 2, Step: 9000, Batch Loss: 2.609185, Tokens per Sec: 17629, Lr: 0.000300\n", "2021-05-08 16:38:57,723 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:38:57,723 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:38:57,724 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:38:57,967 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 16:38:57,967 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:38:58,364 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:38:58,364 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:38:58,364 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:38:58,364 - INFO - joeynmt.training - \tHypothesis: He did not try to control the work .\n", "2021-05-08 16:38:58,364 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:38:58,364 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - \tHypothesis: How did Jehovah mean to do people and why ?\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - \tHypothesis: The Gospel John is three - old Gospel who was discouraged , indicating false , and the same is given to be given to the right right .\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - \tHypothesis: Do Not Have Not Have\n", "2021-05-08 16:38:58,365 - INFO - joeynmt.training - Validation result (greedy) at epoch 2, step 9000: bleu: 18.29, loss: 67605.0703, ppl: 10.2083, duration: 33.1169s\n", "2021-05-08 16:39:12,696 - INFO - joeynmt.training - Epoch 2, Step: 9100, Batch Loss: 2.716684, Tokens per Sec: 17768, Lr: 0.000300\n", "2021-05-08 16:39:26,758 - INFO - joeynmt.training - Epoch 2, Step: 9200, Batch Loss: 2.609791, Tokens per Sec: 17476, Lr: 0.000300\n", "2021-05-08 16:39:40,930 - INFO - joeynmt.training - Epoch 2, Step: 9300, Batch Loss: 2.594735, Tokens per Sec: 17565, Lr: 0.000300\n", "2021-05-08 16:39:55,206 - INFO - joeynmt.training - Epoch 2, Step: 9400, Batch Loss: 2.511772, Tokens per Sec: 17568, Lr: 0.000300\n", "2021-05-08 16:40:09,604 - INFO - joeynmt.training - Epoch 2, Step: 9500, Batch Loss: 2.534153, Tokens per Sec: 17840, Lr: 0.000300\n", "2021-05-08 16:40:24,003 - INFO - joeynmt.training - Epoch 2, Step: 9600, Batch Loss: 2.160509, Tokens per Sec: 17958, Lr: 0.000300\n", "2021-05-08 16:40:38,306 - INFO - joeynmt.training - Epoch 2, Step: 9700, Batch Loss: 2.339026, Tokens per Sec: 17670, Lr: 0.000300\n", "2021-05-08 16:40:52,702 - INFO - joeynmt.training - Epoch 2, Step: 9800, Batch Loss: 2.614063, Tokens per Sec: 17749, Lr: 0.000300\n", "2021-05-08 16:41:07,057 - INFO - joeynmt.training - Epoch 2, Step: 9900, Batch Loss: 2.326984, Tokens per Sec: 17873, Lr: 0.000300\n", "2021-05-08 16:41:21,279 - INFO - joeynmt.training - Epoch 2, Step: 10000, Batch Loss: 2.353409, Tokens per Sec: 17651, Lr: 0.000300\n", "2021-05-08 
16:41:45,379 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:41:45,379 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:41:45,379 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:41:45,626 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:41:45,627 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:41:46,099 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - \tHypothesis: How did Jehovah not react people and why ?\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:41:46,100 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:41:46,101 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:41:46,101 - INFO - joeynmt.training - \tHypothesis: The whole Gospel John is a third of three silver of the fallen , indicates false , and the same competent was given to be sure .\n", "2021-05-08 16:41:46,101 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:41:46,101 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:41:46,101 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:41:46,101 - INFO - joeynmt.training - \tHypothesis: Do Not Have Forever\n", "2021-05-08 16:41:46,101 - INFO - joeynmt.training - Validation result (greedy) at epoch 2, step 10000: bleu: 20.08, loss: 65946.9609, ppl: 9.6429, duration: 24.8220s\n", "2021-05-08 16:42:00,443 - INFO - joeynmt.training - Epoch 2, Step: 10100, Batch Loss: 2.410008, Tokens per Sec: 17648, Lr: 0.000300\n", "2021-05-08 16:42:14,892 - INFO - joeynmt.training - Epoch 2, Step: 10200, Batch Loss: 2.760280, Tokens per Sec: 17871, Lr: 0.000300\n", "2021-05-08 16:42:29,051 - INFO - joeynmt.training - Epoch 2, Step: 10300, Batch Loss: 2.185681, Tokens per Sec: 17541, Lr: 0.000300\n", "2021-05-08 16:42:43,364 - INFO - joeynmt.training - Epoch 2, Step: 10400, Batch Loss: 2.510891, Tokens per Sec: 17798, Lr: 0.000300\n", "2021-05-08 16:42:57,617 - INFO - joeynmt.training - Epoch 2, Step: 10500, Batch Loss: 2.942348, Tokens per Sec: 17498, Lr: 0.000300\n", "2021-05-08 16:43:11,721 - INFO - joeynmt.training - Epoch 2, Step: 10600, Batch Loss: 2.328924, Tokens per Sec: 17833, Lr: 0.000300\n", "2021-05-08 16:43:25,992 - INFO - joeynmt.training - Epoch 2, Step: 10700, Batch Loss: 2.586262, Tokens per Sec: 17770, Lr: 
0.000300\n", "2021-05-08 16:43:40,236 - INFO - joeynmt.training - Epoch 2, Step: 10800, Batch Loss: 2.650257, Tokens per Sec: 17646, Lr: 0.000300\n", "2021-05-08 16:43:54,594 - INFO - joeynmt.training - Epoch 2, Step: 10900, Batch Loss: 2.210833, Tokens per Sec: 17959, Lr: 0.000300\n", "2021-05-08 16:44:08,808 - INFO - joeynmt.training - Epoch 2, Step: 11000, Batch Loss: 2.244355, Tokens per Sec: 17882, Lr: 0.000300\n", "2021-05-08 16:44:37,399 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:44:37,399 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:44:37,399 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:44:37,644 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:44:37,644 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:44:38,065 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:44:38,065 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:44:38,065 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:44:38,065 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 16:44:38,065 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tHypothesis: How did Jehovah not want to do people and why ?\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tHypothesis: The Gospel of John is in the three of the figurative sleep , reflecting false , and a few of the right was given to the right right to be sure .\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:44:38,066 - INFO - joeynmt.training - \tHypothesis: Do Not Live to Be Forever\n", "2021-05-08 16:44:38,067 - INFO - joeynmt.training - Validation result (greedy) at epoch 2, step 11000: bleu: 21.09, loss: 63334.9336, ppl: 8.8150, duration: 29.2587s\n", "2021-05-08 16:44:52,409 - INFO - joeynmt.training - Epoch 2, Step: 11100, Batch Loss: 2.033438, Tokens per Sec: 17891, Lr: 0.000300\n", "2021-05-08 16:45:06,748 - INFO - joeynmt.training - Epoch 2, Step: 11200, Batch Loss: 2.629793, Tokens per Sec: 17892, Lr: 0.000300\n", "2021-05-08 16:45:20,972 - INFO - joeynmt.training - Epoch 2, Step: 11300, Batch Loss: 2.309156, Tokens per Sec: 17641, Lr: 0.000300\n", "2021-05-08 16:45:35,388 - INFO - joeynmt.training - Epoch 2, Step: 
11400, Batch Loss: 2.343301, Tokens per Sec: 17733, Lr: 0.000300\n", "2021-05-08 16:45:49,628 - INFO - joeynmt.training - Epoch 2, Step: 11500, Batch Loss: 2.428186, Tokens per Sec: 17490, Lr: 0.000300\n", "2021-05-08 16:46:03,820 - INFO - joeynmt.training - Epoch 2, Step: 11600, Batch Loss: 2.448365, Tokens per Sec: 17561, Lr: 0.000300\n", "2021-05-08 16:46:18,114 - INFO - joeynmt.training - Epoch 2, Step: 11700, Batch Loss: 2.383283, Tokens per Sec: 18027, Lr: 0.000300\n", "2021-05-08 16:46:32,580 - INFO - joeynmt.training - Epoch 2, Step: 11800, Batch Loss: 2.285078, Tokens per Sec: 17850, Lr: 0.000300\n", "2021-05-08 16:46:47,025 - INFO - joeynmt.training - Epoch 2, Step: 11900, Batch Loss: 2.469459, Tokens per Sec: 17740, Lr: 0.000300\n", "2021-05-08 16:47:01,440 - INFO - joeynmt.training - Epoch 2, Step: 12000, Batch Loss: 2.249008, Tokens per Sec: 17897, Lr: 0.000300\n", "2021-05-08 16:47:27,450 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:47:27,451 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:47:27,451 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:47:27,697 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:47:27,697 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:47:28,126 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:47:28,126 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:47:28,126 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:47:28,126 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - \tHypothesis: How does Jehovah not reach people and why ?\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - \tHypothesis: The Gospel of John is in the three times of the fearing of the step that was released by the false , and a few few of the same disaster was given to be sure .\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:47:28,127 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:47:28,128 - INFO - joeynmt.training - \tHypothesis: Discessing to Be Happy\n", "2021-05-08 16:47:28,128 - INFO - joeynmt.training - Validation result (greedy) at epoch 2, step 12000: bleu: 22.31, loss: 61446.6758, ppl: 8.2612, duration: 26.6871s\n", 
"2021-05-08 16:47:42,331 - INFO - joeynmt.training - Epoch 2, Step: 12100, Batch Loss: 2.212579, Tokens per Sec: 17597, Lr: 0.000300\n", "2021-05-08 16:47:56,665 - INFO - joeynmt.training - Epoch 2, Step: 12200, Batch Loss: 2.193660, Tokens per Sec: 17958, Lr: 0.000300\n", "2021-05-08 16:48:10,990 - INFO - joeynmt.training - Epoch 2, Step: 12300, Batch Loss: 2.298955, Tokens per Sec: 17657, Lr: 0.000300\n", "2021-05-08 16:48:25,068 - INFO - joeynmt.training - Epoch 2, Step: 12400, Batch Loss: 2.220313, Tokens per Sec: 17455, Lr: 0.000300\n", "2021-05-08 16:48:39,396 - INFO - joeynmt.training - Epoch 2, Step: 12500, Batch Loss: 2.097169, Tokens per Sec: 17460, Lr: 0.000300\n", "2021-05-08 16:48:53,534 - INFO - joeynmt.training - Epoch 2, Step: 12600, Batch Loss: 2.443189, Tokens per Sec: 17674, Lr: 0.000300\n", "2021-05-08 16:49:07,854 - INFO - joeynmt.training - Epoch 2, Step: 12700, Batch Loss: 2.590242, Tokens per Sec: 17576, Lr: 0.000300\n", "2021-05-08 16:49:22,265 - INFO - joeynmt.training - Epoch 2, Step: 12800, Batch Loss: 2.402148, Tokens per Sec: 17805, Lr: 0.000300\n", "2021-05-08 16:49:36,741 - INFO - joeynmt.training - Epoch 2, Step: 12900, Batch Loss: 2.348701, Tokens per Sec: 17902, Lr: 0.000300\n", "2021-05-08 16:49:50,907 - INFO - joeynmt.training - Epoch 2, Step: 13000, Batch Loss: 2.501314, Tokens per Sec: 17697, Lr: 0.000300\n", "2021-05-08 16:50:19,173 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:50:19,173 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:50:19,173 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:50:19,420 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 16:50:19,421 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:50:19,883 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:50:19,883 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:50:19,883 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:50:19,883 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 16:50:19,883 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tHypothesis: How did Jehovah not want to do with people and why ?\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tHypothesis: The Gospel of John is a third of three sleep , showing false , and a few of the same sound given to the right to be sure .\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:50:19,884 - INFO - joeynmt.training - \tHypothesis: Do Not Give to Be Comfort\n", "2021-05-08 16:50:19,885 - INFO - joeynmt.training - Validation result (greedy) at epoch 2, step 13000: bleu: 22.61, loss: 60129.8672, ppl: 7.8957, duration: 28.9776s\n", "2021-05-08 16:50:34,088 - INFO - joeynmt.training - Epoch 2, Step: 13100, Batch Loss: 2.149188, Tokens per Sec: 17659, Lr: 0.000300\n", "2021-05-08 16:50:48,334 - INFO - joeynmt.training - Epoch 2, Step: 13200, Batch Loss: 2.246917, Tokens per Sec: 17784, Lr: 0.000300\n", "2021-05-08 16:51:02,486 - INFO - joeynmt.training - Epoch 2, Step: 13300, Batch Loss: 2.178790, Tokens per Sec: 17860, Lr: 0.000300\n", "2021-05-08 16:51:16,797 - INFO - joeynmt.training - Epoch 2, Step: 13400, Batch Loss: 1.861993, Tokens per Sec: 17524, Lr: 0.000300\n", "2021-05-08 16:51:30,979 - INFO - joeynmt.training - Epoch 2, Step: 13500, Batch Loss: 2.171669, Tokens per Sec: 17502, Lr: 0.000300\n", "2021-05-08 16:51:45,292 - INFO - joeynmt.training - Epoch 2, Step: 13600, Batch Loss: 2.341332, Tokens per Sec: 17775, Lr: 0.000300\n", "2021-05-08 16:51:59,718 - INFO - joeynmt.training - Epoch 2, Step: 13700, Batch Loss: 1.950542, Tokens per Sec: 17589, Lr: 0.000300\n", "2021-05-08 16:52:14,150 - INFO - joeynmt.training - Epoch 2, Step: 13800, Batch Loss: 2.175300, Tokens per Sec: 17516, Lr: 0.000300\n", "2021-05-08 16:52:28,430 - INFO - joeynmt.training - Epoch 2, Step: 13900, Batch Loss: 2.691314, Tokens per Sec: 17718, Lr: 0.000300\n", "2021-05-08 16:52:42,711 - INFO - joeynmt.training - Epoch 2, Step: 14000, Batch Loss: 1.923527, Tokens per Sec: 17543, Lr: 0.000300\n", 
"2021-05-08 16:53:08,656 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:53:08,656 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:53:08,656 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:53:08,901 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:53:08,902 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:53:09,317 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:53:09,317 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:53:09,317 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:53:09,317 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 16:53:09,317 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:53:09,317 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:53:09,317 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - \tHypothesis: How did Jehovah not react with people and why ?\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - \tHypothesis: The Gospel of John is the third of three hours were sent by the stream , demonstrating false , and a few few were given to the right of truth .\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - \tHypothesis: Do Not Love to Be Happy\n", "2021-05-08 16:53:09,318 - INFO - joeynmt.training - Validation result (greedy) at epoch 2, step 14000: bleu: 24.27, loss: 58738.9102, ppl: 7.5272, duration: 26.6067s\n", "2021-05-08 16:53:23,741 - INFO - joeynmt.training - Epoch 2, Step: 14100, Batch Loss: 2.254124, Tokens per Sec: 17686, Lr: 0.000300\n", "2021-05-08 16:53:38,013 - INFO - joeynmt.training - Epoch 2, Step: 14200, Batch Loss: 2.326190, Tokens per Sec: 17583, Lr: 0.000300\n", "2021-05-08 16:53:52,262 - INFO - joeynmt.training - Epoch 2, Step: 14300, Batch Loss: 2.239329, Tokens per Sec: 17502, Lr: 0.000300\n", "2021-05-08 16:54:06,761 - INFO - joeynmt.training - Epoch 2, Step: 14400, Batch Loss: 2.277990, Tokens per Sec: 17654, Lr: 0.000300\n", "2021-05-08 16:54:21,176 - INFO - joeynmt.training - Epoch 2, Step: 14500, Batch Loss: 2.233241, Tokens per Sec: 17749, Lr: 0.000300\n", "2021-05-08 16:54:35,615 - INFO - joeynmt.training - Epoch 2, Step: 14600, Batch Loss: 2.212970, Tokens per Sec: 17847, Lr: 0.000300\n", "2021-05-08 16:54:49,932 - INFO - joeynmt.training - Epoch 2, Step: 14700, Batch Loss: 
2.111558, Tokens per Sec: 17485, Lr: 0.000300\n", "2021-05-08 16:55:04,261 - INFO - joeynmt.training - Epoch 2, Step: 14800, Batch Loss: 2.341433, Tokens per Sec: 17704, Lr: 0.000300\n", "2021-05-08 16:55:18,528 - INFO - joeynmt.training - Epoch 2, Step: 14900, Batch Loss: 2.282255, Tokens per Sec: 17687, Lr: 0.000300\n", "2021-05-08 16:55:32,801 - INFO - joeynmt.training - Epoch 2, Step: 15000, Batch Loss: 2.221808, Tokens per Sec: 17526, Lr: 0.000300\n", "2021-05-08 16:55:58,039 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:55:58,039 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:55:58,039 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:55:58,284 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:55:58,284 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:55:58,682 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:55:58,682 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:55:58,682 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:55:58,682 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 16:55:58,682 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - \tHypothesis: How did Jehovah reach people and why ?\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - \tHypothesis: The Gospel of John is outside three silver were sent by the stream , showing false , and few few few seven sounds were given to the right of the right to be sure .\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:55:58,683 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:55:58,684 - INFO - joeynmt.training - \tHypothesis: Do Not Learn to Be Comperfect\n", "2021-05-08 16:55:58,684 - INFO - joeynmt.training - Validation result (greedy) at epoch 2, step 15000: bleu: 24.77, loss: 57507.4336, ppl: 7.2153, duration: 25.8823s\n", "2021-05-08 16:56:13,197 - INFO - joeynmt.training - Epoch 2, Step: 15100, Batch Loss: 2.416464, Tokens per Sec: 17902, Lr: 0.000300\n", "2021-05-08 16:56:27,265 - INFO - joeynmt.training - Epoch 2, Step: 15200, Batch Loss: 2.041500, Tokens per Sec: 17411, Lr: 0.000300\n", "2021-05-08 16:56:41,621 - INFO - joeynmt.training - Epoch 2, Step: 15300, Batch Loss: 2.818659, Tokens per Sec: 17679, Lr: 0.000300\n", "2021-05-08 
16:56:55,815 - INFO - joeynmt.training - Epoch 2, Step: 15400, Batch Loss: 2.150504, Tokens per Sec: 17497, Lr: 0.000300\n", "2021-05-08 16:57:10,049 - INFO - joeynmt.training - Epoch 2, Step: 15500, Batch Loss: 2.423874, Tokens per Sec: 17591, Lr: 0.000300\n", "2021-05-08 16:57:21,521 - INFO - joeynmt.training - Epoch 2: total training loss 18911.65\n", "2021-05-08 16:57:21,521 - INFO - joeynmt.training - EPOCH 3\n", "2021-05-08 16:57:25,380 - INFO - joeynmt.training - Epoch 3, Step: 15600, Batch Loss: 2.520654, Tokens per Sec: 12647, Lr: 0.000300\n", "2021-05-08 16:57:39,607 - INFO - joeynmt.training - Epoch 3, Step: 15700, Batch Loss: 2.102774, Tokens per Sec: 17686, Lr: 0.000300\n", "2021-05-08 16:57:53,897 - INFO - joeynmt.training - Epoch 3, Step: 15800, Batch Loss: 2.435557, Tokens per Sec: 17641, Lr: 0.000300\n", "2021-05-08 16:58:08,284 - INFO - joeynmt.training - Epoch 3, Step: 15900, Batch Loss: 2.256947, Tokens per Sec: 17732, Lr: 0.000300\n", "2021-05-08 16:58:22,582 - INFO - joeynmt.training - Epoch 3, Step: 16000, Batch Loss: 1.941338, Tokens per Sec: 17457, Lr: 0.000300\n", "2021-05-08 16:58:47,698 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 16:58:47,698 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 16:58:47,698 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 16:58:47,947 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 16:58:47,948 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - Example #0\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - Example #1\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rejoice with people and why ?\n", "2021-05-08 16:58:48,373 - INFO - joeynmt.training - Example #2\n", "2021-05-08 16:58:48,374 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 16:58:48,374 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 16:58:48,374 - INFO - joeynmt.training - \tHypothesis: The Gospel of John outside three barries were moved by the stream , displaying false , and few few have been given to the right of unsure .\n", "2021-05-08 16:58:48,374 - INFO - joeynmt.training - Example #3\n", "2021-05-08 16:58:48,374 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 16:58:48,374 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 16:58:48,374 - INFO - joeynmt.training - \tHypothesis: Do Not 
Have a Spirit\n", "2021-05-08 16:58:48,374 - INFO - joeynmt.training - Validation result (greedy) at epoch 3, step 16000: bleu: 25.17, loss: 56698.9102, ppl: 7.0176, duration: 25.7915s\n", "2021-05-08 16:59:02,624 - INFO - joeynmt.training - Epoch 3, Step: 16100, Batch Loss: 2.124176, Tokens per Sec: 17623, Lr: 0.000300\n", "2021-05-08 16:59:16,864 - INFO - joeynmt.training - Epoch 3, Step: 16200, Batch Loss: 2.098275, Tokens per Sec: 17799, Lr: 0.000300\n", "2021-05-08 16:59:31,354 - INFO - joeynmt.training - Epoch 3, Step: 16300, Batch Loss: 2.390093, Tokens per Sec: 17702, Lr: 0.000300\n", "2021-05-08 16:59:45,831 - INFO - joeynmt.training - Epoch 3, Step: 16400, Batch Loss: 2.225934, Tokens per Sec: 17892, Lr: 0.000300\n", "2021-05-08 17:00:00,023 - INFO - joeynmt.training - Epoch 3, Step: 16500, Batch Loss: 1.868576, Tokens per Sec: 17543, Lr: 0.000300\n", "2021-05-08 17:00:14,349 - INFO - joeynmt.training - Epoch 3, Step: 16600, Batch Loss: 2.026117, Tokens per Sec: 17687, Lr: 0.000300\n", "2021-05-08 17:00:28,827 - INFO - joeynmt.training - Epoch 3, Step: 16700, Batch Loss: 2.331259, Tokens per Sec: 17746, Lr: 0.000300\n", "2021-05-08 17:00:43,218 - INFO - joeynmt.training - Epoch 3, Step: 16800, Batch Loss: 2.318969, Tokens per Sec: 17626, Lr: 0.000300\n", "2021-05-08 17:00:57,535 - INFO - joeynmt.training - Epoch 3, Step: 16900, Batch Loss: 2.661973, Tokens per Sec: 17504, Lr: 0.000300\n", "2021-05-08 17:01:11,773 - INFO - joeynmt.training - Epoch 3, Step: 17000, Batch Loss: 2.432710, Tokens per Sec: 17890, Lr: 0.000300\n", "2021-05-08 17:01:36,931 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:01:36,931 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:01:36,932 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:01:37,176 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 17:01:37,177 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:01:37,609 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:01:37,609 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:01:37,609 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:01:37,609 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:01:37,609 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rejoice people and why ?\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tHypothesis: The Gospel of John outside three silver was sent by the stream , displaying false , and few few have been given to the right of being sure .\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:01:37,610 - INFO - joeynmt.training - \tHypothesis: Do Not Give to Be perfect\n", "2021-05-08 17:01:37,611 - INFO - joeynmt.training - Validation result (greedy) at epoch 3, step 17000: bleu: 26.10, loss: 55670.3047, ppl: 6.7738, duration: 25.8370s\n", "2021-05-08 17:01:52,161 - INFO - joeynmt.training - Epoch 3, Step: 17100, Batch Loss: 2.353511, Tokens per Sec: 17772, Lr: 0.000300\n", "2021-05-08 17:02:06,661 - INFO - joeynmt.training - Epoch 3, Step: 17200, Batch Loss: 2.626467, Tokens per Sec: 17828, Lr: 0.000300\n", "2021-05-08 17:02:20,866 - INFO - joeynmt.training - Epoch 3, Step: 17300, Batch Loss: 1.961693, Tokens per Sec: 17620, Lr: 0.000300\n", "2021-05-08 17:02:35,336 - INFO - joeynmt.training - Epoch 3, Step: 17400, Batch Loss: 2.265930, Tokens per Sec: 17552, Lr: 0.000300\n", "2021-05-08 17:02:49,738 - INFO - joeynmt.training - Epoch 3, Step: 17500, Batch Loss: 2.262965, Tokens per Sec: 17405, Lr: 0.000300\n", "2021-05-08 17:03:03,910 - INFO - joeynmt.training - Epoch 3, Step: 17600, Batch Loss: 2.323813, Tokens per Sec: 17503, Lr: 0.000300\n", "2021-05-08 17:03:18,270 - INFO - joeynmt.training - Epoch 3, Step: 17700, Batch Loss: 2.068549, Tokens per Sec: 17616, Lr: 0.000300\n", "2021-05-08 17:03:32,541 - INFO - joeynmt.training - Epoch 3, Step: 17800, Batch Loss: 2.333526, Tokens per Sec: 17634, Lr: 0.000300\n", "2021-05-08 17:03:46,899 - INFO - joeynmt.training - Epoch 3, Step: 17900, Batch Loss: 2.206916, Tokens per Sec: 17881, Lr: 0.000300\n", "2021-05-08 17:04:01,087 - INFO - joeynmt.training - Epoch 3, Step: 18000, Batch Loss: 2.059303, Tokens per Sec: 17606, Lr: 0.000300\n", 
"2021-05-08 17:04:26,429 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:04:26,430 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:04:26,430 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:04:26,675 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:04:26,675 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:04:27,084 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:04:27,084 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rejoice people and why ?\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - \tHypothesis: All the Gospel of John outside three silver was offered by the steadfast , reflecting false , and few few silver were given to the right of uncertainty .\n", "2021-05-08 17:04:27,085 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:04:27,086 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:04:27,086 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:04:27,086 - INFO - joeynmt.training - \tHypothesis: Do Not Love to Be perfect\n", "2021-05-08 17:04:27,086 - INFO - joeynmt.training - Validation result (greedy) at epoch 3, step 18000: bleu: 27.06, loss: 54687.6680, ppl: 6.5489, duration: 25.9987s\n", "2021-05-08 17:04:41,366 - INFO - joeynmt.training - Epoch 3, Step: 18100, Batch Loss: 2.013809, Tokens per Sec: 17614, Lr: 0.000300\n", "2021-05-08 17:04:55,766 - INFO - joeynmt.training - Epoch 3, Step: 18200, Batch Loss: 2.091684, Tokens per Sec: 17834, Lr: 0.000300\n", "2021-05-08 17:05:10,065 - INFO - joeynmt.training - Epoch 3, Step: 18300, Batch Loss: 2.207168, Tokens per Sec: 17679, Lr: 0.000300\n", "2021-05-08 17:05:24,305 - INFO - joeynmt.training - Epoch 3, Step: 18400, Batch Loss: 2.002637, Tokens per Sec: 17744, Lr: 0.000300\n", "2021-05-08 17:05:38,624 - INFO - joeynmt.training - Epoch 3, Step: 18500, Batch Loss: 2.102579, Tokens per Sec: 17778, Lr: 0.000300\n", "2021-05-08 17:05:52,988 - INFO - joeynmt.training - Epoch 3, Step: 18600, Batch Loss: 2.309765, Tokens per Sec: 17895, Lr: 0.000300\n", "2021-05-08 17:06:07,408 - INFO - joeynmt.training - Epoch 3, Step: 18700, Batch 
Loss: 2.170875, Tokens per Sec: 17889, Lr: 0.000300\n", "2021-05-08 17:06:21,585 - INFO - joeynmt.training - Epoch 3, Step: 18800, Batch Loss: 2.104536, Tokens per Sec: 17744, Lr: 0.000300\n", "2021-05-08 17:06:35,906 - INFO - joeynmt.training - Epoch 3, Step: 18900, Batch Loss: 2.070079, Tokens per Sec: 17688, Lr: 0.000300\n", "2021-05-08 17:06:50,171 - INFO - joeynmt.training - Epoch 3, Step: 19000, Batch Loss: 2.082397, Tokens per Sec: 17565, Lr: 0.000300\n", "2021-05-08 17:07:11,321 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:07:11,321 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:07:11,321 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:07:11,563 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:07:11,563 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:07:11,972 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rely resist people and why ?\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - \tHypothesis: The Gospel of John outside three sleep was given by the stream , displaying false , and a few of the liberated mate was given to the right .\n", "2021-05-08 17:07:11,973 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:07:11,974 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:07:11,974 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:07:11,974 - INFO - joeynmt.training - \tHypothesis: Do Not Learn to Be perfect\n", "2021-05-08 17:07:11,974 - INFO - joeynmt.training - Validation result (greedy) at epoch 3, step 19000: bleu: 27.07, loss: 54047.3242, ppl: 6.4064, duration: 21.8027s\n", "2021-05-08 17:07:26,129 - INFO - joeynmt.training - Epoch 3, Step: 19100, Batch Loss: 2.114681, Tokens per Sec: 17593, Lr: 0.000300\n", "2021-05-08 17:07:40,263 - INFO - joeynmt.training - Epoch 3, Step: 19200, Batch Loss: 2.372374, Tokens per Sec: 17662, Lr: 0.000300\n", "2021-05-08 17:07:54,524 - INFO - joeynmt.training - Epoch 3, Step: 19300, Batch Loss: 2.165650, Tokens per Sec: 17662, Lr: 0.000300\n", "2021-05-08 17:08:08,925 - INFO - 
joeynmt.training - Epoch 3, Step: 19400, Batch Loss: 2.239213, Tokens per Sec: 17994, Lr: 0.000300\n", "2021-05-08 17:08:23,098 - INFO - joeynmt.training - Epoch 3, Step: 19500, Batch Loss: 1.737861, Tokens per Sec: 17475, Lr: 0.000300\n", "2021-05-08 17:08:37,364 - INFO - joeynmt.training - Epoch 3, Step: 19600, Batch Loss: 2.106302, Tokens per Sec: 17874, Lr: 0.000300\n", "2021-05-08 17:08:51,577 - INFO - joeynmt.training - Epoch 3, Step: 19700, Batch Loss: 1.961150, Tokens per Sec: 17621, Lr: 0.000300\n", "2021-05-08 17:09:05,698 - INFO - joeynmt.training - Epoch 3, Step: 19800, Batch Loss: 1.877052, Tokens per Sec: 17693, Lr: 0.000300\n", "2021-05-08 17:09:19,807 - INFO - joeynmt.training - Epoch 3, Step: 19900, Batch Loss: 2.187069, Tokens per Sec: 17526, Lr: 0.000300\n", "2021-05-08 17:09:33,896 - INFO - joeynmt.training - Epoch 3, Step: 20000, Batch Loss: 2.085874, Tokens per Sec: 17812, Lr: 0.000300\n", "2021-05-08 17:09:59,528 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:09:59,528 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:09:59,528 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:09:59,772 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:09:59,773 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:10:00,208 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:10:00,208 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:10:00,208 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:10:00,208 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:10:00,208 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:10:00,208 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rely react with people ?\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - \tHypothesis: The Gospel of John outside three sleep was supposed by the lying of the lying , demonstrating false , and few few of the label was given to be sure .\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - \tHypothesis: Do Not Have a Spirit\n", "2021-05-08 17:10:00,209 - INFO - joeynmt.training - Validation result (greedy) at epoch 3, step 20000: bleu: 27.32, loss: 53140.5586, ppl: 6.2098, duration: 
26.3129s\n", "2021-05-08 17:10:14,412 - INFO - joeynmt.training - Epoch 3, Step: 20100, Batch Loss: 2.050168, Tokens per Sec: 17920, Lr: 0.000300\n", "2021-05-08 17:10:28,668 - INFO - joeynmt.training - Epoch 3, Step: 20200, Batch Loss: 2.100239, Tokens per Sec: 17806, Lr: 0.000300\n", "2021-05-08 17:10:43,001 - INFO - joeynmt.training - Epoch 3, Step: 20300, Batch Loss: 2.253396, Tokens per Sec: 17748, Lr: 0.000300\n", "2021-05-08 17:10:57,231 - INFO - joeynmt.training - Epoch 3, Step: 20400, Batch Loss: 2.047913, Tokens per Sec: 17592, Lr: 0.000300\n", "2021-05-08 17:11:11,447 - INFO - joeynmt.training - Epoch 3, Step: 20500, Batch Loss: 2.074579, Tokens per Sec: 17886, Lr: 0.000300\n", "2021-05-08 17:11:25,654 - INFO - joeynmt.training - Epoch 3, Step: 20600, Batch Loss: 2.040797, Tokens per Sec: 17824, Lr: 0.000300\n", "2021-05-08 17:11:40,056 - INFO - joeynmt.training - Epoch 3, Step: 20700, Batch Loss: 2.133159, Tokens per Sec: 17815, Lr: 0.000300\n", "2021-05-08 17:11:54,258 - INFO - joeynmt.training - Epoch 3, Step: 20800, Batch Loss: 2.446002, Tokens per Sec: 17981, Lr: 0.000300\n", "2021-05-08 17:12:08,608 - INFO - joeynmt.training - Epoch 3, Step: 20900, Batch Loss: 1.934557, Tokens per Sec: 17795, Lr: 0.000300\n", "2021-05-08 17:12:22,923 - INFO - joeynmt.training - Epoch 3, Step: 21000, Batch Loss: 1.976393, Tokens per Sec: 17926, Lr: 0.000300\n", "2021-05-08 17:12:47,462 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:12:47,462 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:12:47,462 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:12:47,707 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 17:12:47,708 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:12:48,094 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:12:48,094 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:12:48,094 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:12:48,094 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:12:48,094 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - \tHypothesis: How did Jehovah react with people and why ?\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - \tHypothesis: The whole Gospel of John outside three silver was laid by the step , reflecting false , and few of the storm was given to the right of unsure .\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:12:48,095 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:12:48,096 - INFO - joeynmt.training - \tHypothesis: Do Not Give to Be Have Having\n", "2021-05-08 17:12:48,096 - INFO - joeynmt.training - Validation result (greedy) at epoch 3, step 21000: bleu: 27.92, loss: 52137.6797, ppl: 5.9995, duration: 25.1723s\n", "2021-05-08 17:13:02,488 - INFO - joeynmt.training - Epoch 3, Step: 21100, Batch Loss: 1.950938, Tokens per Sec: 18006, Lr: 0.000300\n", "2021-05-08 17:13:16,819 - INFO - joeynmt.training - Epoch 3, Step: 21200, Batch Loss: 1.848081, Tokens per Sec: 17896, Lr: 0.000300\n", "2021-05-08 17:13:30,962 - INFO - joeynmt.training - Epoch 3, Step: 21300, Batch Loss: 1.924345, Tokens per Sec: 17734, Lr: 0.000300\n", "2021-05-08 17:13:45,130 - INFO - joeynmt.training - Epoch 3, Step: 21400, Batch Loss: 2.135876, Tokens per Sec: 17617, Lr: 0.000300\n", "2021-05-08 17:13:59,281 - INFO - joeynmt.training - Epoch 3, Step: 21500, Batch Loss: 2.527992, Tokens per Sec: 17524, Lr: 0.000300\n", "2021-05-08 17:14:13,468 - INFO - joeynmt.training - Epoch 3, Step: 21600, Batch Loss: 2.255540, Tokens per Sec: 17721, Lr: 0.000300\n", "2021-05-08 17:14:27,714 - INFO - joeynmt.training - Epoch 3, Step: 21700, Batch Loss: 1.879838, Tokens per Sec: 17657, Lr: 0.000300\n", "2021-05-08 17:14:41,958 - INFO - joeynmt.training - Epoch 3, Step: 21800, Batch Loss: 2.557914, Tokens per Sec: 17825, Lr: 0.000300\n", "2021-05-08 17:14:56,363 - INFO - joeynmt.training - Epoch 3, Step: 21900, Batch Loss: 1.994764, Tokens per Sec: 17999, Lr: 0.000300\n", "2021-05-08 17:15:10,757 - INFO - joeynmt.training - Epoch 3, Step: 22000, Batch Loss: 1.643145, Tokens per Sec: 17829, Lr: 
0.000300\n", "2021-05-08 17:15:36,058 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:15:36,058 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:15:36,058 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:15:36,312 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:15:36,312 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rely seek people and why ?\n", "2021-05-08 17:15:36,724 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:15:36,725 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:15:36,725 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:15:36,725 - INFO - joeynmt.training - \tHypothesis: All the Gospel of John outside three silver was offered by the step , displaying false , and a few - style was given to the foundation of uncertainty .\n", "2021-05-08 17:15:36,725 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:15:36,725 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:15:36,725 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:15:36,725 - INFO - joeynmt.training - \tHypothesis: Do Not Make a Have\n", "2021-05-08 17:15:36,725 - INFO - joeynmt.training - Validation result (greedy) at epoch 3, step 22000: bleu: 28.66, loss: 51759.2344, ppl: 5.9220, duration: 25.9678s\n", "2021-05-08 17:15:51,177 - INFO - joeynmt.training - Epoch 3, Step: 22100, Batch Loss: 2.131945, Tokens per Sec: 18025, Lr: 0.000300\n", "2021-05-08 17:16:05,388 - INFO - joeynmt.training - Epoch 3, Step: 22200, Batch Loss: 1.867278, Tokens per Sec: 17507, Lr: 0.000300\n", "2021-05-08 17:16:19,699 - INFO - joeynmt.training - Epoch 3, Step: 22300, Batch Loss: 2.033764, Tokens per Sec: 17805, Lr: 0.000300\n", "2021-05-08 17:16:33,870 - INFO - joeynmt.training - Epoch 3, Step: 22400, Batch Loss: 2.091750, Tokens per Sec: 17517, Lr: 0.000300\n", "2021-05-08 17:16:48,070 - INFO - joeynmt.training - Epoch 3, Step: 22500, Batch Loss: 2.565732, Tokens per Sec: 17548, Lr: 0.000300\n", "2021-05-08 17:17:02,193 - INFO - joeynmt.training - Epoch 3, Step: 22600, Batch Loss: 1.921049, Tokens per Sec: 17547, Lr: 0.000300\n", "2021-05-08 17:17:16,355 - INFO - joeynmt.training - Epoch 3, Step: 22700, 
Batch Loss: 2.096287, Tokens per Sec: 17437, Lr: 0.000300\n", "2021-05-08 17:17:30,654 - INFO - joeynmt.training - Epoch 3, Step: 22800, Batch Loss: 1.944853, Tokens per Sec: 17625, Lr: 0.000300\n", "2021-05-08 17:17:44,953 - INFO - joeynmt.training - Epoch 3, Step: 22900, Batch Loss: 1.755049, Tokens per Sec: 17612, Lr: 0.000300\n", "2021-05-08 17:17:59,274 - INFO - joeynmt.training - Epoch 3, Step: 23000, Batch Loss: 2.072597, Tokens per Sec: 17851, Lr: 0.000300\n", "2021-05-08 17:18:25,588 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:18:25,588 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:18:25,588 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:18:25,844 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:18:25,844 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:18:26,283 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rely on people and why ?\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - \tHypothesis: The whole Gospel of John outside the three silver was supposed by the stream , reflecting false , and few bittered malarrival was given to the right of uncertainty .\n", "2021-05-08 17:18:26,284 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:18:26,285 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:18:26,285 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:18:26,285 - INFO - joeynmt.training - \tHypothesis: Do Not Make a Human\n", "2021-05-08 17:18:26,285 - INFO - joeynmt.training - Validation result (greedy) at epoch 3, step 23000: bleu: 28.60, loss: 51126.4688, ppl: 5.7946, duration: 27.0107s\n", "2021-05-08 17:18:40,535 - INFO - joeynmt.training - Epoch 3, Step: 23100, Batch Loss: 2.021381, Tokens per Sec: 17560, Lr: 0.000300\n", "2021-05-08 17:18:54,803 - INFO - joeynmt.training - Epoch 3, Step: 23200, Batch Loss: 1.758883, Tokens per Sec: 17726, Lr: 0.000300\n", "2021-05-08 17:19:09,164 - INFO - joeynmt.training - Epoch 3, Step: 23300, Batch Loss: 1.967932, Tokens per Sec: 17607, Lr: 0.000300\n", "2021-05-08 
17:19:18,409 - INFO - joeynmt.training - Epoch 3: total training loss 16456.09\n", "2021-05-08 17:19:18,409 - INFO - joeynmt.training - EPOCH 4\n", "2021-05-08 17:19:24,304 - INFO - joeynmt.training - Epoch 4, Step: 23400, Batch Loss: 1.799245, Tokens per Sec: 14950, Lr: 0.000300\n", "2021-05-08 17:19:38,613 - INFO - joeynmt.training - Epoch 4, Step: 23500, Batch Loss: 1.973743, Tokens per Sec: 17866, Lr: 0.000300\n", "2021-05-08 17:19:52,758 - INFO - joeynmt.training - Epoch 4, Step: 23600, Batch Loss: 1.860081, Tokens per Sec: 17533, Lr: 0.000300\n", "2021-05-08 17:20:06,986 - INFO - joeynmt.training - Epoch 4, Step: 23700, Batch Loss: 1.896733, Tokens per Sec: 17641, Lr: 0.000300\n", "2021-05-08 17:20:21,230 - INFO - joeynmt.training - Epoch 4, Step: 23800, Batch Loss: 2.092553, Tokens per Sec: 17748, Lr: 0.000300\n", "2021-05-08 17:20:35,441 - INFO - joeynmt.training - Epoch 4, Step: 23900, Batch Loss: 2.109105, Tokens per Sec: 17702, Lr: 0.000300\n", "2021-05-08 17:20:49,796 - INFO - joeynmt.training - Epoch 4, Step: 24000, Batch Loss: 2.005659, Tokens per Sec: 17800, Lr: 0.000300\n", "2021-05-08 17:21:15,001 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:21:15,001 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:21:15,001 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:21:15,244 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:21:15,245 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:21:15,654 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rely for people and why ?\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - \tHypothesis: All the Gospel of John outside the three silver were supplied by the stream , showing false , and few formed merchants were given to the frightening fight .\n", "2021-05-08 17:21:15,655 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:21:15,656 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:21:15,656 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:21:15,656 - INFO - joeynmt.training - 
\tHypothesis: Do Not Make a Have\n", "2021-05-08 17:21:15,656 - INFO - joeynmt.training - Validation result (greedy) at epoch 4, step 24000: bleu: 28.94, loss: 50882.1367, ppl: 5.7461, duration: 25.8598s\n", "2021-05-08 17:21:29,889 - INFO - joeynmt.training - Epoch 4, Step: 24100, Batch Loss: 2.098649, Tokens per Sec: 17811, Lr: 0.000300\n", "2021-05-08 17:21:44,170 - INFO - joeynmt.training - Epoch 4, Step: 24200, Batch Loss: 2.272456, Tokens per Sec: 17580, Lr: 0.000300\n", "2021-05-08 17:21:58,392 - INFO - joeynmt.training - Epoch 4, Step: 24300, Batch Loss: 2.026934, Tokens per Sec: 17676, Lr: 0.000300\n", "2021-05-08 17:22:12,641 - INFO - joeynmt.training - Epoch 4, Step: 24400, Batch Loss: 1.555852, Tokens per Sec: 17783, Lr: 0.000300\n", "2021-05-08 17:22:26,894 - INFO - joeynmt.training - Epoch 4, Step: 24500, Batch Loss: 1.782912, Tokens per Sec: 17753, Lr: 0.000300\n", "2021-05-08 17:22:41,072 - INFO - joeynmt.training - Epoch 4, Step: 24600, Batch Loss: 1.842562, Tokens per Sec: 17661, Lr: 0.000300\n", "2021-05-08 17:22:55,253 - INFO - joeynmt.training - Epoch 4, Step: 24700, Batch Loss: 1.818963, Tokens per Sec: 17751, Lr: 0.000300\n", "2021-05-08 17:23:09,492 - INFO - joeynmt.training - Epoch 4, Step: 24800, Batch Loss: 1.994560, Tokens per Sec: 17678, Lr: 0.000300\n", "2021-05-08 17:23:23,788 - INFO - joeynmt.training - Epoch 4, Step: 24900, Batch Loss: 2.002143, Tokens per Sec: 17661, Lr: 0.000300\n", "2021-05-08 17:23:38,078 - INFO - joeynmt.training - Epoch 4, Step: 25000, Batch Loss: 1.799729, Tokens per Sec: 17714, Lr: 0.000300\n", "2021-05-08 17:24:02,508 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:24:02,508 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:24:02,509 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:24:02,758 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 17:24:02,758 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - \tHypothesis: How did Jehovah rely resist people and why ?\n", "2021-05-08 17:24:03,271 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:24:03,272 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:24:03,272 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:24:03,272 - INFO - joeynmt.training - \tHypothesis: All John’s Gospel outside the three silver was supposed by the weight , displaying false , and a few silver was given to a sort of sight .\n", "2021-05-08 17:24:03,272 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:24:03,272 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:24:03,272 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:24:03,272 - INFO - joeynmt.training - \tHypothesis: Do Not Make Human\n", "2021-05-08 17:24:03,272 - INFO - joeynmt.training - Validation result (greedy) at epoch 4, step 25000: bleu: 29.36, loss: 50395.8047, ppl: 5.6509, duration: 25.1944s\n", "2021-05-08 17:24:17,423 - INFO - joeynmt.training - Epoch 4, Step: 25100, Batch Loss: 2.230082, Tokens per Sec: 17645, Lr: 0.000300\n", "2021-05-08 17:24:31,611 - INFO - joeynmt.training - Epoch 4, Step: 25200, Batch Loss: 1.948971, Tokens per Sec: 17618, Lr: 0.000300\n", "2021-05-08 17:24:45,695 - INFO - joeynmt.training - Epoch 4, Step: 25300, Batch Loss: 1.742066, Tokens per Sec: 17773, Lr: 0.000300\n", "2021-05-08 17:24:59,903 - INFO - joeynmt.training - Epoch 4, Step: 25400, Batch Loss: 1.962237, Tokens per Sec: 17624, Lr: 0.000300\n", "2021-05-08 17:25:14,139 - INFO - joeynmt.training - Epoch 4, Step: 25500, Batch Loss: 1.892708, Tokens per Sec: 17524, Lr: 0.000300\n", "2021-05-08 17:25:28,360 - INFO - joeynmt.training - Epoch 4, Step: 25600, Batch Loss: 2.096745, Tokens per Sec: 17748, Lr: 0.000300\n", "2021-05-08 17:25:42,650 - INFO - joeynmt.training - Epoch 4, Step: 25700, Batch Loss: 1.898890, Tokens per Sec: 17862, Lr: 0.000300\n", "2021-05-08 17:25:56,874 - INFO - joeynmt.training - Epoch 4, Step: 25800, Batch Loss: 2.110819, Tokens per Sec: 17858, Lr: 0.000300\n", "2021-05-08 17:26:10,956 - INFO - joeynmt.training - Epoch 4, Step: 25900, Batch Loss: 1.824573, Tokens per Sec: 17662, Lr: 0.000300\n", "2021-05-08 17:26:25,215 - INFO - joeynmt.training - Epoch 4, Step: 26000, Batch Loss: 2.002363, Tokens per Sec: 17809, Lr: 0.000300\n", 
"2021-05-08 17:26:49,035 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:26:49,035 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:26:49,035 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:26:49,277 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:26:49,278 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:26:49,662 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:26:49,663 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:26:49,663 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:26:49,663 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:26:49,663 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:26:49,663 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:26:49,663 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:26:49,663 - INFO - joeynmt.training - \tHypothesis: How did Jehovah seek people and why ?\n", "2021-05-08 17:26:49,663 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:26:49,664 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:26:49,664 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:26:49,664 - INFO - joeynmt.training - \tHypothesis: All John’s Gospel outside three sleep was supposed by the lie , indicating false , and a few were given a type of uncertainty .\n", "2021-05-08 17:26:49,664 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:26:49,664 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:26:49,664 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:26:49,664 - INFO - joeynmt.training - \tHypothesis: Do Not Make a Have of Human\n", "2021-05-08 17:26:49,664 - INFO - joeynmt.training - Validation result (greedy) at epoch 4, step 26000: bleu: 29.73, loss: 49897.9219, ppl: 5.5550, duration: 24.4490s\n", "2021-05-08 17:27:03,782 - INFO - joeynmt.training - Epoch 4, Step: 26100, Batch Loss: 1.853441, Tokens per Sec: 17621, Lr: 0.000300\n", "2021-05-08 17:27:18,259 - INFO - joeynmt.training - Epoch 4, Step: 26200, Batch Loss: 2.071042, Tokens per Sec: 17921, Lr: 0.000300\n", "2021-05-08 17:27:32,487 - INFO - joeynmt.training - Epoch 4, Step: 26300, Batch Loss: 2.013139, Tokens per Sec: 17553, Lr: 0.000300\n", "2021-05-08 17:27:46,707 - INFO - joeynmt.training - Epoch 4, Step: 26400, Batch Loss: 1.729774, Tokens per Sec: 17631, Lr: 0.000300\n", "2021-05-08 17:28:00,940 - INFO - joeynmt.training - Epoch 4, Step: 26500, Batch Loss: 1.936514, Tokens per Sec: 17814, Lr: 0.000300\n", "2021-05-08 17:28:15,293 - INFO - joeynmt.training - Epoch 4, Step: 26600, Batch Loss: 2.073824, Tokens per Sec: 17694, Lr: 0.000300\n", "2021-05-08 17:28:29,588 - INFO - joeynmt.training - Epoch 4, Step: 26700, Batch Loss: 1.859954, Tokens per 
Sec: 17870, Lr: 0.000300\n", "2021-05-08 17:28:43,773 - INFO - joeynmt.training - Epoch 4, Step: 26800, Batch Loss: 1.872650, Tokens per Sec: 17550, Lr: 0.000300\n", "2021-05-08 17:28:57,948 - INFO - joeynmt.training - Epoch 4, Step: 26900, Batch Loss: 1.981703, Tokens per Sec: 17671, Lr: 0.000300\n", "2021-05-08 17:29:12,237 - INFO - joeynmt.training - Epoch 4, Step: 27000, Batch Loss: 1.687853, Tokens per Sec: 17681, Lr: 0.000300\n", "2021-05-08 17:29:39,012 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:29:39,013 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:29:39,013 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:29:39,257 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:29:39,257 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:29:39,658 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - \tHypothesis: How did Jehovah resolve people and why ?\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:29:39,659 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:29:39,660 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:29:39,660 - INFO - joeynmt.training - \tHypothesis: All John the Gospel of John outside three sleep was supposed by the lying , indicating false , and a few formed merchant was given to the right of uncertainty .\n", "2021-05-08 17:29:39,660 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:29:39,660 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:29:39,660 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:29:39,660 - INFO - joeynmt.training - \tHypothesis: Do Not Make a Hope\n", "2021-05-08 17:29:39,660 - INFO - joeynmt.training - Validation result (greedy) at epoch 4, step 27000: bleu: 30.09, loss: 49471.6641, ppl: 5.4743, duration: 27.4228s\n", "2021-05-08 17:29:53,845 - INFO - joeynmt.training - Epoch 4, Step: 27100, Batch Loss: 2.094930, Tokens per Sec: 17762, Lr: 0.000300\n", "2021-05-08 17:30:08,271 - INFO - joeynmt.training - Epoch 4, Step: 27200, Batch Loss: 1.790701, Tokens per Sec: 17908, Lr: 0.000300\n", "2021-05-08 17:30:22,534 - INFO - joeynmt.training - Epoch 4, Step: 27300, Batch Loss: 2.409980, Tokens per Sec: 17640, Lr: 0.000300\n", "2021-05-08 17:30:36,959 - INFO - joeynmt.training - 
Epoch 4, Step: 27400, Batch Loss: 1.741732, Tokens per Sec: 17998, Lr: 0.000300\n", "2021-05-08 17:30:51,172 - INFO - joeynmt.training - Epoch 4, Step: 27500, Batch Loss: 2.182985, Tokens per Sec: 17523, Lr: 0.000300\n", "2021-05-08 17:31:05,685 - INFO - joeynmt.training - Epoch 4, Step: 27600, Batch Loss: 1.914556, Tokens per Sec: 18074, Lr: 0.000300\n", "2021-05-08 17:31:19,909 - INFO - joeynmt.training - Epoch 4, Step: 27700, Batch Loss: 1.997192, Tokens per Sec: 17509, Lr: 0.000300\n", "2021-05-08 17:31:34,184 - INFO - joeynmt.training - Epoch 4, Step: 27800, Batch Loss: 1.767591, Tokens per Sec: 17870, Lr: 0.000300\n", "2021-05-08 17:31:48,450 - INFO - joeynmt.training - Epoch 4, Step: 27900, Batch Loss: 1.815624, Tokens per Sec: 17703, Lr: 0.000300\n", "2021-05-08 17:32:02,600 - INFO - joeynmt.training - Epoch 4, Step: 28000, Batch Loss: 1.703310, Tokens per Sec: 17670, Lr: 0.000300\n", "2021-05-08 17:32:27,982 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:32:27,982 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:32:27,983 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:32:28,225 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:32:28,225 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:32:28,622 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tHypothesis: How did Jehovah resist people and why ?\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:32:28,623 - INFO - joeynmt.training - \tHypothesis: All John’s Gospels outside three silver were lied by lying , demonstrating false , and a few sleep was given to the frightening of uncertainty .\n", "2021-05-08 17:32:28,624 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:32:28,624 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:32:28,624 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:32:28,624 - INFO - joeynmt.training - \tHypothesis: Do Not Make a Light\n", "2021-05-08 17:32:28,624 - INFO - joeynmt.training - Validation result (greedy) at epoch 4, step 28000: bleu: 29.90, loss: 48979.0039, ppl: 5.3824, duration: 26.0236s\n", "2021-05-08 
17:32:42,793 - INFO - joeynmt.training - Epoch 4, Step: 28100, Batch Loss: 1.953269, Tokens per Sec: 17448, Lr: 0.000300\n", "2021-05-08 17:32:57,046 - INFO - joeynmt.training - Epoch 4, Step: 28200, Batch Loss: 1.810444, Tokens per Sec: 17981, Lr: 0.000300\n", "2021-05-08 17:33:11,414 - INFO - joeynmt.training - Epoch 4, Step: 28300, Batch Loss: 1.803471, Tokens per Sec: 17919, Lr: 0.000300\n", "2021-05-08 17:33:25,749 - INFO - joeynmt.training - Epoch 4, Step: 28400, Batch Loss: 1.854260, Tokens per Sec: 17878, Lr: 0.000300\n", "2021-05-08 17:33:39,897 - INFO - joeynmt.training - Epoch 4, Step: 28500, Batch Loss: 2.096393, Tokens per Sec: 17551, Lr: 0.000300\n", "2021-05-08 17:33:53,997 - INFO - joeynmt.training - Epoch 4, Step: 28600, Batch Loss: 1.812303, Tokens per Sec: 17617, Lr: 0.000300\n", "2021-05-08 17:34:08,198 - INFO - joeynmt.training - Epoch 4, Step: 28700, Batch Loss: 1.788807, Tokens per Sec: 17829, Lr: 0.000300\n", "2021-05-08 17:34:22,380 - INFO - joeynmt.training - Epoch 4, Step: 28800, Batch Loss: 1.881927, Tokens per Sec: 17432, Lr: 0.000300\n", "2021-05-08 17:34:36,648 - INFO - joeynmt.training - Epoch 4, Step: 28900, Batch Loss: 1.671062, Tokens per Sec: 17732, Lr: 0.000300\n", "2021-05-08 17:34:50,867 - INFO - joeynmt.training - Epoch 4, Step: 29000, Batch Loss: 2.027172, Tokens per Sec: 17861, Lr: 0.000300\n", "2021-05-08 17:35:17,771 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:35:17,771 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:35:17,771 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:35:18,014 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 17:35:18,014 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:35:18,442 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:35:18,442 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:35:18,442 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:35:18,442 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:35:18,442 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:35:18,442 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:35:18,442 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - \tHypothesis: How did Jehovah resolve people to do and why ?\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - \tHypothesis: All the Gospel of John outside the three silver of the lying , showing false , and the very few of the silver was given to the text of unsure .\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - \tHypothesis: Do Not Make a Forever\n", "2021-05-08 17:35:18,443 - INFO - joeynmt.training - Validation result (greedy) at epoch 4, step 29000: bleu: 30.60, loss: 48473.2969, ppl: 5.2896, duration: 27.5760s\n", "2021-05-08 17:35:32,694 - INFO - joeynmt.training - Epoch 4, Step: 29100, Batch Loss: 1.940491, Tokens per Sec: 17768, Lr: 0.000300\n", "2021-05-08 17:35:46,934 - INFO - joeynmt.training - Epoch 4, Step: 29200, Batch Loss: 1.808327, Tokens per Sec: 17701, Lr: 0.000300\n", "2021-05-08 17:36:01,252 - INFO - joeynmt.training - Epoch 4, Step: 29300, Batch Loss: 1.874627, Tokens per Sec: 17643, Lr: 0.000300\n", "2021-05-08 17:36:15,714 - INFO - joeynmt.training - Epoch 4, Step: 29400, Batch Loss: 1.762710, Tokens per Sec: 18019, Lr: 0.000300\n", "2021-05-08 17:36:29,947 - INFO - joeynmt.training - Epoch 4, Step: 29500, Batch Loss: 1.917207, Tokens per Sec: 17547, Lr: 0.000300\n", "2021-05-08 17:36:44,172 - INFO - joeynmt.training - Epoch 4, Step: 29600, Batch Loss: 1.860095, Tokens per Sec: 17738, Lr: 0.000300\n", "2021-05-08 17:36:58,585 - INFO - joeynmt.training - Epoch 4, Step: 29700, Batch Loss: 1.965474, Tokens per Sec: 17949, Lr: 0.000300\n", "2021-05-08 17:37:12,827 - INFO - joeynmt.training - Epoch 4, Step: 29800, Batch Loss: 1.790356, Tokens per Sec: 17696, Lr: 0.000300\n", "2021-05-08 17:37:27,253 - INFO - joeynmt.training - Epoch 4, Step: 29900, Batch Loss: 1.984344, Tokens per Sec: 17967, Lr: 0.000300\n", "2021-05-08 17:37:41,530 - INFO - joeynmt.training - Epoch 4, Step: 30000, Batch Loss: 2.452615, Tokens per Sec: 17720, Lr: 0.000300\n", 
"2021-05-08 17:38:10,838 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:38:10,838 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:38:10,838 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:38:11,095 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:38:11,095 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:38:11,522 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - \tHypothesis: How did Jehovah alone people do and why ?\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:38:11,523 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:38:11,524 - INFO - joeynmt.training - \tHypothesis: All John’s Gospel of John outside three sleep was supported by a lie , demonstrating false , and a few sleep was given to the text of uncertainty .\n", "2021-05-08 17:38:11,524 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:38:11,524 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:38:11,524 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:38:11,524 - INFO - joeynmt.training - \tHypothesis: Do Not Make a Forever\n", "2021-05-08 17:38:11,524 - INFO - joeynmt.training - Validation result (greedy) at epoch 4, step 30000: bleu: 30.80, loss: 48295.7539, ppl: 5.2575, duration: 29.9931s\n", "2021-05-08 17:38:25,879 - INFO - joeynmt.training - Epoch 4, Step: 30100, Batch Loss: 1.814879, Tokens per Sec: 17969, Lr: 0.000300\n", "2021-05-08 17:38:40,272 - INFO - joeynmt.training - Epoch 4, Step: 30200, Batch Loss: 1.852694, Tokens per Sec: 17901, Lr: 0.000300\n", "2021-05-08 17:38:54,425 - INFO - joeynmt.training - Epoch 4, Step: 30300, Batch Loss: 1.931359, Tokens per Sec: 17496, Lr: 0.000300\n", "2021-05-08 17:39:08,694 - INFO - joeynmt.training - Epoch 4, Step: 30400, Batch Loss: 2.079514, Tokens per Sec: 17599, Lr: 0.000300\n", "2021-05-08 17:39:23,034 - INFO - joeynmt.training - Epoch 4, Step: 30500, Batch Loss: 1.624473, Tokens per Sec: 17958, Lr: 0.000300\n", "2021-05-08 17:39:37,259 - INFO - joeynmt.training - Epoch 4, Step: 30600, Batch Loss: 1.881672, Tokens per Sec: 17790, Lr: 0.000300\n", "2021-05-08 17:39:51,291 - INFO - joeynmt.training - Epoch 4, Step: 30700, Batch Loss: 
1.934259, Tokens per Sec: 17271, Lr: 0.000300\n", "2021-05-08 17:40:05,564 - INFO - joeynmt.training - Epoch 4, Step: 30800, Batch Loss: 1.917694, Tokens per Sec: 17690, Lr: 0.000300\n", "2021-05-08 17:40:19,855 - INFO - joeynmt.training - Epoch 4, Step: 30900, Batch Loss: 1.921337, Tokens per Sec: 17672, Lr: 0.000300\n", "2021-05-08 17:40:34,265 - INFO - joeynmt.training - Epoch 4, Step: 31000, Batch Loss: 1.849440, Tokens per Sec: 18002, Lr: 0.000300\n", "2021-05-08 17:40:59,705 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:40:59,705 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:40:59,705 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:40:59,949 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:40:59,950 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:41:00,342 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:41:00,342 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:41:00,342 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:41:00,342 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:41:00,342 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - \tHypothesis: How is Jehovah alone for people who have done and why ?\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - \tHypothesis: All the Gospel of John outside the three sleep was supported by a false , displaying false , and a few lamented type of uncertainty .\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:41:00,343 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:41:00,344 - INFO - joeynmt.training - \tHypothesis: Do Not Make the Have of Having\n", "2021-05-08 17:41:00,344 - INFO - joeynmt.training - Validation result (greedy) at epoch 4, step 31000: bleu: 31.21, loss: 47655.5742, ppl: 5.1431, duration: 26.0779s\n", "2021-05-08 17:41:14,546 - INFO - joeynmt.training - Epoch 4, Step: 31100, Batch Loss: 1.734909, Tokens per Sec: 17582, Lr: 0.000300\n", "2021-05-08 17:41:22,897 - INFO - joeynmt.training - Epoch 4: total training loss 15218.79\n", "2021-05-08 17:41:22,897 - INFO - joeynmt.training - EPOCH 5\n", "2021-05-08 17:41:29,739 - INFO - joeynmt.training - Epoch 5, Step: 31200, Batch Loss: 2.012481, Tokens per Sec: 14387, Lr: 0.000300\n", 
"2021-05-08 17:41:44,025 - INFO - joeynmt.training - Epoch 5, Step: 31300, Batch Loss: 2.100360, Tokens per Sec: 17601, Lr: 0.000300\n", "2021-05-08 17:41:58,459 - INFO - joeynmt.training - Epoch 5, Step: 31400, Batch Loss: 1.929648, Tokens per Sec: 17657, Lr: 0.000300\n", "2021-05-08 17:42:12,836 - INFO - joeynmt.training - Epoch 5, Step: 31500, Batch Loss: 1.797204, Tokens per Sec: 17551, Lr: 0.000300\n", "2021-05-08 17:42:27,271 - INFO - joeynmt.training - Epoch 5, Step: 31600, Batch Loss: 1.734188, Tokens per Sec: 17665, Lr: 0.000300\n", "2021-05-08 17:42:41,636 - INFO - joeynmt.training - Epoch 5, Step: 31700, Batch Loss: 1.757982, Tokens per Sec: 17678, Lr: 0.000300\n", "2021-05-08 17:42:56,002 - INFO - joeynmt.training - Epoch 5, Step: 31800, Batch Loss: 1.804672, Tokens per Sec: 17507, Lr: 0.000300\n", "2021-05-08 17:43:10,507 - INFO - joeynmt.training - Epoch 5, Step: 31900, Batch Loss: 1.826306, Tokens per Sec: 17875, Lr: 0.000300\n", "2021-05-08 17:43:24,913 - INFO - joeynmt.training - Epoch 5, Step: 32000, Batch Loss: 1.835679, Tokens per Sec: 17736, Lr: 0.000300\n", "2021-05-08 17:43:51,250 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:43:51,250 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:43:51,250 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:43:51,494 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:43:51,494 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:43:51,965 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:43:51,965 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:43:51,965 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:43:51,965 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tHypothesis: How did Jehovah seek people and why ?\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tHypothesis: All the Gospel of John outside three sleep was supported by the lying , showing false , and a few sleep was given to the text of uncertainty .\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:43:51,966 - INFO - joeynmt.training - \tHypothesis: Do Not Make the 
Light\n", "2021-05-08 17:43:51,967 - INFO - joeynmt.training - Validation result (greedy) at epoch 5, step 32000: bleu: 31.27, loss: 47467.2656, ppl: 5.1099, duration: 27.0530s\n", "2021-05-08 17:44:06,293 - INFO - joeynmt.training - Epoch 5, Step: 32100, Batch Loss: 1.955969, Tokens per Sec: 17739, Lr: 0.000300\n", "2021-05-08 17:44:20,609 - INFO - joeynmt.training - Epoch 5, Step: 32200, Batch Loss: 1.819657, Tokens per Sec: 17937, Lr: 0.000300\n", "2021-05-08 17:44:34,904 - INFO - joeynmt.training - Epoch 5, Step: 32300, Batch Loss: 1.842564, Tokens per Sec: 17914, Lr: 0.000300\n", "2021-05-08 17:44:49,267 - INFO - joeynmt.training - Epoch 5, Step: 32400, Batch Loss: 1.980920, Tokens per Sec: 17815, Lr: 0.000300\n", "2021-05-08 17:45:03,525 - INFO - joeynmt.training - Epoch 5, Step: 32500, Batch Loss: 1.962983, Tokens per Sec: 17639, Lr: 0.000300\n", "2021-05-08 17:45:17,710 - INFO - joeynmt.training - Epoch 5, Step: 32600, Batch Loss: 1.856578, Tokens per Sec: 17738, Lr: 0.000300\n", "2021-05-08 17:45:31,791 - INFO - joeynmt.training - Epoch 5, Step: 32700, Batch Loss: 1.730071, Tokens per Sec: 17507, Lr: 0.000300\n", "2021-05-08 17:45:45,979 - INFO - joeynmt.training - Epoch 5, Step: 32800, Batch Loss: 1.858865, Tokens per Sec: 17624, Lr: 0.000300\n", "2021-05-08 17:46:00,272 - INFO - joeynmt.training - Epoch 5, Step: 32900, Batch Loss: 1.909782, Tokens per Sec: 17700, Lr: 0.000300\n", "2021-05-08 17:46:14,480 - INFO - joeynmt.training - Epoch 5, Step: 33000, Batch Loss: 1.819632, Tokens per Sec: 17574, Lr: 0.000300\n", "2021-05-08 17:46:40,356 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:46:40,356 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:46:40,356 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:46:40,603 - INFO - joeynmt.training - Hooray! 
New best validation result [ppl]!\n", "2021-05-08 17:46:40,604 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:46:41,046 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - \tHypothesis: How is Jehovah seeking people to do and why ?\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:46:41,047 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:46:41,048 - INFO - joeynmt.training - \tHypothesis: The entire Gospel of John outside three sleep was supported by the lie , showing false , and a few silver was given to a sort of uncertainty .\n", "2021-05-08 17:46:41,048 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:46:41,048 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:46:41,048 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:46:41,048 - INFO - joeynmt.training - \tHypothesis: Do Not Care for Everyone\n", "2021-05-08 17:46:41,048 - INFO - joeynmt.training - Validation result (greedy) at epoch 5, step 33000: bleu: 31.66, loss: 46946.3164, ppl: 5.0192, duration: 26.5672s\n", "2021-05-08 17:46:55,309 - INFO - joeynmt.training - Epoch 5, Step: 33100, Batch Loss: 1.850076, Tokens per Sec: 17784, Lr: 0.000300\n", "2021-05-08 17:47:09,380 - INFO - joeynmt.training - Epoch 5, Step: 33200, Batch Loss: 2.067301, Tokens per Sec: 17439, Lr: 0.000300\n", "2021-05-08 17:47:23,707 - INFO - joeynmt.training - Epoch 5, Step: 33300, Batch Loss: 1.909418, Tokens per Sec: 17873, Lr: 0.000300\n", "2021-05-08 17:47:38,019 - INFO - joeynmt.training - Epoch 5, Step: 33400, Batch Loss: 2.063337, Tokens per Sec: 17601, Lr: 0.000300\n", "2021-05-08 17:47:52,180 - INFO - joeynmt.training - Epoch 5, Step: 33500, Batch Loss: 1.779918, Tokens per Sec: 17473, Lr: 0.000300\n", "2021-05-08 17:48:06,444 - INFO - joeynmt.training - Epoch 5, Step: 33600, Batch Loss: 1.826568, Tokens per Sec: 17649, Lr: 0.000300\n", "2021-05-08 17:48:20,609 - INFO - joeynmt.training - Epoch 5, Step: 33700, Batch Loss: 1.770361, Tokens per Sec: 17819, Lr: 0.000300\n", "2021-05-08 17:48:34,878 - INFO - joeynmt.training - Epoch 5, Step: 33800, Batch Loss: 1.819435, Tokens per Sec: 17838, Lr: 0.000300\n", "2021-05-08 17:48:49,097 - INFO - joeynmt.training - Epoch 5, Step: 33900, Batch Loss: 1.843903, Tokens per Sec: 17707, Lr: 0.000300\n", "2021-05-08 17:49:03,359 - INFO - joeynmt.training - Epoch 5, Step: 34000, Batch Loss: 2.141565, Tokens per Sec: 17590, Lr: 
0.000300\n", "2021-05-08 17:49:28,125 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:49:28,125 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:49:28,125 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:49:28,736 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:49:28,736 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:49:28,736 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:49:28,736 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:49:28,736 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tHypothesis: How did Jehovah seek people and why ?\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tHypothesis: The Gospel of John outside three sleep was supported by the lie , demonstrating false , and a few of the lack of the shadow of uncertainty .\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:49:28,737 - INFO - joeynmt.training - \tHypothesis: Do Not Care for Everlasting\n", "2021-05-08 17:49:28,738 - INFO - joeynmt.training - Validation result (greedy) at epoch 5, step 34000: bleu: 31.62, loss: 47115.6445, ppl: 5.0485, duration: 25.3782s\n", "2021-05-08 17:49:42,857 - INFO - joeynmt.training - Epoch 5, Step: 34100, Batch Loss: 1.713753, Tokens per Sec: 17534, Lr: 0.000300\n", "2021-05-08 17:49:57,126 - INFO - joeynmt.training - Epoch 5, Step: 34200, Batch Loss: 1.974585, Tokens per Sec: 17750, Lr: 0.000300\n", "2021-05-08 17:50:11,283 - INFO - joeynmt.training - Epoch 5, Step: 34300, Batch Loss: 1.927767, Tokens per Sec: 17671, Lr: 0.000300\n", "2021-05-08 17:50:25,513 - INFO - joeynmt.training - Epoch 5, Step: 34400, Batch Loss: 1.805864, Tokens per Sec: 17691, Lr: 0.000300\n", "2021-05-08 17:50:39,931 - INFO - joeynmt.training - Epoch 5, Step: 34500, Batch Loss: 1.794031, Tokens per Sec: 18063, Lr: 0.000300\n", "2021-05-08 17:50:54,217 - INFO - joeynmt.training - Epoch 5, Step: 34600, Batch Loss: 1.621045, Tokens per Sec: 17437, Lr: 0.000300\n", "2021-05-08 17:51:08,567 - INFO - joeynmt.training - Epoch 5, Step: 34700, Batch Loss: 1.839956, Tokens per Sec: 17603, Lr: 0.000300\n", "2021-05-08 17:51:22,834 - INFO - joeynmt.training - Epoch 5, Step: 34800, Batch Loss: 2.051074, Tokens per Sec: 17858, Lr: 
0.000300\n", "2021-05-08 17:51:36,966 - INFO - joeynmt.training - Epoch 5, Step: 34900, Batch Loss: 1.849059, Tokens per Sec: 17699, Lr: 0.000300\n", "2021-05-08 17:51:51,210 - INFO - joeynmt.training - Epoch 5, Step: 35000, Batch Loss: 1.964615, Tokens per Sec: 17630, Lr: 0.000300\n", "2021-05-08 17:52:18,039 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:52:18,039 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:52:18,039 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:52:18,284 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:52:18,285 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:52:18,750 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:52:18,751 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:52:18,751 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:52:18,751 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:52:18,751 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:52:18,751 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:52:18,751 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:52:18,751 - INFO - joeynmt.training - \tHypothesis: How is Jehovah seeking people and why ?\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - \tHypothesis: All John’s Gospel outside three sleep was supported by the lying , indicating false , and a few sleep was given to the end of uncertainty .\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - \tHypothesis: Do Not Purpose to Be Ender\n", "2021-05-08 17:52:18,752 - INFO - joeynmt.training - Validation result (greedy) at epoch 5, step 35000: bleu: 32.21, loss: 46463.6641, ppl: 4.9367, duration: 27.5418s\n", "2021-05-08 17:52:32,863 - INFO - joeynmt.training - Epoch 5, Step: 35100, Batch Loss: 1.699062, Tokens per Sec: 17811, Lr: 0.000300\n", "2021-05-08 17:52:47,247 - INFO - joeynmt.training - Epoch 5, Step: 35200, Batch Loss: 1.752195, Tokens per Sec: 17719, Lr: 0.000300\n", "2021-05-08 17:53:01,480 - INFO - joeynmt.training - Epoch 5, Step: 35300, Batch Loss: 1.701652, Tokens per Sec: 17533, Lr: 0.000300\n", "2021-05-08 17:53:15,620 - INFO - joeynmt.training - Epoch 5, Step: 35400, Batch Loss: 1.560230, Tokens per Sec: 17793, Lr: 0.000300\n", "2021-05-08 17:53:29,959 - INFO - joeynmt.training - Epoch 5, Step: 35500, Batch 
Loss: 1.842874, Tokens per Sec: 17662, Lr: 0.000300\n", "2021-05-08 17:53:44,187 - INFO - joeynmt.training - Epoch 5, Step: 35600, Batch Loss: 1.905869, Tokens per Sec: 17563, Lr: 0.000300\n", "2021-05-08 17:53:58,427 - INFO - joeynmt.training - Epoch 5, Step: 35700, Batch Loss: 1.978703, Tokens per Sec: 17980, Lr: 0.000300\n", "2021-05-08 17:54:12,829 - INFO - joeynmt.training - Epoch 5, Step: 35800, Batch Loss: 1.852114, Tokens per Sec: 17859, Lr: 0.000300\n", "2021-05-08 17:54:27,274 - INFO - joeynmt.training - Epoch 5, Step: 35900, Batch Loss: 1.884216, Tokens per Sec: 17928, Lr: 0.000300\n", "2021-05-08 17:54:41,438 - INFO - joeynmt.training - Epoch 5, Step: 36000, Batch Loss: 1.882546, Tokens per Sec: 17769, Lr: 0.000300\n", "2021-05-08 17:55:08,287 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:55:08,287 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:55:08,287 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:55:08,542 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:55:08,542 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:55:08,926 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:55:08,926 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:55:08,926 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:55:08,926 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tHypothesis: How did Jehovah seek people to do and why ?\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tHypothesis: All John’s Gospel outside three sleep was supported by lying , showing false , and a few were given a sort of uncertainty .\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 17:55:08,927 - INFO - joeynmt.training - \tHypothesis: Do Not Expect to Be perfect\n", "2021-05-08 17:55:08,928 - INFO - joeynmt.training - Validation result (greedy) at epoch 5, step 36000: bleu: 32.03, loss: 46366.1250, ppl: 4.9201, duration: 27.4889s\n", "2021-05-08 17:55:23,449 - INFO - joeynmt.training - Epoch 5, Step: 36100, Batch Loss: 1.801080, Tokens per Sec: 17702, Lr: 0.000300\n", "2021-05-08 17:55:37,961 - INFO - joeynmt.training 
- Epoch 5, Step: 36200, Batch Loss: 1.832374, Tokens per Sec: 17646, Lr: 0.000300\n", "2021-05-08 17:55:52,265 - INFO - joeynmt.training - Epoch 5, Step: 36300, Batch Loss: 1.865107, Tokens per Sec: 17357, Lr: 0.000300\n", "2021-05-08 17:56:06,675 - INFO - joeynmt.training - Epoch 5, Step: 36400, Batch Loss: 1.656384, Tokens per Sec: 17451, Lr: 0.000300\n", "2021-05-08 17:56:20,929 - INFO - joeynmt.training - Epoch 5, Step: 36500, Batch Loss: 1.814934, Tokens per Sec: 17585, Lr: 0.000300\n", "2021-05-08 17:56:35,251 - INFO - joeynmt.training - Epoch 5, Step: 36600, Batch Loss: 1.752064, Tokens per Sec: 17728, Lr: 0.000300\n", "2021-05-08 17:56:49,608 - INFO - joeynmt.training - Epoch 5, Step: 36700, Batch Loss: 1.727550, Tokens per Sec: 17787, Lr: 0.000300\n", "2021-05-08 17:57:03,807 - INFO - joeynmt.training - Epoch 5, Step: 36800, Batch Loss: 1.966657, Tokens per Sec: 17679, Lr: 0.000300\n", "2021-05-08 17:57:18,099 - INFO - joeynmt.training - Epoch 5, Step: 36900, Batch Loss: 1.811878, Tokens per Sec: 17609, Lr: 0.000300\n", "2021-05-08 17:57:32,519 - INFO - joeynmt.training - Epoch 5, Step: 37000, Batch Loss: 2.065411, Tokens per Sec: 17671, Lr: 0.000300\n", "2021-05-08 17:57:58,861 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 17:57:58,861 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 17:57:58,862 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 17:57:59,118 - INFO - joeynmt.training - Hooray! New best validation result [ppl]!\n", "2021-05-08 17:57:59,119 - INFO - joeynmt.training - Saving new checkpoint.\n", "2021-05-08 17:57:59,551 - INFO - joeynmt.training - Example #0\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - Example #1\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - \tHypothesis: How is Jehovah seeking people to do and why ?\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - Example #2\n", "2021-05-08 17:57:59,552 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 17:57:59,553 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 17:57:59,553 - INFO - joeynmt.training - \tHypothesis: All John’s Gospel outside three sleep was supported by the lie , demonstrating false , and a few of the lamb was given to the uncertainty .\n", "2021-05-08 17:57:59,553 - INFO - joeynmt.training - Example #3\n", "2021-05-08 17:57:59,553 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 17:57:59,553 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", 
"2021-05-08 17:57:59,553 - INFO - joeynmt.training - \tHypothesis: Do Not expect to Be Having Everything\n", "2021-05-08 17:57:59,553 - INFO - joeynmt.training - Validation result (greedy) at epoch 5, step 37000: bleu: 31.95, loss: 46000.8320, ppl: 4.8588, duration: 27.0335s\n", "2021-05-08 17:58:13,736 - INFO - joeynmt.training - Epoch 5, Step: 37100, Batch Loss: 1.950401, Tokens per Sec: 17627, Lr: 0.000300\n", "2021-05-08 17:58:27,981 - INFO - joeynmt.training - Epoch 5, Step: 37200, Batch Loss: 1.848431, Tokens per Sec: 17634, Lr: 0.000300\n", "2021-05-08 17:58:42,121 - INFO - joeynmt.training - Epoch 5, Step: 37300, Batch Loss: 1.706244, Tokens per Sec: 17611, Lr: 0.000300\n", "2021-05-08 17:58:56,570 - INFO - joeynmt.training - Epoch 5, Step: 37400, Batch Loss: 1.872762, Tokens per Sec: 17924, Lr: 0.000300\n", "2021-05-08 17:59:10,897 - INFO - joeynmt.training - Epoch 5, Step: 37500, Batch Loss: 1.958673, Tokens per Sec: 17713, Lr: 0.000300\n", "2021-05-08 17:59:25,144 - INFO - joeynmt.training - Epoch 5, Step: 37600, Batch Loss: 2.037944, Tokens per Sec: 17812, Lr: 0.000300\n", "2021-05-08 17:59:39,496 - INFO - joeynmt.training - Epoch 5, Step: 37700, Batch Loss: 1.855443, Tokens per Sec: 17746, Lr: 0.000300\n", "2021-05-08 17:59:53,879 - INFO - joeynmt.training - Epoch 5, Step: 37800, Batch Loss: 1.892480, Tokens per Sec: 17907, Lr: 0.000300\n", "2021-05-08 18:00:08,102 - INFO - joeynmt.training - Epoch 5, Step: 37900, Batch Loss: 1.909298, Tokens per Sec: 17719, Lr: 0.000300\n", "2021-05-08 18:00:22,349 - INFO - joeynmt.training - Epoch 5, Step: 38000, Batch Loss: 1.772264, Tokens per Sec: 17623, Lr: 0.000300\n", "2021-05-08 18:00:47,713 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 18:00:47,714 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 18:00:47,714 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 18:00:48,348 - INFO - joeynmt.training - Example #0\n", "2021-05-08 18:00:48,348 - INFO - joeynmt.training - \tSource: Haana kuedza kutora basa racho .\n", "2021-05-08 18:00:48,348 - INFO - joeynmt.training - \tReference: She did not attempt to take over .\n", "2021-05-08 18:00:48,348 - INFO - joeynmt.training - \tHypothesis: He did not try to take the work .\n", "2021-05-08 18:00:48,348 - INFO - joeynmt.training - Example #1\n", "2021-05-08 18:00:48,348 - INFO - joeynmt.training - \tSource: Jehovha ari kutsvaga vanhu vakaita sei uye nei ?\n", "2021-05-08 18:00:48,348 - INFO - joeynmt.training - \tReference: For whom is Jehovah searching , and why ?\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - \tHypothesis: How and why are Jehovah seeking for people ?\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - Example #2\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - \tSource: Evhangeri yose yaJohane kunze kwemitsara mitatu yakatsigirwa nechuma chitema , kuratidzira kuva yenhema , uye shomanene yakasara yakapiwa chuma chegireyi chekusava nechokwadi .\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - \tReference: All but three lines of John’s Gospel got the black bead vote , denoting falsification , and the bit that remained was accorded the gray bead of doubt .\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - \tHypothesis: All John’s Gospel outside three sleep was supported by the lying , showing false , and a few of the remained ones 
were given to the text of uncertainty .\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - Example #3\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - \tSource: Usatarisira Kuti Vave Vakakwana\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - \tReference: Do Not Demand Perfection\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - \tHypothesis: Do Not Expect to Be Have Everything\n", "2021-05-08 18:00:48,349 - INFO - joeynmt.training - Validation result (greedy) at epoch 5, step 38000: bleu: 32.31, loss: 46013.3945, ppl: 4.8609, duration: 26.0004s\n", "2021-05-08 18:01:02,762 - INFO - joeynmt.training - Epoch 5, Step: 38100, Batch Loss: 1.880070, Tokens per Sec: 17761, Lr: 0.000300\n", "2021-05-08 18:01:17,186 - INFO - joeynmt.training - Epoch 5, Step: 38200, Batch Loss: 1.813562, Tokens per Sec: 17832, Lr: 0.000300\n", "2021-05-08 18:01:31,511 - INFO - joeynmt.training - Epoch 5, Step: 38300, Batch Loss: 1.561175, Tokens per Sec: 17569, Lr: 0.000300\n", "2021-05-08 18:01:45,734 - INFO - joeynmt.training - Epoch 5, Step: 38400, Batch Loss: 1.705224, Tokens per Sec: 17836, Lr: 0.000300\n", "2021-05-08 18:01:59,998 - INFO - joeynmt.training - Epoch 5, Step: 38500, Batch Loss: 1.719180, Tokens per Sec: 17515, Lr: 0.000300\n", "2021-05-08 18:02:14,331 - INFO - joeynmt.training - Epoch 5, Step: 38600, Batch Loss: 1.911385, Tokens per Sec: 17774, Lr: 0.000300\n", "2021-05-08 18:02:28,518 - INFO - joeynmt.training - Epoch 5, Step: 38700, Batch Loss: 1.902857, Tokens per Sec: 17730, Lr: 0.000300\n", "2021-05-08 18:02:42,760 - INFO - joeynmt.training - Epoch 5, Step: 38800, Batch Loss: 1.671967, Tokens per Sec: 17696, Lr: 0.000300\n", "2021-05-08 18:02:56,956 - INFO - joeynmt.training - Epoch 5, Step: 38900, Batch Loss: 1.826745, Tokens per Sec: 17592, Lr: 0.000300\n", "2021-05-08 18:03:04,017 - INFO - joeynmt.training - Epoch 5: total training loss 14423.56\n", "2021-05-08 18:03:04,017 - INFO - joeynmt.training - Training ended after 5 epochs.\n", "2021-05-08 18:03:04,017 - INFO - joeynmt.training - Best validation result (greedy) at step 37000: 4.86 ppl.\n", "2021-05-08 18:03:04,034 - INFO - joeynmt.prediction - Process device: cuda, n_gpu: 1, batch_size per device: 18000 (with beam_size)\n", "2021-05-08 18:03:04,197 - INFO - joeynmt.model - Building an encoder-decoder model...\n", "2021-05-08 18:03:04,388 - INFO - joeynmt.model - Enc-dec model built.\n", "2021-05-08 18:03:04,453 - INFO - joeynmt.prediction - Decoding on dev set (data/snen/dev.bpe.en)...\n", "2021-05-08 18:03:32,394 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 18:03:32,394 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 18:03:32,394 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 18:03:32,658 - INFO - joeynmt.prediction - dev bleu[13a]: 32.83 [Beam search decoding with beam size = 5 and alpha = 1.0]\n", "2021-05-08 18:03:32,659 - INFO - joeynmt.prediction - Translations saved to: models/snen_reverse_transformer/00037000.hyps.dev\n", "2021-05-08 18:03:32,659 - INFO - joeynmt.prediction - Decoding on test set (data/snen/test.bpe.en)...\n", "2021-05-08 18:04:13,144 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 18:04:13,145 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 
18:04:13,145 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 18:04:13,722 - INFO - joeynmt.prediction - test bleu[13a]: 38.25 [Beam search decoding with beam size = 5 and alpha = 1.0]\n", "2021-05-08 18:04:13,723 - INFO - joeynmt.prediction - Translations saved to: models/snen_reverse_transformer/00037000.hyps.test\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": "MBoDS09JM807" }, "source": [ "# Copy the trained model files from the notebook storage to Google Drive for persistent storage\n", "!mkdir -p \"$gdrive_path/models/${src}${tgt}_reverse_transformer/\"  # make sure the destination folder exists before copying\n", "!cp -r joeynmt/models/${tgt}${src}_reverse_transformer/* \"$gdrive_path/models/${src}${tgt}_reverse_transformer/\"" ], "execution_count": null, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "n94wlrCjVc17", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "a3322b1a-fe85-4e4f-8912-12b0c698389c" }, "source": [ "# Output our validation metrics (loss, perplexity and BLEU) for each checkpoint\n", "! cat \"$gdrive_path/models/${src}${tgt}_reverse_transformer/validations.txt\"" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "Steps: 1000\tLoss: 128584.38281\tPPL: 82.98895\tbleu: 1.27631\tLR: 0.00030000\t*\n", "Steps: 2000\tLoss: 110739.48438\tPPL: 44.94684\tbleu: 1.82098\tLR: 0.00030000\t*\n", "Steps: 3000\tLoss: 97636.22656\tPPL: 28.65124\tbleu: 4.49353\tLR: 0.00030000\t*\n", "Steps: 4000\tLoss: 88936.70312\tPPL: 21.24760\tbleu: 7.47218\tLR: 0.00030000\t*\n", "Steps: 5000\tLoss: 82720.57812\tPPL: 17.16087\tbleu: 10.77537\tLR: 0.00030000\t*\n", "Steps: 6000\tLoss: 77637.48438\tPPL: 14.41046\tbleu: 13.05003\tLR: 0.00030000\t*\n", "Steps: 7000\tLoss: 73211.59375\tPPL: 12.37728\tbleu: 14.96968\tLR: 0.00030000\t*\n", "Steps: 8000\tLoss: 70076.27344\tPPL: 11.11304\tbleu: 17.02415\tLR: 0.00030000\t*\n", "Steps: 9000\tLoss: 67605.07031\tPPL: 10.20827\tbleu: 18.28531\tLR: 0.00030000\t*\n", "Steps: 10000\tLoss: 65946.96094\tPPL: 9.64287\tbleu: 20.08413\tLR: 0.00030000\t*\n", "Steps: 11000\tLoss: 63334.93359\tPPL: 8.81503\tbleu: 21.08849\tLR: 0.00030000\t*\n", "Steps: 12000\tLoss: 61446.67578\tPPL: 8.26120\tbleu: 22.31000\tLR: 0.00030000\t*\n", "Steps: 13000\tLoss: 60129.86719\tPPL: 7.89570\tbleu: 22.60980\tLR: 0.00030000\t*\n", "Steps: 14000\tLoss: 58738.91016\tPPL: 7.52717\tbleu: 24.26839\tLR: 0.00030000\t*\n", "Steps: 15000\tLoss: 57507.43359\tPPL: 7.21528\tbleu: 24.77417\tLR: 0.00030000\t*\n", "Steps: 16000\tLoss: 56698.91016\tPPL: 7.01756\tbleu: 25.17446\tLR: 0.00030000\t*\n", "Steps: 17000\tLoss: 55670.30469\tPPL: 6.77385\tbleu: 26.10210\tLR: 0.00030000\t*\n", "Steps: 18000\tLoss: 54687.66797\tPPL: 6.54893\tbleu: 27.05707\tLR: 0.00030000\t*\n", "Steps: 19000\tLoss: 54047.32422\tPPL: 6.40639\tbleu: 27.06940\tLR: 0.00030000\t*\n", "Steps: 20000\tLoss: 53140.55859\tPPL: 6.20985\tbleu: 27.31863\tLR: 0.00030000\t*\n", "Steps: 21000\tLoss: 52137.67969\tPPL: 5.99948\tbleu: 27.92378\tLR: 0.00030000\t*\n", "Steps: 22000\tLoss: 51759.23438\tPPL: 5.92196\tbleu: 28.65558\tLR: 0.00030000\t*\n", "Steps: 23000\tLoss: 51126.46875\tPPL: 5.79458\tbleu: 28.59703\tLR: 0.00030000\t*\n", "Steps: 24000\tLoss: 50882.13672\tPPL: 5.74613\tbleu: 28.93993\tLR: 0.00030000\t*\n", "Steps: 25000\tLoss: 50395.80469\tPPL: 5.65090\tbleu: 29.36012\tLR: 0.00030000\t*\n", "Steps: 26000\tLoss: 49897.92188\tPPL: 5.55504\tbleu: 29.73098\tLR: 0.00030000\t*\n", "Steps: 27000\tLoss: 49471.66406\tPPL: 5.47426\tbleu: 30.09011\tLR: 0.00030000\t*\n", "Steps: 28000\tLoss: 48979.00391\tPPL: 5.38236\tbleu: 29.90271\tLR: 0.00030000\t*\n",
"Steps: 29000\tLoss: 48473.29688\tPPL: 5.28963\tbleu: 30.59818\tLR: 0.00030000\t*\n", "Steps: 30000\tLoss: 48295.75391\tPPL: 5.25746\tbleu: 30.80021\tLR: 0.00030000\t*\n", "Steps: 31000\tLoss: 47655.57422\tPPL: 5.14306\tbleu: 31.20927\tLR: 0.00030000\t*\n", "Steps: 32000\tLoss: 47467.26562\tPPL: 5.10989\tbleu: 31.26790\tLR: 0.00030000\t*\n", "Steps: 33000\tLoss: 46946.31641\tPPL: 5.01922\tbleu: 31.65570\tLR: 0.00030000\t*\n", "Steps: 34000\tLoss: 47115.64453\tPPL: 5.04852\tbleu: 31.61876\tLR: 0.00030000\t\n", "Steps: 35000\tLoss: 46463.66406\tPPL: 4.93666\tbleu: 32.20734\tLR: 0.00030000\t*\n", "Steps: 36000\tLoss: 46366.12500\tPPL: 4.92014\tbleu: 32.02944\tLR: 0.00030000\t*\n", "Steps: 37000\tLoss: 46000.83203\tPPL: 4.85877\tbleu: 31.94698\tLR: 0.00030000\t*\n", "Steps: 38000\tLoss: 46013.39453\tPPL: 4.86086\tbleu: 32.31360\tLR: 0.00030000\t\n" ], "name": "stdout" } ] }, { "cell_type": "code", "metadata": { "id": "66WhRE9lIhoD", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "ccfffe9e-1f05-4b50-fb1e-385879e80605" }, "source": [ "# Test our model\n", "! cd joeynmt; python3 -m joeynmt test \"$gdrive_path/models/${src}${tgt}_reverse_transformer/config.yaml\"" ], "execution_count": null, "outputs": [ { "output_type": "stream", "text": [ "2021-05-08 18:14:20,323 - INFO - root - Hello! This is Joey-NMT (version 1.3).\n", "2021-05-08 18:14:20,323 - INFO - joeynmt.data - Building vocabulary...\n", "2021-05-08 18:14:20,604 - INFO - joeynmt.data - Loading dev data...\n", "2021-05-08 18:14:20,614 - INFO - joeynmt.data - Loading test data...\n", "2021-05-08 18:14:20,636 - INFO - joeynmt.data - Data loaded.\n", "2021-05-08 18:14:20,677 - INFO - joeynmt.prediction - Process device: cuda, n_gpu: 1, batch_size per device: 18000 (with beam_size)\n", "2021-05-08 18:14:24,041 - INFO - joeynmt.model - Building an encoder-decoder model...\n", "2021-05-08 18:14:24,242 - INFO - joeynmt.model - Enc-dec model built.\n", "2021-05-08 18:14:24,313 - INFO - joeynmt.prediction - Decoding on dev set (data/snen/dev.bpe.en)...\n", "2021-05-08 18:14:49,837 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 18:14:49,838 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 18:14:49,838 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 18:14:50,079 - INFO - joeynmt.prediction - dev bleu[13a]: 33.36 [Beam search decoding with beam size = 5 and alpha = 1.0]\n", "2021-05-08 18:14:50,079 - INFO - joeynmt.prediction - Decoding on test set (data/snen/test.bpe.en)...\n", "2021-05-08 18:15:28,199 - WARNING - sacrebleu - That's 100 lines that end in a tokenized period ('.')\n", "2021-05-08 18:15:28,199 - WARNING - sacrebleu - It looks like you forgot to detokenize your test data, which may hurt your score.\n", "2021-05-08 18:15:28,200 - WARNING - sacrebleu - If you insist your data is detokenized, or don't care, you can suppress this message with '--force'.\n", "2021-05-08 18:15:28,772 - INFO - joeynmt.prediction - test bleu[13a]: 38.42 [Beam search decoding with beam size = 5 and alpha = 1.0]\n" ], "name": "stdout" } ] } ] }