{"nbformat":4,"nbformat_minor":0,"metadata":{"accelerator":"GPU","colab":{"name":"en_sn.ipynb","provenance":[{"file_id":"https://github.com/masakhane-io/masakhane/blob/66d42b6712d4cba18ba2abbbec0b663792f3ea99/starter_notebook.ipynb","timestamp":1572518343736}],"collapsed_sections":[],"toc_visible":true},"kernelspec":{"display_name":"Python 3.7","language":"python","name":"python3"},"language_info":{"codemirror_mode":{"name":"ipython","version":3},"file_extension":".py","mimetype":"text/x-python","name":"python","nbconvert_exporter":"python","pygments_lexer":"ipython3","version":"3.7.1"}},"cells":[{"cell_type":"markdown","metadata":{"colab_type":"text","id":"Igc5itf-xMGj"},"source":["# Masakhane - Machine Translation for African Languages (Using JoeyNMT)"]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"x4fXCKCf36IK"},"source":["## Note before beginning:\n","### - The idea is that you should be able to make minimal changes to this in order to get SOME result for your own translation corpus. \n","\n","### - The tl;dr: Go to the **\"TODO\"** comments which will tell you what to update to get up and running\n","\n","### - If you actually want to have a clue what you're doing, read the text and peek at the links\n","\n","### - With 100 epochs, it should take around 7 hours to run in Google Colab\n","\n","### - Once you've gotten a result for your language, please attach and email your notebook that generated it to masakhanetranslation@gmail.com\n","\n","### - If you care enough and get a chance, doing a brief background on your language would be amazing. See examples in [(Martinus, 2019)](https://arxiv.org/abs/1906.05685)"]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"l929HimrxS0a"},"source":["## Retrieve your data & make a parallel corpus\n","\n","If you are wanting to use the JW300 data referenced on the Masakhane website or in our GitHub repo, you can use `opus-tools` to convert the data into a convenient format. 
`opus_read` from that package provides a convenient tool for reading the native aligned XML files and to convert them to TMX format. The tool can also be used to fetch relevant files from OPUS on the fly and to filter the data as necessary. [Read the documentation](https://pypi.org/project/opustools-pkg/) for more details.\n","\n","Once you have your corpus files in TMX format (an xml structure which will include the sentences in your target language and your source language in a single file), we recommend reading them into a pandas dataframe. Thankfully, Jade wrote a silly `tmx2dataframe` package which converts your tmx file to a pandas dataframe. "]},{"cell_type":"code","metadata":{"colab_type":"code","id":"oGRmDELn7Az0","outputId":"030f8ea6-16cf-4119-8488-442b89820f1c","executionInfo":{"status":"ok","timestamp":1579196121362,"user_tz":-120,"elapsed":55031,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":122}},"source":["from google.colab import drive\n","drive.mount('/content/drive')"],"execution_count":1,"outputs":[{"output_type":"stream","text":["Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3aietf%3awg%3aoauth%3a2.0%3aoob&response_type=code&scope=email%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdocs.test%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive%20https%3a%2f%2fwww.googleapis.com%2fauth%2fdrive.photos.readonly%20https%3a%2f%2fwww.googleapis.com%2fauth%2fpeopleapi.readonly\n","\n","Enter your authorization code:\n","··········\n","Mounted at /content/drive\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"Cn3tgQLzUxwn","colab":{}},"source":["# TODO: Set your source and target languages. 
Keep in mind, these traditionally use language codes as found here:\n","# These will also become the suffixes of all vocab and corpus files used throughout\n","import os\n","source_language = \"en\"\n","target_language = \"sn\" \n","lc = False # If True, lowercase the data.\n","seed = 42 # Random seed for shuffling.\n","tag = \"baseline\" # Give a unique name to your folder - this is to ensure you don't overwrite any models you've already submitted\n","\n","os.environ[\"src\"] = source_language # Sets them in bash as well, since we often use bash scripts\n","os.environ[\"tgt\"] = target_language\n","os.environ[\"tag\"] = tag\n","\n","# This will save it to a folder in our gdrive instead!\n","!mkdir -p \"/content/drive/My Drive/masakhane/$src-$tgt-$tag\"\n","os.environ[\"gdrive_path\"] = \"/content/drive/My Drive/masakhane/%s-%s-%s\" % (source_language, target_language, tag)"],"execution_count":0,"outputs":[]},{"cell_type":"code","metadata":{"colab_type":"code","id":"kBSgJHEw7Nvx","outputId":"cc3d7f81-7869-494c-90c3-f2d6b79437d4","executionInfo":{"status":"ok","timestamp":1575546215822,"user_tz":-120,"elapsed":4276,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":34}},"source":["!echo $gdrive_path"],"execution_count":0,"outputs":[{"output_type":"stream","text":["/content/drive/My Drive/masakhane/en-sn-baseline\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"gA75Fs9ys8Y9","outputId":"fb304629-cd4c-4f71-cd5e-47fb9bc1c3d0","executionInfo":{"status":"ok","timestamp":1575546226223,"user_tz":-120,"elapsed":7500,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":102}},"source":["# Install 
opus-tools\n","! pip install opustools-pkg"],"execution_count":0,"outputs":[{"output_type":"stream","text":["Collecting opustools-pkg\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/6c/9f/e829a0cceccc603450cd18e1ff80807b6237a88d9a8df2c0bb320796e900/opustools_pkg-0.0.52-py3-none-any.whl (80kB)\n","\r\u001b[K |████ | 10kB 19.9MB/s eta 0:00:01\r\u001b[K |████████ | 20kB 4.2MB/s eta 0:00:01\r\u001b[K |████████████▏ | 30kB 6.1MB/s eta 0:00:01\r\u001b[K |████████████████▏ | 40kB 7.7MB/s eta 0:00:01\r\u001b[K |████████████████████▎ | 51kB 5.0MB/s eta 0:00:01\r\u001b[K |████████████████████████▎ | 61kB 5.9MB/s eta 0:00:01\r\u001b[K |████████████████████████████▎ | 71kB 6.7MB/s eta 0:00:01\r\u001b[K |████████████████████████████████| 81kB 4.9MB/s \n","\u001b[?25hInstalling collected packages: opustools-pkg\n","Successfully installed opustools-pkg-0.0.52\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"xq-tDZVks7ZD","outputId":"279b504b-f5e1-4efb-9639-37edcd118110","executionInfo":{"status":"ok","timestamp":1575546448837,"user_tz":-120,"elapsed":178664,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":204}},"source":["# Downloading our corpus\n","! opus_read -d JW300 -s $src -t $tgt -wm moses -w jw300.$src jw300.$tgt -q\n","\n","# extract the corpus file\n","! gunzip JW300_latest_xml_$src-$tgt.xml.gz"],"execution_count":0,"outputs":[{"output_type":"stream","text":["\n","Alignment file /proj/nlpl/data/OPUS/JW300/latest/xml/en-sn.xml.gz not found. 
The following files are available for downloading:\n","\n"," 6 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/en-sn.xml.gz\n"," 263 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/en.zip\n"," 69 MB https://object.pouta.csc.fi/OPUS-JW300/v1/xml/sn.zip\n","\n"," 339 MB Total size\n","./JW300_latest_xml_en-sn.xml.gz ... 100% of 6 MB\n","./JW300_latest_xml_en.zip ... 100% of 263 MB\n","./JW300_latest_xml_sn.zip ... 100% of 69 MB\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"n48GDRnP8y2G","outputId":"50c6a203-d142-487e-f41a-b70d8f8fed37","executionInfo":{"status":"ok","timestamp":1575546492852,"user_tz":-120,"elapsed":18931,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":578}},"source":["# Download the global test set.\n","! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-any.en\n"," \n","# And the specific test set for this language pair.\n","os.environ[\"trg\"] = target_language \n","os.environ[\"src\"] = source_language \n","\n","! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-$trg.en \n","! mv test.en-$trg.en test.en\n","! wget https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-$trg.$trg \n","! mv test.en-$trg.$trg test.$trg"],"execution_count":0,"outputs":[{"output_type":"stream","text":["--2019-12-05 11:47:56-- https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-any.en\n","Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...\n","Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.\n","HTTP request sent, awaiting response... 
200 OK\n","Length: 277791 (271K) [text/plain]\n","Saving to: ‘test.en-any.en’\n","\n","\rtest.en-any.en 0%[ ] 0 --.-KB/s \rtest.en-any.en 100%[===================>] 271.28K --.-KB/s in 0.02s \n","\n","2019-12-05 11:47:56 (11.8 MB/s) - ‘test.en-any.en’ saved [277791/277791]\n","\n","--2019-12-05 11:47:59-- https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-sn.en\n","Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...\n","Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.\n","HTTP request sent, awaiting response... 200 OK\n","Length: 206539 (202K) [text/plain]\n","Saving to: ‘test.en-sn.en’\n","\n","test.en-sn.en 100%[===================>] 201.70K --.-KB/s in 0.02s \n","\n","2019-12-05 11:47:59 (11.6 MB/s) - ‘test.en-sn.en’ saved [206539/206539]\n","\n","--2019-12-05 11:48:06-- https://raw.githubusercontent.com/juliakreutzer/masakhane/master/jw300_utils/test/test.en-sn.sn\n","Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...\n","Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.\n","HTTP request sent, awaiting response... 
200 OK\n","Length: 215910 (211K) [text/plain]\n","Saving to: ‘test.en-sn.sn’\n","\n","test.en-sn.sn 100%[===================>] 210.85K --.-KB/s in 0.02s \n","\n","2019-12-05 11:48:06 (9.59 MB/s) - ‘test.en-sn.sn’ saved [215910/215910]\n","\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"NqDG-CI28y2L","outputId":"92f49b99-d5c4-4015-8a2c-196f0963a96d","executionInfo":{"status":"ok","timestamp":1575546541116,"user_tz":-120,"elapsed":1360,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":34}},"source":["# Read the test data to filter from train and dev splits.\n","# Store English portion in set for quick filtering checks.\n","en_test_sents = set()\n","filter_test_sents = \"test.en-any.en\"\n","j = 0\n","with open(filter_test_sents) as f:\n","    for line in f:\n","        en_test_sents.add(line.strip())\n","        j += 1\n","print('Loaded {} global test sentences to filter from the training/dev data.'.format(j))"],"execution_count":0,"outputs":[{"output_type":"stream","text":["Loaded 3571 global test sentences to filter from the training/dev data.\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"3CNdwLBCfSIl","outputId":"0b64470e-735b-4787-df02-6cee5ce06a94","executionInfo":{"status":"ok","timestamp":1575547419802,"user_tz":-120,"elapsed":56926,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":159}},"source":["import pandas as pd\n","\n","# Read the Moses-format plain-text corpus files (written by opus_read above) into a dataframe\n","source_file = 'jw300.' + source_language\n","target_file = 'jw300.' 
+ target_language\n","\n","source = []\n","target = []\n","skip_lines = set() # Collect the line numbers of the source portion to skip the same lines for the target portion (a set keeps the membership check below fast).\n","with open(source_file) as f:\n","    for i, line in enumerate(f):\n","        # Skip sentences that are contained in the test set.\n","        if line.strip() not in en_test_sents:\n","            source.append(line.strip())\n","        else:\n","            skip_lines.add(i) \n","with open(target_file) as f:\n","    for j, line in enumerate(f):\n","        # Only add to corpus if corresponding source was not skipped.\n","        if j not in skip_lines:\n","            target.append(line.strip())\n","    \n","print('Loaded data and skipped {}/{} lines since contained in test set.'.format(len(skip_lines), i))\n","    \n","df = pd.DataFrame(zip(source, target), columns=['source_sentence', 'target_sentence'])\n","df.head(3)"],"execution_count":0,"outputs":[{"output_type":"stream","text":["Loaded data and skipped 6157/786529 lines since contained in test set.\n"],"name":"stdout"},{"output_type":"execute_result","data":{"text/html":["
\n","\n","\n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n"," \n","
source_sentencetarget_sentence
0Young People Ask . . .Vechiduku Vanobvunza Kuti . . .
1Why Do I Lose My Temper ?Neiko Ndichitsamwa ?
2“ When I’m angry , I’m furious , and you would...“ Apo ndinoshatirwa , ndinotyisa , uye haungad...
\n","
"],"text/plain":[" source_sentence target_sentence\n","0 Young People Ask . . . Vechiduku Vanobvunza Kuti . . .\n","1 Why Do I Lose My Temper ? Neiko Ndichitsamwa ?\n","2 “ When I’m angry , I’m furious , and you would... “ Apo ndinoshatirwa , ndinotyisa , uye haungad..."]},"metadata":{"tags":[]},"execution_count":9}]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"YkuK3B4p2AkN"},"source":["## Pre-processing and export\n","\n","It is generally a good idea to remove duplicate translations and conflicting translations from the corpus. In practice, these public corpora include some number of these that need to be cleaned.\n","\n","In addition we will split our data into dev/test/train and export to the filesystem."]},{"cell_type":"code","metadata":{"colab_type":"code","id":"M_2ouEOH1_1q","outputId":"35293510-3cb0-43a4-c209-5c549f6f67e1","executionInfo":{"status":"ok","timestamp":1575229721680,"user_tz":-120,"elapsed":332708,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":442}},"source":["#import numpy\n","import numpy as np\n","# drop duplicate translations\n","df_pp = df.drop_duplicates()\n","\n","#drop empty lines (alp)\n","df_pp['source_sentence'].replace('', np.nan, inplace=True)\n","df_pp['target_sentence'].replace('', np.nan, inplace=True)\n","df_pp.dropna(subset=['source_sentence'], inplace=True)\n","df_pp.dropna(subset=['target_sentence'], inplace=True)\n","\n","# drop conflicting translations\n","df_pp.drop_duplicates(subset='source_sentence', inplace=True)\n","df_pp.drop_duplicates(subset='target_sentence', inplace=True)\n","\n","# Shuffle the data to remove bias in dev set selection.\n","df_pp = df_pp.sample(frac=1, 
random_state=seed).reset_index(drop=True)"],"execution_count":0,"outputs":[{"output_type":"stream","text":["/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py:6786: SettingWithCopyWarning: \n","A value is trying to be set on a copy of a slice from a DataFrame\n","\n","See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n"," self._update_inplace(new_data)\n","/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:8: SettingWithCopyWarning: \n","A value is trying to be set on a copy of a slice from a DataFrame\n","\n","See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n"," \n","/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:9: SettingWithCopyWarning: \n","A value is trying to be set on a copy of a slice from a DataFrame\n","\n","See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n"," if __name__ == '__main__':\n","/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:12: SettingWithCopyWarning: \n","A value is trying to be set on a copy of a slice from a DataFrame\n","\n","See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n"," if sys.path[0] == '':\n","/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py:13: SettingWithCopyWarning: \n","A value is trying to be set on a copy of a slice from a DataFrame\n","\n","See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n"," del 
sys.path[0]\n"],"name":"stderr"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"hxxBOCA-xXhy","outputId":"1fb5a3e1-a775-436b-e05b-60be869dc25a","executionInfo":{"status":"ok","timestamp":1575229838616,"user_tz":-120,"elapsed":449634,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":819}},"source":["# This section does the split between train/dev for the parallel corpora then saves them as separate files\n","# We use 1000 sentences for the dev set and the given test set.\n","import csv\n","\n","# Do the split between dev/train and create parallel corpora\n","num_dev_patterns = 1000\n","\n","# Optional: lower case the corpora - this will make it easier to generalize, but without proper casing.\n","if lc: # Julia: making lowercasing optional\n","    df_pp[\"source_sentence\"] = df_pp[\"source_sentence\"].str.lower()\n","    df_pp[\"target_sentence\"] = df_pp[\"target_sentence\"].str.lower()\n","\n","# Julia: test sets are already generated\n","dev = df_pp.tail(num_dev_patterns) # Herman: Error in original\n","stripped = df_pp.drop(df_pp.tail(num_dev_patterns).index)\n","\n","with open(\"train.\"+source_language, \"w\") as src_file, open(\"train.\"+target_language, \"w\") as trg_file:\n","    for index, row in stripped.iterrows():\n","        src_file.write(row[\"source_sentence\"]+\"\\n\")\n","        trg_file.write(row[\"target_sentence\"]+\"\\n\")\n","    \n","with open(\"dev.\"+source_language, \"w\") as src_file, open(\"dev.\"+target_language, \"w\") as trg_file:\n","    for index, row in dev.iterrows():\n","        src_file.write(row[\"source_sentence\"]+\"\\n\")\n","        trg_file.write(row[\"target_sentence\"]+\"\\n\")\n","\n","#stripped[[\"source_sentence\"]].to_csv(\"train.\"+source_language, header=False, index=False) # Herman: Added `header=False` everywhere\n","#stripped[[\"target_sentence\"]].to_csv(\"train.\"+target_language, 
header=False, index=False) # Julia: Problematic handling of quotation marks.\n","\n","#dev[[\"source_sentence\"]].to_csv(\"dev.\"+source_language, header=False, index=False)\n","#dev[[\"target_sentence\"]].to_csv(\"dev.\"+target_language, header=False, index=False)\n","\n","\n","# Doublecheck the format below. There should be no extra quotation marks or weird characters.\n","! head train.*\n","! head dev.*"],"execution_count":0,"outputs":[{"output_type":"stream","text":["==> train.en <==\n","Pleased that the king asked for wisdom rather than for riches and glory , God gave Solomon “ a wise and understanding heart ” ​ — as well as prosperity .\n","Of what can we be sure if we keep listening to Jehovah ?\n","As one lecturer said : “ The rising education level has improved the talent pool such that followers have become so critical that they are almost impossible to lead . ”\n","In contrast with the 1523 version by Lefèvre d’Étaples , based on the Latin , this one was to be translated from the Hebrew and Greek .\n","Another man in the same country told one of Jehovah’s Witnesses that he liked the Watchtower magazine .\n","While it is true that natural phenomena may have been associated with some miracles ​ — such things as earthquakes , plagues , and landslides — ​ these explanations have one thing in common .\n","Yet , she spoke tactfully to Eli , not presuming to criticize him for his false accusation .\n","* The British journal New Scientist says that AD is “ the fourth biggest killer in the developed world after heart disease , cancer and stroke . 
”\n","SEE PAGES 26 - 28 .\n","HOW good and how beneficial it is to be counseled with dignity !\n","\n","==> train.sn <==\n","Mwari akafadzwa nechikumbiro chaSoromoni akamupa “ mwoyo wakachenjera , unonzwisisa , ” uye pfuma nemukurumbira zvaainge asina kukumbira .\n","Tinogona kuva nechokwadi chei kana tikaramba tichiteerera Jehovha ?\n","Mumwe mudzidzisi akati : “ Mwero wedzidzo unowedzera wakavandudza dura ramano zvakadaro zvokuti vateveri vazova vanotsoropodza kwazvo zvokuti vanodokutova vasingabviri kutungamirira . ”\n","Kusiyana neBhaibheri ra1523 raLefèvre d’Étaples rakashandurwa kubva muchiLatin , iri raizoshandurwa kubva muchiHebheru nechiGiriki .\n","Mumwe murume ari munyika imwe cheteyo akaudza mumwe weZvapupu zvaJehovha kuti aifarira magazini yeNharireyomurindi .\n","Kunyange zvazvo chiri chokwadi kuti zvimwe zvishamiso zvakaita sekudengenyeka kwenyika , matenda , nokuondomoka kwevhu zvinhu zvagara zvichizivikanwa kuti zvinoitika , pane chinhu chakafanana pazviri zvose .\n","Asi , akataura nounyanzvi kuna Eri , kwete kuda kumupikisa nemhaka yokuti akanga amupomera zvenhema .\n","* Magazini yeNew Scientist yokuBritain inotaura kuti AD “ muurayi mukuru wechina munyika dzakabudirira ichitevera hosha yemwoyo , kenza uye sitiroko . ”\n","ONA MAPEJI 26 - 28 .\n","KWAKANAKA uye kunobetsera sei kupiwa zano nechiremera !\n","==> dev.en <==\n","In some cultures , birthstones are associated with the month of one’s birth .\n","What will help us to appreciate that God watches over us because he loves us ? Let us consider how he shows this .\n","; Keyzer , M .\n","We can give simply by sincerely asking how our friends are , by trying to understand their problems , and by doing all we can without waiting for them to ask . ”\n","It will help us never to lose sight of the fact that he is the One “ guarding all those loving him . 
” ​ — Psalm 145 : 18 - 20 .\n","Belief in a Designer is compatible with true science\n","Enthusiastic about the success of this special work , 29 publishers of the Arpoador Congregation went to preach in the town of Mutum , about 300 miles [ 500 km ] away .\n","129th Graduating Class of the Watchtower Bible School of Gilead\n","Everyone wanted everyone else to succeed , ” said Richard and Lusia , describing their fellow students of the 105th class of the Watchtower Bible School of Gilead .\n","Mary — “ Highly Favored ” by God\n","\n","==> dev.sn <==\n","Mune dzimwe tsika , matombo omwedzi wawakazvarwa anonzi ane chokuita nomwedzi wakazvarwa munhu .\n","Chii chichatibatsira kunzwisisa kuti kutariswa kwatinoitwa naMwari kunoratidza kuti anotida ?\n","; Keyzer , M .\n","Tinogona kupa kana tikangobvunza nomwoyo wose shamwari dzedu kuti dzakadini , kana tikaedza kunzwisisa zvinetso zvadzo , uye kana tikaita zvose zvatinogona tisingamiriri kuti dzikumbire . ”\n","Kuchatibetsera kusatongokanganwa idi rokuti ndiye Uyo “ unochengeta vose vanomuda . ” — Pisarema 145 : 18 - 20 .\n","Kudavira kuti kune Akagadzira kunoenderana nesayenzi yechokwadi\n","Vaine mbavarira pamusoro pebudiriro yeiri basa chairo , vaparidzi 29 veArpoador Ungano vakaenda kundoparidzira mutaundi reMutum , kure namakiromita anenge 500 .\n","Kirasi Yechi129 Yechikoro cheBhaibheri cheWatchtower cheGiriyedhi Inopedza Kudzidza\n","Munhu wose aida kuti vamwe vose vabudirire , ” vakadaro Richard naLusia , vachitsanangura vavaidzidza navo vekirasi yechi105 yeWatchtower Bible School of Gilead .\n","Maria — “ Akadiwa Zvikuru Kwazvo ” naMwari\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"epeCydmCyS8X"},"source":["\n","\n","---\n","\n","\n","## Installation of JoeyNMT\n","\n","JoeyNMT is a simple, minimalist NMT package which is useful for learning and teaching. 
Check out the documentation for JoeyNMT [here](https://joeynmt.readthedocs.io) "]},{"cell_type":"code","metadata":{"colab_type":"code","id":"iBRMm4kMxZ8L","outputId":"60c6ae6a-c34b-4863-a426-02089e6311b8","executionInfo":{"status":"ok","timestamp":1575229853567,"user_tz":-120,"elapsed":464574,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":1000}},"source":["# Install JoeyNMT\n","! git clone https://github.com/joeynmt/joeynmt.git\n","! cd joeynmt; pip3 install ."],"execution_count":0,"outputs":[{"output_type":"stream","text":["Cloning into 'joeynmt'...\n","remote: Enumerating objects: 15, done.\u001b[K\n","remote: Counting objects: 100% (15/15), done.\u001b[K\n","remote: Compressing objects: 100% (12/12), done.\u001b[K\n","remote: Total 2199 (delta 4), reused 5 (delta 3), pack-reused 2184\u001b[K\n","Receiving objects: 100% (2199/2199), 2.60 MiB | 4.30 MiB/s, done.\n","Resolving deltas: 100% (1525/1525), done.\n","Processing /content/joeynmt\n","Requirement already satisfied: future in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (0.16.0)\n","Requirement already satisfied: pillow in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (4.3.0)\n","Requirement already satisfied: numpy<2.0,>=1.14.5 in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (1.17.4)\n","Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (41.6.0)\n","Requirement already satisfied: torch>=1.1 in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (1.3.1)\n","Requirement already satisfied: tensorflow>=1.14 in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (1.15.0)\n","Requirement already satisfied: torchtext in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (0.3.1)\n","Collecting 
sacrebleu>=1.3.6\n"," Downloading https://files.pythonhosted.org/packages/0e/e5/93d252182f7cbd4b59bb3ec5797e2ce33cfd6f5aadaf327db170cf4b7887/sacrebleu-1.4.2-py3-none-any.whl\n","Collecting subword-nmt\n"," Downloading https://files.pythonhosted.org/packages/74/60/6600a7bc09e7ab38bc53a48a20d8cae49b837f93f5842a41fe513a694912/subword_nmt-0.3.7-py2.py3-none-any.whl\n","Requirement already satisfied: matplotlib in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (3.1.1)\n","Requirement already satisfied: seaborn in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (0.9.0)\n","Collecting pyyaml>=5.1\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/e3/e8/b3212641ee2718d556df0f23f78de8303f068fe29cdaa7a91018849582fe/PyYAML-5.1.2.tar.gz (265kB)\n","\u001b[K |████████████████████████████████| 266kB 18.0MB/s \n","\u001b[?25hCollecting pylint\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/e9/59/43fc36c5ee316bb9aeb7cf5329cdbdca89e5749c34d5602753827c0aa2dc/pylint-2.4.4-py3-none-any.whl (302kB)\n","\u001b[K |████████████████████████████████| 307kB 49.6MB/s \n","\u001b[?25hRequirement already satisfied: six==1.12 in /usr/local/lib/python3.6/dist-packages (from joeynmt==0.0.1) (1.12.0)\n","Requirement already satisfied: olefile in /usr/local/lib/python3.6/dist-packages (from pillow->joeynmt==0.0.1) (0.46)\n","Requirement already satisfied: google-pasta>=0.1.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (0.1.8)\n","Requirement already satisfied: gast==0.2.2 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (0.2.2)\n","Requirement already satisfied: tensorboard<1.16.0,>=1.15.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (1.15.0)\n","Requirement already satisfied: astor>=0.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (0.8.0)\n","Requirement already satisfied: 
keras-preprocessing>=1.0.5 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (1.1.0)\n","Requirement already satisfied: grpcio>=1.8.6 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (1.15.0)\n","Requirement already satisfied: keras-applications>=1.0.8 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (1.0.8)\n","Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (0.33.6)\n","Requirement already satisfied: protobuf>=3.6.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (3.10.0)\n","Requirement already satisfied: wrapt>=1.11.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (1.11.2)\n","Requirement already satisfied: opt-einsum>=2.3.2 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (3.1.0)\n","Requirement already satisfied: termcolor>=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (1.1.0)\n","Requirement already satisfied: absl-py>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (0.8.1)\n","Requirement already satisfied: tensorflow-estimator==1.15.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=1.14->joeynmt==0.0.1) (1.15.1)\n","Requirement already satisfied: requests in /usr/local/lib/python3.6/dist-packages (from torchtext->joeynmt==0.0.1) (2.21.0)\n","Requirement already satisfied: tqdm in /usr/local/lib/python3.6/dist-packages (from torchtext->joeynmt==0.0.1) (4.28.1)\n","Collecting portalocker\n"," Downloading https://files.pythonhosted.org/packages/91/db/7bc703c0760df726839e0699b7f78a4d8217fdc9c7fcb1b51b39c5a22a4e/portalocker-1.5.2-py2.py3-none-any.whl\n","Requirement already satisfied: typing in /usr/local/lib/python3.6/dist-packages (from sacrebleu>=1.3.6->joeynmt==0.0.1) (3.6.6)\n","Requirement 
already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->joeynmt==0.0.1) (1.1.0)\n","Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.6/dist-packages (from matplotlib->joeynmt==0.0.1) (0.10.0)\n","Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->joeynmt==0.0.1) (2.6.1)\n","Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.6/dist-packages (from matplotlib->joeynmt==0.0.1) (2.4.5)\n","Requirement already satisfied: pandas>=0.15.2 in /usr/local/lib/python3.6/dist-packages (from seaborn->joeynmt==0.0.1) (0.25.3)\n","Requirement already satisfied: scipy>=0.14.0 in /usr/local/lib/python3.6/dist-packages (from seaborn->joeynmt==0.0.1) (1.3.2)\n","Collecting astroid<2.4,>=2.3.0\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/ad/ae/86734823047962e7b8c8529186a1ac4a7ca19aaf1aa0c7713c022ef593fd/astroid-2.3.3-py3-none-any.whl (205kB)\n","\u001b[K |████████████████████████████████| 215kB 49.1MB/s \n","\u001b[?25hCollecting isort<5,>=4.2.5\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/e5/b0/c121fd1fa3419ea9bfd55c7f9c4fedfec5143208d8c7ad3ce3db6c623c21/isort-4.3.21-py2.py3-none-any.whl (42kB)\n","\u001b[K |████████████████████████████████| 51kB 8.0MB/s \n","\u001b[?25hCollecting mccabe<0.7,>=0.6\n"," Downloading https://files.pythonhosted.org/packages/87/89/479dc97e18549e21354893e4ee4ef36db1d237534982482c3681ee6e7b57/mccabe-0.6.1-py2.py3-none-any.whl\n","Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.16.0,>=1.15.0->tensorflow>=1.14->joeynmt==0.0.1) (3.1.1)\n","Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.6/dist-packages (from tensorboard<1.16.0,>=1.15.0->tensorflow>=1.14->joeynmt==0.0.1) (0.16.0)\n","Requirement already satisfied: h5py in /usr/local/lib/python3.6/dist-packages 
(from keras-applications>=1.0.8->tensorflow>=1.14->joeynmt==0.0.1) (2.8.0)\n","Requirement already satisfied: urllib3<1.25,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests->torchtext->joeynmt==0.0.1) (1.24.3)\n","Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests->torchtext->joeynmt==0.0.1) (2019.9.11)\n","Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests->torchtext->joeynmt==0.0.1) (2.8)\n","Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests->torchtext->joeynmt==0.0.1) (3.0.4)\n","Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.6/dist-packages (from pandas>=0.15.2->seaborn->joeynmt==0.0.1) (2018.9)\n","Collecting typed-ast<1.5,>=1.4.0; implementation_name == \"cpython\" and python_version < \"3.8\"\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/31/d3/9d1802c161626d0278bafb1ffb32f76b9d01e123881bbf9d91e8ccf28e18/typed_ast-1.4.0-cp36-cp36m-manylinux1_x86_64.whl (736kB)\n","\u001b[K |████████████████████████████████| 737kB 45.1MB/s \n","\u001b[?25hCollecting lazy-object-proxy==1.4.*\n","\u001b[?25l Downloading https://files.pythonhosted.org/packages/0b/dd/b1e3407e9e6913cf178e506cd0dee818e58694d9a5cd1984e3f6a8b9a10f/lazy_object_proxy-1.4.3-cp36-cp36m-manylinux1_x86_64.whl (55kB)\n","\u001b[K |████████████████████████████████| 61kB 8.6MB/s \n","\u001b[?25hBuilding wheels for collected packages: joeynmt, pyyaml\n"," Building wheel for joeynmt (setup.py) ... \u001b[?25l\u001b[?25hdone\n"," Created wheel for joeynmt: filename=joeynmt-0.0.1-cp36-none-any.whl size=72136 sha256=8d1da7c50128bec0ca86ac95276eed04fd1d1ccee13dc54c1207fb3f657227d4\n"," Stored in directory: /tmp/pip-ephem-wheel-cache-zz_tdx8z/wheels/db/01/db/751cc9f3e7f6faec127c43644ba250a3ea7ad200594aeda70a\n"," Building wheel for pyyaml (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n"," Created wheel for pyyaml: filename=PyYAML-5.1.2-cp36-cp36m-linux_x86_64.whl size=44104 sha256=a638e50e9512e30fe0b56a21d7438f05259a5f959ce2abc36f7164b12f5f9cc1\n"," Stored in directory: /root/.cache/pip/wheels/d9/45/dd/65f0b38450c47cf7e5312883deb97d065e030c5cca0a365030\n","Successfully built joeynmt pyyaml\n","Installing collected packages: portalocker, sacrebleu, subword-nmt, pyyaml, typed-ast, lazy-object-proxy, astroid, isort, mccabe, pylint, joeynmt\n"," Found existing installation: PyYAML 3.13\n"," Uninstalling PyYAML-3.13:\n"," Successfully uninstalled PyYAML-3.13\n","Successfully installed astroid-2.3.3 isort-4.3.21 joeynmt-0.0.1 lazy-object-proxy-1.4.3 mccabe-0.6.1 portalocker-1.5.2 pylint-2.4.4 pyyaml-5.1.2 sacrebleu-1.4.2 subword-nmt-0.3.7 typed-ast-1.4.0\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"AaE77Tcppex9"},"source":["# Preprocessing the Data into Subword BPE Tokens\n","\n","- One of the most powerful improvements for agglutinative languages (agglutination is a feature of most Bantu languages) is BPE tokenization [(Sennrich, 2015)](https://arxiv.org/abs/1508.07909).\n","\n","- It has also been shown that optimizing the number of BPE codes significantly improves results for low-resourced languages [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021) [(Martinus, 2019)](https://arxiv.org/abs/1906.05685).\n","\n","- Below are the scripts for BPE tokenization of our data. We use 4000 BPE codes as recommended by [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021). You do not need to change anything; simply run the cell below. 
"]},{"cell_type":"code","metadata":{"colab_type":"code","id":"H-TyjtmXB1mL","outputId":"1095af87-d66a-47a1-c057-51f0d7e220e3","executionInfo":{"status":"ok","timestamp":1575230237352,"user_tz":-120,"elapsed":848343,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":428}},"source":["# One of the huge boosts in NMT performance came from a different method of tokenizing. \n","# Usually, NMT would tokenize by words. However, using a method called BPE gave amazing boosts to performance\n","\n","# Do subword NMT\n","import os\n","from os import path\n","os.environ[\"src\"] = source_language # Sets them in bash as well, since we often use bash scripts\n","os.environ[\"tgt\"] = target_language\n","\n","# Learn BPEs on the training data.\n","os.environ[\"data_path\"] = path.join(\"joeynmt\", \"data\", source_language + target_language) # Herman! \n","! subword-nmt learn-joint-bpe-and-vocab --input train.$src train.$tgt -s 4000 -o bpe.codes.4000 --write-vocabulary vocab.$src vocab.$tgt\n","\n","# Apply BPE splits to the training, development and test data.\n","! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < train.$src > train.bpe.$src\n","! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < train.$tgt > train.bpe.$tgt\n","\n","! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < dev.$src > dev.bpe.$src\n","! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < dev.$tgt > dev.bpe.$tgt\n","! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$src < test.$src > test.bpe.$src\n","! subword-nmt apply-bpe -c bpe.codes.4000 --vocabulary vocab.$tgt < test.$tgt > test.bpe.$tgt\n","\n","# Create directory, move everything we care about to the correct location\n","! mkdir -p $data_path\n","! cp train.* $data_path\n","! cp test.* $data_path\n","! cp dev.* $data_path\n","! 
cp bpe.codes.4000 $data_path\n","! ls $data_path\n","\n","# Also move everything we care about to a mounted location in google drive (relevant if running in colab) at gdrive_path\n","! cp train.* \"$gdrive_path\"\n","! cp test.* \"$gdrive_path\"\n","! cp dev.* \"$gdrive_path\"\n","! cp bpe.codes.4000 \"$gdrive_path\"\n","! ls \"$gdrive_path\"\n","\n","# Create that vocab using build_vocab\n","! sudo chmod 777 joeynmt/scripts/build_vocab.py\n","! joeynmt/scripts/build_vocab.py joeynmt/data/$src$tgt/train.bpe.$src joeynmt/data/$src$tgt/train.bpe.$tgt --output_path joeynmt/data/$src$tgt/vocab.txt\n","\n","# Some output\n","! echo \"BPE Shona Sentences\"\n","! tail -n 5 test.bpe.$tgt\n","! echo \"Combined BPE Vocab\"\n","! tail -n 10 joeynmt/data/$src$tgt/vocab.txt # Herman"],"execution_count":0,"outputs":[{"output_type":"stream","text":["bpe.codes.4000\tdev.en\t test.bpe.sn test.sn\t train.en\n","dev.bpe.en\tdev.sn\t test.en\t train.bpe.en train.sn\n","dev.bpe.sn\ttest.bpe.en test.en-any.en train.bpe.sn\n","bpe.codes.4000\tdev.en\ttest.bpe.en test.en-any.en train.bpe.sn\n","dev.bpe.en\tdev.sn\ttest.bpe.sn test.sn\t train.en\n","dev.bpe.sn\tmodels\ttest.en train.bpe.en train.sn\n","BPE Shona Sentences\n","N@@ ho@@ o huru yo@@ kutenda ( Ona ndima 12 - 14 )\n","N@@ go@@ wan@@ i yor@@ up@@ on@@ eso ( Ona ndima 15 - 18 )\n","Ndaka@@ ona kuti vanhu vano@@ wanz@@ ot@@ eerera kana vaka@@ ona kuti un@@ ony@@ ats@@ ot@@ aura zviri muBhaibheri ne@@ chi@@ do uye kuti uri ku@@ edza zv@@ ese zva@@ unogona kuti u@@ vab@@ ats@@ ire . 
”\n","B@@ akat@@ wa rem@@ weya ( Ona ndima 19 - 20 )\n","T@@ ich@@ ib@@ atsirwa naJehovha tinogona kum@@ ira t@@ akasimba pa@@ kur@@ wisana naye .\n","Combined BPE Vocab\n","❍\n","α\n","ι\n","muchira\n","›\n","Ā@@\n","▲\n","◀\n","̀@@\n",";@@\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"IlMitUHR8Qy-","outputId":"e089dfda-6365-4a1f-a259-4d4d49bd0780","executionInfo":{"status":"ok","timestamp":1575230254334,"user_tz":-120,"elapsed":865313,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":68}},"source":["# Also move everything we care about to a mounted location in google drive (relevant if running in colab) at gdrive_path\n","! cp train.* \"$gdrive_path\"\n","! cp test.* \"$gdrive_path\"\n","! cp dev.* \"$gdrive_path\"\n","! cp bpe.codes.4000 \"$gdrive_path\"\n","! ls \"$gdrive_path\""],"execution_count":0,"outputs":[{"output_type":"stream","text":["bpe.codes.4000\tdev.en\ttest.bpe.en test.en-any.en train.bpe.sn\n","dev.bpe.en\tdev.sn\ttest.bpe.sn test.sn\t train.en\n","dev.bpe.sn\tmodels\ttest.en train.bpe.en train.sn\n"],"name":"stdout"}]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"Ixmzi60WsUZ8"},"source":["# Creating the JoeyNMT Config\n","\n","JoeyNMT requires a yaml config. We provide a template below. 
We've also set a number of defaults in it that you may play with!\n","\n","- We used the Transformer architecture \n","- We set our dropout reasonably high: 0.3 (recommended in [(Sennrich, 2019)](https://www.aclweb.org/anthology/P19-1021))\n","\n","Things worth playing with:\n","- The batch size (also recommended to change for low-resourced languages)\n","- The number of epochs (we've set it at 30 just so it runs in about an hour, for testing purposes)\n","- The decoder options (beam_size, alpha)\n","- Evaluation metrics (BLEU versus ChrF)"]},{"cell_type":"code","metadata":{"colab_type":"code","id":"PIs1lY2hxMsl","colab":{}},"source":["# This creates the config file for our JoeyNMT system. It might seem overwhelming so we've provided a couple of useful parameters you'll need to update\n","# (You can of course play with all the parameters if you'd like!)\n","\n","name = '%s%s' % (source_language, target_language)\n","gdrive_path = os.environ[\"gdrive_path\"]\n","\n","# Create the config\n","config = \"\"\"\n","name: \"{name}_transformer\"\n","\n","data:\n","    src: \"{source_language}\"\n","    trg: \"{target_language}\"\n","    train: \"data/{name}/train.bpe\"\n","    dev: \"data/{name}/dev.bpe\"\n","    test: \"data/{name}/test.bpe\"\n","    level: \"bpe\"\n","    lowercase: False\n","    max_sent_length: 100\n","    src_vocab: \"data/{name}/vocab.txt\"\n","    trg_vocab: \"data/{name}/vocab.txt\"\n","\n","testing:\n","    beam_size: 5\n","    alpha: 1.0\n","\n","training:\n","    load_model: \"{gdrive_path}/models/{name}_transformer/22000.ckpt\" # resumes training from this checkpoint; comment this line out to train from scratch\n","    random_seed: 42\n","    optimizer: \"adam\"\n","    normalization: \"tokens\"\n","    adam_betas: [0.9, 0.999] \n","    scheduling: \"plateau\" # TODO: try switching from plateau to Noam scheduling\n","    patience: 5 # For plateau: decrease learning rate by decrease_factor if validation score has not improved for this many validation rounds.\n","    learning_rate_factor: 0.5 # factor for Noam scheduler 
(used with Transformer)\n","    learning_rate_warmup: 1000 # warmup steps for Noam scheduler (used with Transformer)\n","    decrease_factor: 0.7\n","    loss: \"crossentropy\"\n","    learning_rate: 0.0003\n","    learning_rate_min: 0.00000001\n","    weight_decay: 0.0\n","    label_smoothing: 0.1\n","    batch_size: 4096\n","    batch_type: \"token\"\n","    eval_batch_size: 3600\n","    eval_batch_type: \"token\"\n","    batch_multiplier: 1\n","    early_stopping_metric: \"ppl\"\n","    epochs: 30 # TODO: Decrease when playing around and checking that things work. Around 30 is sufficient to check if it's working at all\n","    validation_freq: 1000 # TODO: Set to at least once per epoch.\n","    logging_freq: 100\n","    eval_metric: \"bleu\"\n","    model_dir: \"models/{name}_transformer\"\n","    overwrite: False # TODO: Set to True if you want to overwrite possibly existing models. \n","    shuffle: True\n","    use_cuda: True\n","    max_output_length: 100\n","    print_valid_sents: [0, 1, 2, 3]\n","    keep_last_ckpts: 3\n","\n","model:\n","    initializer: \"xavier\"\n","    bias_initializer: \"zeros\"\n","    init_gain: 1.0\n","    embed_initializer: \"xavier\"\n","    embed_init_gain: 1.0\n","    tied_embeddings: True\n","    tied_softmax: True\n","    encoder:\n","        type: \"transformer\"\n","        num_layers: 6\n","        num_heads: 4 # TODO: Increase to 8 for larger data.\n","        embeddings:\n","            embedding_dim: 256 # TODO: Increase to 512 for larger data.\n","            scale: True\n","            dropout: 0.2\n","        # typically ff_size = 4 x hidden_size\n","        hidden_size: 256 # TODO: Increase to 512 for larger data.\n","        ff_size: 1024 # TODO: Increase to 2048 for larger data.\n","        dropout: 0.3\n","    decoder:\n","        type: \"transformer\"\n","        num_layers: 6\n","        num_heads: 4 # TODO: Increase to 8 for larger data.\n","        embeddings:\n","            embedding_dim: 256 # TODO: Increase to 512 for larger data.\n","            scale: True\n","            dropout: 0.2\n","        # typically ff_size = 4 x hidden_size\n","        hidden_size: 256 # TODO: Increase to 512 for larger data.\n","        ff_size: 1024 # TODO: Increase to 2048 for larger 
data.\n","        dropout: 0.3\n","\"\"\".format(name=name, gdrive_path=os.environ[\"gdrive_path\"], source_language=source_language, target_language=target_language)\n","with open(\"joeynmt/configs/transformer_{name}.yaml\".format(name=name),'w') as f:\n","    f.write(config)"],"execution_count":0,"outputs":[]},{"cell_type":"markdown","metadata":{"colab_type":"text","id":"pIifxE3Qzuvs"},"source":["# Train the Model\n","\n","This single line of joeynmt runs the training using the config we made above"]},{"cell_type":"code","metadata":{"colab_type":"code","id":"6ZBPFwT94WpI","outputId":"6a3eaa0d-3e14-4bcf-ebf7-5232baa0fd58","executionInfo":{"status":"ok","timestamp":1575232969701,"user_tz":-120,"elapsed":3580646,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":1000}},"source":["# Train the model\n","# You can press Ctrl-C to stop. And then run the next cell to save your checkpoints! \n","!cd joeynmt; python3 -m joeynmt train configs/transformer_$src$tgt.yaml"],"execution_count":0,"outputs":[{"output_type":"stream","text":["2019-12-01 19:57:55,436 Hello! 
This is Joey-NMT.\n","2019-12-01 19:57:56,938 Total params: 12196608\n","2019-12-01 19:57:56,940 Trainable parameters: ['decoder.layer_norm.bias', 'decoder.layer_norm.weight', 'decoder.layers.0.dec_layer_norm.bias', 'decoder.layers.0.dec_layer_norm.weight', 'decoder.layers.0.feed_forward.layer_norm.bias', 'decoder.layers.0.feed_forward.layer_norm.weight', 'decoder.layers.0.feed_forward.pwff_layer.0.bias', 'decoder.layers.0.feed_forward.pwff_layer.0.weight', 'decoder.layers.0.feed_forward.pwff_layer.3.bias', 'decoder.layers.0.feed_forward.pwff_layer.3.weight', 'decoder.layers.0.src_trg_att.k_layer.bias', 'decoder.layers.0.src_trg_att.k_layer.weight', 'decoder.layers.0.src_trg_att.output_layer.bias', 'decoder.layers.0.src_trg_att.output_layer.weight', 'decoder.layers.0.src_trg_att.q_layer.bias', 'decoder.layers.0.src_trg_att.q_layer.weight', 'decoder.layers.0.src_trg_att.v_layer.bias', 'decoder.layers.0.src_trg_att.v_layer.weight', 'decoder.layers.0.trg_trg_att.k_layer.bias', 'decoder.layers.0.trg_trg_att.k_layer.weight', 'decoder.layers.0.trg_trg_att.output_layer.bias', 'decoder.layers.0.trg_trg_att.output_layer.weight', 'decoder.layers.0.trg_trg_att.q_layer.bias', 'decoder.layers.0.trg_trg_att.q_layer.weight', 'decoder.layers.0.trg_trg_att.v_layer.bias', 'decoder.layers.0.trg_trg_att.v_layer.weight', 'decoder.layers.0.x_layer_norm.bias', 'decoder.layers.0.x_layer_norm.weight', 'decoder.layers.1.dec_layer_norm.bias', 'decoder.layers.1.dec_layer_norm.weight', 'decoder.layers.1.feed_forward.layer_norm.bias', 'decoder.layers.1.feed_forward.layer_norm.weight', 'decoder.layers.1.feed_forward.pwff_layer.0.bias', 'decoder.layers.1.feed_forward.pwff_layer.0.weight', 'decoder.layers.1.feed_forward.pwff_layer.3.bias', 'decoder.layers.1.feed_forward.pwff_layer.3.weight', 'decoder.layers.1.src_trg_att.k_layer.bias', 'decoder.layers.1.src_trg_att.k_layer.weight', 'decoder.layers.1.src_trg_att.output_layer.bias', 'decoder.layers.1.src_trg_att.output_layer.weight', 
'decoder.layers.1.src_trg_att.q_layer.bias', 'decoder.layers.1.src_trg_att.q_layer.weight', 'decoder.layers.1.src_trg_att.v_layer.bias', 'decoder.layers.1.src_trg_att.v_layer.weight', 'decoder.layers.1.trg_trg_att.k_layer.bias', 'decoder.layers.1.trg_trg_att.k_layer.weight', 'decoder.layers.1.trg_trg_att.output_layer.bias', 'decoder.layers.1.trg_trg_att.output_layer.weight', 'decoder.layers.1.trg_trg_att.q_layer.bias', 'decoder.layers.1.trg_trg_att.q_layer.weight', 'decoder.layers.1.trg_trg_att.v_layer.bias', 'decoder.layers.1.trg_trg_att.v_layer.weight', 'decoder.layers.1.x_layer_norm.bias', 'decoder.layers.1.x_layer_norm.weight', 'decoder.layers.2.dec_layer_norm.bias', 'decoder.layers.2.dec_layer_norm.weight', 'decoder.layers.2.feed_forward.layer_norm.bias', 'decoder.layers.2.feed_forward.layer_norm.weight', 'decoder.layers.2.feed_forward.pwff_layer.0.bias', 'decoder.layers.2.feed_forward.pwff_layer.0.weight', 'decoder.layers.2.feed_forward.pwff_layer.3.bias', 'decoder.layers.2.feed_forward.pwff_layer.3.weight', 'decoder.layers.2.src_trg_att.k_layer.bias', 'decoder.layers.2.src_trg_att.k_layer.weight', 'decoder.layers.2.src_trg_att.output_layer.bias', 'decoder.layers.2.src_trg_att.output_layer.weight', 'decoder.layers.2.src_trg_att.q_layer.bias', 'decoder.layers.2.src_trg_att.q_layer.weight', 'decoder.layers.2.src_trg_att.v_layer.bias', 'decoder.layers.2.src_trg_att.v_layer.weight', 'decoder.layers.2.trg_trg_att.k_layer.bias', 'decoder.layers.2.trg_trg_att.k_layer.weight', 'decoder.layers.2.trg_trg_att.output_layer.bias', 'decoder.layers.2.trg_trg_att.output_layer.weight', 'decoder.layers.2.trg_trg_att.q_layer.bias', 'decoder.layers.2.trg_trg_att.q_layer.weight', 'decoder.layers.2.trg_trg_att.v_layer.bias', 'decoder.layers.2.trg_trg_att.v_layer.weight', 'decoder.layers.2.x_layer_norm.bias', 'decoder.layers.2.x_layer_norm.weight', 'decoder.layers.3.dec_layer_norm.bias', 'decoder.layers.3.dec_layer_norm.weight', 'decoder.layers.3.feed_forward.layer_norm.bias', 
'decoder.layers.3.feed_forward.layer_norm.weight', 'decoder.layers.3.feed_forward.pwff_layer.0.bias', 'decoder.layers.3.feed_forward.pwff_layer.0.weight', 'decoder.layers.3.feed_forward.pwff_layer.3.bias', 'decoder.layers.3.feed_forward.pwff_layer.3.weight', 'decoder.layers.3.src_trg_att.k_layer.bias', 'decoder.layers.3.src_trg_att.k_layer.weight', 'decoder.layers.3.src_trg_att.output_layer.bias', 'decoder.layers.3.src_trg_att.output_layer.weight', 'decoder.layers.3.src_trg_att.q_layer.bias', 'decoder.layers.3.src_trg_att.q_layer.weight', 'decoder.layers.3.src_trg_att.v_layer.bias', 'decoder.layers.3.src_trg_att.v_layer.weight', 'decoder.layers.3.trg_trg_att.k_layer.bias', 'decoder.layers.3.trg_trg_att.k_layer.weight', 'decoder.layers.3.trg_trg_att.output_layer.bias', 'decoder.layers.3.trg_trg_att.output_layer.weight', 'decoder.layers.3.trg_trg_att.q_layer.bias', 'decoder.layers.3.trg_trg_att.q_layer.weight', 'decoder.layers.3.trg_trg_att.v_layer.bias', 'decoder.layers.3.trg_trg_att.v_layer.weight', 'decoder.layers.3.x_layer_norm.bias', 'decoder.layers.3.x_layer_norm.weight', 'decoder.layers.4.dec_layer_norm.bias', 'decoder.layers.4.dec_layer_norm.weight', 'decoder.layers.4.feed_forward.layer_norm.bias', 'decoder.layers.4.feed_forward.layer_norm.weight', 'decoder.layers.4.feed_forward.pwff_layer.0.bias', 'decoder.layers.4.feed_forward.pwff_layer.0.weight', 'decoder.layers.4.feed_forward.pwff_layer.3.bias', 'decoder.layers.4.feed_forward.pwff_layer.3.weight', 'decoder.layers.4.src_trg_att.k_layer.bias', 'decoder.layers.4.src_trg_att.k_layer.weight', 'decoder.layers.4.src_trg_att.output_layer.bias', 'decoder.layers.4.src_trg_att.output_layer.weight', 'decoder.layers.4.src_trg_att.q_layer.bias', 'decoder.layers.4.src_trg_att.q_layer.weight', 'decoder.layers.4.src_trg_att.v_layer.bias', 'decoder.layers.4.src_trg_att.v_layer.weight', 'decoder.layers.4.trg_trg_att.k_layer.bias', 'decoder.layers.4.trg_trg_att.k_layer.weight', 
'decoder.layers.4.trg_trg_att.output_layer.bias', 'decoder.layers.4.trg_trg_att.output_layer.weight', 'decoder.layers.4.trg_trg_att.q_layer.bias', 'decoder.layers.4.trg_trg_att.q_layer.weight', 'decoder.layers.4.trg_trg_att.v_layer.bias', 'decoder.layers.4.trg_trg_att.v_layer.weight', 'decoder.layers.4.x_layer_norm.bias', 'decoder.layers.4.x_layer_norm.weight', 'decoder.layers.5.dec_layer_norm.bias', 'decoder.layers.5.dec_layer_norm.weight', 'decoder.layers.5.feed_forward.layer_norm.bias', 'decoder.layers.5.feed_forward.layer_norm.weight', 'decoder.layers.5.feed_forward.pwff_layer.0.bias', 'decoder.layers.5.feed_forward.pwff_layer.0.weight', 'decoder.layers.5.feed_forward.pwff_layer.3.bias', 'decoder.layers.5.feed_forward.pwff_layer.3.weight', 'decoder.layers.5.src_trg_att.k_layer.bias', 'decoder.layers.5.src_trg_att.k_layer.weight', 'decoder.layers.5.src_trg_att.output_layer.bias', 'decoder.layers.5.src_trg_att.output_layer.weight', 'decoder.layers.5.src_trg_att.q_layer.bias', 'decoder.layers.5.src_trg_att.q_layer.weight', 'decoder.layers.5.src_trg_att.v_layer.bias', 'decoder.layers.5.src_trg_att.v_layer.weight', 'decoder.layers.5.trg_trg_att.k_layer.bias', 'decoder.layers.5.trg_trg_att.k_layer.weight', 'decoder.layers.5.trg_trg_att.output_layer.bias', 'decoder.layers.5.trg_trg_att.output_layer.weight', 'decoder.layers.5.trg_trg_att.q_layer.bias', 'decoder.layers.5.trg_trg_att.q_layer.weight', 'decoder.layers.5.trg_trg_att.v_layer.bias', 'decoder.layers.5.trg_trg_att.v_layer.weight', 'decoder.layers.5.x_layer_norm.bias', 'decoder.layers.5.x_layer_norm.weight', 'encoder.layer_norm.bias', 'encoder.layer_norm.weight', 'encoder.layers.0.feed_forward.layer_norm.bias', 'encoder.layers.0.feed_forward.layer_norm.weight', 'encoder.layers.0.feed_forward.pwff_layer.0.bias', 'encoder.layers.0.feed_forward.pwff_layer.0.weight', 'encoder.layers.0.feed_forward.pwff_layer.3.bias', 'encoder.layers.0.feed_forward.pwff_layer.3.weight', 'encoder.layers.0.layer_norm.bias', 
'encoder.layers.0.layer_norm.weight', 'encoder.layers.0.src_src_att.k_layer.bias', 'encoder.layers.0.src_src_att.k_layer.weight', 'encoder.layers.0.src_src_att.output_layer.bias', 'encoder.layers.0.src_src_att.output_layer.weight', 'encoder.layers.0.src_src_att.q_layer.bias', 'encoder.layers.0.src_src_att.q_layer.weight', 'encoder.layers.0.src_src_att.v_layer.bias', 'encoder.layers.0.src_src_att.v_layer.weight', 'encoder.layers.1.feed_forward.layer_norm.bias', 'encoder.layers.1.feed_forward.layer_norm.weight', 'encoder.layers.1.feed_forward.pwff_layer.0.bias', 'encoder.layers.1.feed_forward.pwff_layer.0.weight', 'encoder.layers.1.feed_forward.pwff_layer.3.bias', 'encoder.layers.1.feed_forward.pwff_layer.3.weight', 'encoder.layers.1.layer_norm.bias', 'encoder.layers.1.layer_norm.weight', 'encoder.layers.1.src_src_att.k_layer.bias', 'encoder.layers.1.src_src_att.k_layer.weight', 'encoder.layers.1.src_src_att.output_layer.bias', 'encoder.layers.1.src_src_att.output_layer.weight', 'encoder.layers.1.src_src_att.q_layer.bias', 'encoder.layers.1.src_src_att.q_layer.weight', 'encoder.layers.1.src_src_att.v_layer.bias', 'encoder.layers.1.src_src_att.v_layer.weight', 'encoder.layers.2.feed_forward.layer_norm.bias', 'encoder.layers.2.feed_forward.layer_norm.weight', 'encoder.layers.2.feed_forward.pwff_layer.0.bias', 'encoder.layers.2.feed_forward.pwff_layer.0.weight', 'encoder.layers.2.feed_forward.pwff_layer.3.bias', 'encoder.layers.2.feed_forward.pwff_layer.3.weight', 'encoder.layers.2.layer_norm.bias', 'encoder.layers.2.layer_norm.weight', 'encoder.layers.2.src_src_att.k_layer.bias', 'encoder.layers.2.src_src_att.k_layer.weight', 'encoder.layers.2.src_src_att.output_layer.bias', 'encoder.layers.2.src_src_att.output_layer.weight', 'encoder.layers.2.src_src_att.q_layer.bias', 'encoder.layers.2.src_src_att.q_layer.weight', 'encoder.layers.2.src_src_att.v_layer.bias', 'encoder.layers.2.src_src_att.v_layer.weight', 'encoder.layers.3.feed_forward.layer_norm.bias', 
'encoder.layers.3.feed_forward.layer_norm.weight', 'encoder.layers.3.feed_forward.pwff_layer.0.bias', 'encoder.layers.3.feed_forward.pwff_layer.0.weight', 'encoder.layers.3.feed_forward.pwff_layer.3.bias', 'encoder.layers.3.feed_forward.pwff_layer.3.weight', 'encoder.layers.3.layer_norm.bias', 'encoder.layers.3.layer_norm.weight', 'encoder.layers.3.src_src_att.k_layer.bias', 'encoder.layers.3.src_src_att.k_layer.weight', 'encoder.layers.3.src_src_att.output_layer.bias', 'encoder.layers.3.src_src_att.output_layer.weight', 'encoder.layers.3.src_src_att.q_layer.bias', 'encoder.layers.3.src_src_att.q_layer.weight', 'encoder.layers.3.src_src_att.v_layer.bias', 'encoder.layers.3.src_src_att.v_layer.weight', 'encoder.layers.4.feed_forward.layer_norm.bias', 'encoder.layers.4.feed_forward.layer_norm.weight', 'encoder.layers.4.feed_forward.pwff_layer.0.bias', 'encoder.layers.4.feed_forward.pwff_layer.0.weight', 'encoder.layers.4.feed_forward.pwff_layer.3.bias', 'encoder.layers.4.feed_forward.pwff_layer.3.weight', 'encoder.layers.4.layer_norm.bias', 'encoder.layers.4.layer_norm.weight', 'encoder.layers.4.src_src_att.k_layer.bias', 'encoder.layers.4.src_src_att.k_layer.weight', 'encoder.layers.4.src_src_att.output_layer.bias', 'encoder.layers.4.src_src_att.output_layer.weight', 'encoder.layers.4.src_src_att.q_layer.bias', 'encoder.layers.4.src_src_att.q_layer.weight', 'encoder.layers.4.src_src_att.v_layer.bias', 'encoder.layers.4.src_src_att.v_layer.weight', 'encoder.layers.5.feed_forward.layer_norm.bias', 'encoder.layers.5.feed_forward.layer_norm.weight', 'encoder.layers.5.feed_forward.pwff_layer.0.bias', 'encoder.layers.5.feed_forward.pwff_layer.0.weight', 'encoder.layers.5.feed_forward.pwff_layer.3.bias', 'encoder.layers.5.feed_forward.pwff_layer.3.weight', 'encoder.layers.5.layer_norm.bias', 'encoder.layers.5.layer_norm.weight', 'encoder.layers.5.src_src_att.k_layer.bias', 'encoder.layers.5.src_src_att.k_layer.weight', 'encoder.layers.5.src_src_att.output_layer.bias', 
'encoder.layers.5.src_src_att.output_layer.weight', 'encoder.layers.5.src_src_att.q_layer.bias', 'encoder.layers.5.src_src_att.q_layer.weight', 'encoder.layers.5.src_src_att.v_layer.bias', 'encoder.layers.5.src_src_att.v_layer.weight', 'src_embed.lut.weight']\n","2019-12-01 19:58:02,185 Loading model from /content/drive/My Drive/masakhane/en-sn-baseline/models/ensn_transformer/22000.ckpt\n","2019-12-01 19:58:04,641 cfg.name : ensn_transformer\n","2019-12-01 19:58:04,641 cfg.data.src : en\n","2019-12-01 19:58:04,641 cfg.data.trg : sn\n","2019-12-01 19:58:04,641 cfg.data.train : data/ensn/train.bpe\n","2019-12-01 19:58:04,641 cfg.data.dev : data/ensn/dev.bpe\n","2019-12-01 19:58:04,641 cfg.data.test : data/ensn/test.bpe\n","2019-12-01 19:58:04,641 cfg.data.level : bpe\n","2019-12-01 19:58:04,641 cfg.data.lowercase : False\n","2019-12-01 19:58:04,641 cfg.data.max_sent_length : 100\n","2019-12-01 19:58:04,641 cfg.data.src_vocab : data/ensn/vocab.txt\n","2019-12-01 19:58:04,642 cfg.data.trg_vocab : data/ensn/vocab.txt\n","2019-12-01 19:58:04,642 cfg.testing.beam_size : 5\n","2019-12-01 19:58:04,642 cfg.testing.alpha : 1.0\n","2019-12-01 19:58:04,642 cfg.training.load_model : /content/drive/My Drive/masakhane/en-sn-baseline/models/ensn_transformer/22000.ckpt\n","2019-12-01 19:58:04,642 cfg.training.random_seed : 42\n","2019-12-01 19:58:04,642 cfg.training.optimizer : adam\n","2019-12-01 19:58:04,642 cfg.training.normalization : tokens\n","2019-12-01 19:58:04,642 cfg.training.adam_betas : [0.9, 0.999]\n","2019-12-01 19:58:04,642 cfg.training.scheduling : plateau\n","2019-12-01 19:58:04,642 cfg.training.patience : 5\n","2019-12-01 19:58:04,642 cfg.training.learning_rate_factor : 0.5\n","2019-12-01 19:58:04,642 cfg.training.learning_rate_warmup : 1000\n","2019-12-01 19:58:04,642 cfg.training.decrease_factor : 0.7\n","2019-12-01 19:58:04,643 cfg.training.loss : crossentropy\n","2019-12-01 19:58:04,643 cfg.training.learning_rate : 0.0003\n","2019-12-01 19:58:04,643 
cfg.training.learning_rate_min : 1e-08\n","2019-12-01 19:58:04,643 cfg.training.weight_decay : 0.0\n","2019-12-01 19:58:04,643 cfg.training.label_smoothing : 0.1\n","2019-12-01 19:58:04,643 cfg.training.batch_size : 4096\n","2019-12-01 19:58:04,643 cfg.training.batch_type : token\n","2019-12-01 19:58:04,643 cfg.training.eval_batch_size : 3600\n","2019-12-01 19:58:04,643 cfg.training.eval_batch_type : token\n","2019-12-01 19:58:04,643 cfg.training.batch_multiplier : 1\n","2019-12-01 19:58:04,643 cfg.training.early_stopping_metric : ppl\n","2019-12-01 19:58:04,643 cfg.training.epochs : 30\n","2019-12-01 19:58:04,643 cfg.training.validation_freq : 1000\n","2019-12-01 19:58:04,643 cfg.training.logging_freq : 100\n","2019-12-01 19:58:04,643 cfg.training.eval_metric : bleu\n","2019-12-01 19:58:04,644 cfg.training.model_dir : models/ensn_transformer\n","2019-12-01 19:58:04,644 cfg.training.overwrite : False\n","2019-12-01 19:58:04,644 cfg.training.shuffle : True\n","2019-12-01 19:58:04,644 cfg.training.use_cuda : True\n","2019-12-01 19:58:04,644 cfg.training.max_output_length : 100\n","2019-12-01 19:58:04,644 cfg.training.print_valid_sents : [0, 1, 2, 3]\n","2019-12-01 19:58:04,644 cfg.training.keep_last_ckpts : 3\n","2019-12-01 19:58:04,644 cfg.model.initializer : xavier\n","2019-12-01 19:58:04,644 cfg.model.bias_initializer : zeros\n","2019-12-01 19:58:04,644 cfg.model.init_gain : 1.0\n","2019-12-01 19:58:04,644 cfg.model.embed_initializer : xavier\n","2019-12-01 19:58:04,644 cfg.model.embed_init_gain : 1.0\n","2019-12-01 19:58:04,644 cfg.model.tied_embeddings : True\n","2019-12-01 19:58:04,644 cfg.model.tied_softmax : True\n","2019-12-01 19:58:04,645 cfg.model.encoder.type : transformer\n","2019-12-01 19:58:04,645 cfg.model.encoder.num_layers : 6\n","2019-12-01 19:58:04,645 cfg.model.encoder.num_heads : 4\n","2019-12-01 19:58:04,645 cfg.model.encoder.embeddings.embedding_dim : 256\n","2019-12-01 19:58:04,645 cfg.model.encoder.embeddings.scale : True\n","2019-12-01 
19:58:04,645 cfg.model.encoder.embeddings.dropout : 0.2\n","2019-12-01 19:58:04,645 cfg.model.encoder.hidden_size : 256\n","2019-12-01 19:58:04,645 cfg.model.encoder.ff_size : 1024\n","2019-12-01 19:58:04,645 cfg.model.encoder.dropout : 0.3\n","2019-12-01 19:58:04,645 cfg.model.decoder.type : transformer\n","2019-12-01 19:58:04,645 cfg.model.decoder.num_layers : 6\n","2019-12-01 19:58:04,645 cfg.model.decoder.num_heads : 4\n","2019-12-01 19:58:04,645 cfg.model.decoder.embeddings.embedding_dim : 256\n","2019-12-01 19:58:04,645 cfg.model.decoder.embeddings.scale : True\n","2019-12-01 19:58:04,646 cfg.model.decoder.embeddings.dropout : 0.2\n","2019-12-01 19:58:04,646 cfg.model.decoder.hidden_size : 256\n","2019-12-01 19:58:04,646 cfg.model.decoder.ff_size : 1024\n","2019-12-01 19:58:04,646 cfg.model.decoder.dropout : 0.3\n","2019-12-01 19:58:04,646 Data set sizes: \n","\ttrain 712455,\n","\tvalid 1000,\n","\ttest 2723\n","2019-12-01 19:58:04,646 First training example:\n","\t[SRC] P@@ le@@ as@@ ed that the king asked for wisdom rather than for rich@@ es and gl@@ ory , God gave Solom@@ on “ a wise and understanding heart ” ​ — as well as pros@@ per@@ ity .\n","\t[TRG] Mwari aka@@ f@@ adzwa ne@@ chi@@ kumb@@ iro cha@@ Soromoni aka@@ mu@@ pa “ mwoyo waka@@ chenjera , un@@ onz@@ wisisa , ” uye pfuma ne@@ mu@@ kurumbira zva@@ a@@ inge asina ku@@ kumbira .\n","2019-12-01 19:58:04,646 First 10 words (src): (0) (1) (2) (3) (4) . (5) , (6) the (7) to (8) of (9) “\n","2019-12-01 19:58:04,646 First 10 words (trg): (0) (1) (2) (3) (4) . 
(5) , (6) the (7) to (8) of (9) “\n","2019-12-01 19:58:04,646 Number of Src words (types): 4439\n","2019-12-01 19:58:04,646 Number of Trg words (types): 4439\n","2019-12-01 19:58:04,646 Model(\n","\tencoder=TransformerEncoder(num_layers=6, num_heads=4),\n","\tdecoder=TransformerDecoder(num_layers=6, num_heads=4),\n","\tsrc_embed=Embeddings(embedding_dim=256, vocab_size=4439),\n","\ttrg_embed=Embeddings(embedding_dim=256, vocab_size=4439))\n","2019-12-01 19:58:04,651 EPOCH 1\n","2019-12-01 19:58:38,505 Epoch 1 Step: 22100 Batch Loss: 1.743695 Tokens per Sec: 7680, Lr: 0.000300\n","2019-12-01 19:59:10,626 Epoch 1 Step: 22200 Batch Loss: 1.859596 Tokens per Sec: 7970, Lr: 0.000300\n","2019-12-01 19:59:43,028 Epoch 1 Step: 22300 Batch Loss: 1.938025 Tokens per Sec: 7950, Lr: 0.000300\n","2019-12-01 20:00:15,837 Epoch 1 Step: 22400 Batch Loss: 1.684646 Tokens per Sec: 7925, Lr: 0.000300\n","2019-12-01 20:00:48,375 Epoch 1 Step: 22500 Batch Loss: 1.939360 Tokens per Sec: 8050, Lr: 0.000300\n","2019-12-01 20:01:20,765 Epoch 1 Step: 22600 Batch Loss: 2.074338 Tokens per Sec: 7928, Lr: 0.000300\n","2019-12-01 20:01:53,403 Epoch 1 Step: 22700 Batch Loss: 1.627862 Tokens per Sec: 8054, Lr: 0.000300\n","2019-12-01 20:02:25,733 Epoch 1 Step: 22800 Batch Loss: 1.931538 Tokens per Sec: 7991, Lr: 0.000300\n","2019-12-01 20:02:58,393 Epoch 1 Step: 22900 Batch Loss: 2.050825 Tokens per Sec: 7901, Lr: 0.000300\n","2019-12-01 20:03:30,362 Epoch 1 Step: 23000 Batch Loss: 1.923352 Tokens per Sec: 7845, Lr: 0.000300\n","2019-12-01 20:05:07,357 Hooray! 
New best validation result [ppl]!\n","2019-12-01 20:05:07,358 Saving new checkpoint.\n","2019-12-01 20:05:07,649 Example #0\n","2019-12-01 20:05:07,649 \tSource: In some cultures , birthstones are associated with the month of one’s birth .\n","2019-12-01 20:05:07,649 \tReference: Mune dzimwe tsika , matombo omwedzi wawakazvarwa anonzi ane chokuita nomwedzi wakazvarwa munhu .\n","2019-12-01 20:05:07,650 \tHypothesis: Mune dzimwe tsika , mapoka anobatanidzwawo nemwedzi womucheche .\n","2019-12-01 20:05:07,650 Example #1\n","2019-12-01 20:05:07,650 \tSource: What will help us to appreciate that God watches over us because he loves us ? Let us consider how he shows this .\n","2019-12-01 20:05:07,650 \tReference: Chii chichatibatsira kunzwisisa kuti kutariswa kwatinoitwa naMwari kunoratidza kuti anotida ?\n","2019-12-01 20:05:07,650 \tHypothesis: Chii chichatibatsira kunzwisisa kuti Mwari anotiona nemhaka yokuti anotida ?\n","2019-12-01 20:05:07,650 Example #2\n","2019-12-01 20:05:07,651 \tSource: ; Keyzer , M .\n","2019-12-01 20:05:07,651 \tReference: ; Keyzer , M .\n","2019-12-01 20:05:07,651 \tHypothesis: ; Keyzer , M .\n","2019-12-01 20:05:07,651 Example #3\n","2019-12-01 20:05:07,652 \tSource: We can give simply by sincerely asking how our friends are , by trying to understand their problems , and by doing all we can without waiting for them to ask . ”\n","2019-12-01 20:05:07,652 \tReference: Tinogona kupa kana tikangobvunza nomwoyo wose shamwari dzedu kuti dzakadini , kana tikaedza kunzwisisa zvinetso zvadzo , uye kana tikaita zvose zvatinogona tisingamiriri kuti dzikumbire . ”\n","2019-12-01 20:05:07,652 \tHypothesis: Tinogona kupa bedzi kupfurikidza nokukumbira nenzira yokusimbisa kuti shamwari dzedu dzinosvika sei , kupfurikidza nokuedza kunzwisisa zvinetso zvavo , uye kupfurikidza nokuita zvose zvatinogona kuvasati vakumbira . 
”\n","2019-12-01 20:05:07,652 Validation result (greedy) at epoch 1, step 23000: bleu: 24.14, loss: 43281.3594, ppl: 4.7753, duration: 97.2904s\n","2019-12-01 20:05:40,030 Epoch 1 Step: 23100 Batch Loss: 1.942282 Tokens per Sec: 7926, Lr: 0.000300\n","2019-12-01 20:06:12,684 Epoch 1 Step: 23200 Batch Loss: 1.678775 Tokens per Sec: 7928, Lr: 0.000300\n","2019-12-01 20:06:45,054 Epoch 1 Step: 23300 Batch Loss: 1.532318 Tokens per Sec: 7927, Lr: 0.000300\n","2019-12-01 20:07:17,572 Epoch 1 Step: 23400 Batch Loss: 1.814984 Tokens per Sec: 8031, Lr: 0.000300\n","2019-12-01 20:07:50,087 Epoch 1 Step: 23500 Batch Loss: 1.904505 Tokens per Sec: 7889, Lr: 0.000300\n","2019-12-01 20:08:22,464 Epoch 1 Step: 23600 Batch Loss: 1.821336 Tokens per Sec: 7936, Lr: 0.000300\n","2019-12-01 20:08:54,943 Epoch 1 Step: 23700 Batch Loss: 1.819299 Tokens per Sec: 7984, Lr: 0.000300\n","2019-12-01 20:09:27,579 Epoch 1 Step: 23800 Batch Loss: 1.841228 Tokens per Sec: 7957, Lr: 0.000300\n","2019-12-01 20:09:59,974 Epoch 1 Step: 23900 Batch Loss: 1.698224 Tokens per Sec: 8001, Lr: 0.000300\n","2019-12-01 20:10:32,414 Epoch 1 Step: 24000 Batch Loss: 2.054336 Tokens per Sec: 7993, Lr: 0.000300\n","2019-12-01 20:12:09,321 Hooray! New best validation result [ppl]!\n","2019-12-01 20:12:09,321 Saving new checkpoint.\n","2019-12-01 20:12:09,674 Example #0\n","2019-12-01 20:12:09,674 \tSource: In some cultures , birthstones are associated with the month of one’s birth .\n","2019-12-01 20:12:09,674 \tReference: Mune dzimwe tsika , matombo omwedzi wawakazvarwa anonzi ane chokuita nomwedzi wakazvarwa munhu .\n","2019-12-01 20:12:09,674 \tHypothesis: Mune dzimwe tsika , matombo anoberekwa anosonganirana nemwedzi wokuberekwa kwomunhu .\n","2019-12-01 20:12:09,674 Example #1\n","2019-12-01 20:12:09,675 \tSource: What will help us to appreciate that God watches over us because he loves us ? 
Let us consider how he shows this .\n","2019-12-01 20:12:09,675 \tReference: Chii chichatibatsira kunzwisisa kuti kutariswa kwatinoitwa naMwari kunoratidza kuti anotida ?\n","2019-12-01 20:12:09,675 \tHypothesis: Chii chichatibatsira kunzwisisa kuti Mwari anotiona nemhaka yokuti anotida ?\n","2019-12-01 20:12:09,675 Example #2\n","2019-12-01 20:12:09,675 \tSource: ; Keyzer , M .\n","2019-12-01 20:12:09,675 \tReference: ; Keyzer , M .\n","2019-12-01 20:12:09,675 \tHypothesis: ; Keyzer , M .\n","2019-12-01 20:12:09,675 Example #3\n","2019-12-01 20:12:09,676 \tSource: We can give simply by sincerely asking how our friends are , by trying to understand their problems , and by doing all we can without waiting for them to ask . ”\n","2019-12-01 20:12:09,676 \tReference: Tinogona kupa kana tikangobvunza nomwoyo wose shamwari dzedu kuti dzakadini , kana tikaedza kunzwisisa zvinetso zvadzo , uye kana tikaita zvose zvatinogona tisingamiriri kuti dzikumbire . ”\n","2019-12-01 20:12:09,676 \tHypothesis: Tinogona kungopa bedzi kupfurikidza nokukumbira nenzira yomuzvarirwo nzira dzedu , kupfurikidza nokuedza kunzwisisa zvinetso zvavo , uye kupfurikidza nokuita zvose zvatinogona kuvasingakumirira kukumbira . 
”\n","2019-12-01 20:12:09,676 Validation result (greedy) at epoch 1, step 24000: bleu: 24.49, loss: 42665.7891, ppl: 4.6703, duration: 97.2613s\n","2019-12-01 20:12:42,201 Epoch 1 Step: 24100 Batch Loss: 1.746611 Tokens per Sec: 7884, Lr: 0.000300\n","2019-12-01 20:13:14,847 Epoch 1 Step: 24200 Batch Loss: 1.717344 Tokens per Sec: 7932, Lr: 0.000300\n","2019-12-01 20:13:47,872 Epoch 1 Step: 24300 Batch Loss: 1.880865 Tokens per Sec: 7991, Lr: 0.000300\n","2019-12-01 20:14:20,593 Epoch 1 Step: 24400 Batch Loss: 1.824478 Tokens per Sec: 7877, Lr: 0.000300\n","2019-12-01 20:14:53,431 Epoch 1 Step: 24500 Batch Loss: 1.550000 Tokens per Sec: 7894, Lr: 0.000300\n","2019-12-01 20:15:26,536 Epoch 1 Step: 24600 Batch Loss: 1.910617 Tokens per Sec: 7890, Lr: 0.000300\n","2019-12-01 20:15:59,042 Epoch 1 Step: 24700 Batch Loss: 1.630266 Tokens per Sec: 7825, Lr: 0.000300\n","2019-12-01 20:16:32,133 Epoch 1 Step: 24800 Batch Loss: 1.888249 Tokens per Sec: 8028, Lr: 0.000300\n","2019-12-01 20:17:04,591 Epoch 1 Step: 24900 Batch Loss: 1.740532 Tokens per Sec: 7919, Lr: 0.000300\n","2019-12-01 20:17:36,709 Epoch 1 Step: 25000 Batch Loss: 2.169381 Tokens per Sec: 7881, Lr: 0.000300\n","2019-12-01 20:19:13,696 Hooray! New best validation result [ppl]!\n","2019-12-01 20:19:13,697 Saving new checkpoint.\n","2019-12-01 20:19:14,070 Example #0\n","2019-12-01 20:19:14,070 \tSource: In some cultures , birthstones are associated with the month of one’s birth .\n","2019-12-01 20:19:14,071 \tReference: Mune dzimwe tsika , matombo omwedzi wawakazvarwa anonzi ane chokuita nomwedzi wakazvarwa munhu .\n","2019-12-01 20:19:14,071 \tHypothesis: Mune dzimwe tsika , matombo anoberekwa anobatanidzwa nemwedzi wokuberekwa kwomunhu .\n","2019-12-01 20:19:14,071 Example #1\n","2019-12-01 20:19:14,071 \tSource: What will help us to appreciate that God watches over us because he loves us ? 
Let us consider how he shows this .\n","2019-12-01 20:19:14,071 \tReference: Chii chichatibatsira kunzwisisa kuti kutariswa kwatinoitwa naMwari kunoratidza kuti anotida ?\n","2019-12-01 20:19:14,071 \tHypothesis: Chii chichatibatsira kunzwisisa kuti Mwari anotirinda nemhaka yokuti anotida ?\n","2019-12-01 20:19:14,071 Example #2\n","2019-12-01 20:19:14,072 \tSource: ; Keyzer , M .\n","2019-12-01 20:19:14,072 \tReference: ; Keyzer , M .\n","2019-12-01 20:19:14,072 \tHypothesis: ; Keyzer , M .\n","2019-12-01 20:19:14,072 Example #3\n","2019-12-01 20:19:14,072 \tSource: We can give simply by sincerely asking how our friends are , by trying to understand their problems , and by doing all we can without waiting for them to ask . ”\n","2019-12-01 20:19:14,072 \tReference: Tinogona kupa kana tikangobvunza nomwoyo wose shamwari dzedu kuti dzakadini , kana tikaedza kunzwisisa zvinetso zvadzo , uye kana tikaita zvose zvatinogona tisingamiriri kuti dzikumbire . ”\n","2019-12-01 20:19:14,072 \tHypothesis: Tinogona kupa bedzi kupfurikidza nokukumbira nenzira yokunyatsobvuma kuti shamwari dzedu dzichinyatsonzwisisa zvinetso zvavo , uye kupfurikidza nokuita zvose zvatinogona kuvarega kukumbira . 
”\n","2019-12-01 20:19:14,073 Validation result (greedy) at epoch 1, step 25000: bleu: 24.87, loss: 42263.3516, ppl: 4.6029, duration: 97.3631s\n","2019-12-01 20:19:47,258 Epoch 1 Step: 25100 Batch Loss: 1.641233 Tokens per Sec: 8037, Lr: 0.000300\n","2019-12-01 20:20:19,633 Epoch 1 Step: 25200 Batch Loss: 1.446743 Tokens per Sec: 7952, Lr: 0.000300\n","2019-12-01 20:20:52,034 Epoch 1 Step: 25300 Batch Loss: 1.606183 Tokens per Sec: 7968, Lr: 0.000300\n","2019-12-01 20:21:24,317 Epoch 1 Step: 25400 Batch Loss: 1.853166 Tokens per Sec: 7866, Lr: 0.000300\n","2019-12-01 20:21:56,567 Epoch 1 Step: 25500 Batch Loss: 2.024843 Tokens per Sec: 7853, Lr: 0.000300\n","2019-12-01 20:22:29,395 Epoch 1 Step: 25600 Batch Loss: 1.592446 Tokens per Sec: 8053, Lr: 0.000300\n","2019-12-01 20:23:02,228 Epoch 1 Step: 25700 Batch Loss: 1.692472 Tokens per Sec: 8038, Lr: 0.000300\n","2019-12-01 20:23:34,734 Epoch 1 Step: 25800 Batch Loss: 1.687892 Tokens per Sec: 7993, Lr: 0.000300\n","2019-12-01 20:24:07,175 Epoch 1 Step: 25900 Batch Loss: 2.213653 Tokens per Sec: 7850, Lr: 0.000300\n","2019-12-01 20:24:39,635 Epoch 1 Step: 26000 Batch Loss: 1.917814 Tokens per Sec: 7946, Lr: 0.000300\n","2019-12-01 20:26:16,607 Hooray! New best validation result [ppl]!\n","2019-12-01 20:26:16,607 Saving new checkpoint.\n","2019-12-01 20:26:16,991 Example #0\n","2019-12-01 20:26:16,992 \tSource: In some cultures , birthstones are associated with the month of one’s birth .\n","2019-12-01 20:26:16,992 \tReference: Mune dzimwe tsika , matombo omwedzi wawakazvarwa anonzi ane chokuita nomwedzi wakazvarwa munhu .\n","2019-12-01 20:26:16,992 \tHypothesis: Mune dzimwe tsika , mashura anobatanidzwa nemwedzi wokuberekwa kwomunhu .\n","2019-12-01 20:26:16,992 Example #1\n","2019-12-01 20:26:16,992 \tSource: What will help us to appreciate that God watches over us because he loves us ? 
Let us consider how he shows this .\n","2019-12-01 20:26:16,992 \tReference: Chii chichatibatsira kunzwisisa kuti kutariswa kwatinoitwa naMwari kunoratidza kuti anotida ?\n","2019-12-01 20:26:16,992 \tHypothesis: Chii chichatibatsira kunzwisisa kuti Mwari anotiona nemhaka yokuti anotida ?\n","2019-12-01 20:26:16,992 Example #2\n","2019-12-01 20:26:16,993 \tSource: ; Keyzer , M .\n","2019-12-01 20:26:16,993 \tReference: ; Keyzer , M .\n","2019-12-01 20:26:16,993 \tHypothesis: ; Keyzer , M .\n","2019-12-01 20:26:16,993 Example #3\n","2019-12-01 20:26:16,993 \tSource: We can give simply by sincerely asking how our friends are , by trying to understand their problems , and by doing all we can without waiting for them to ask . ”\n","2019-12-01 20:26:16,993 \tReference: Tinogona kupa kana tikangobvunza nomwoyo wose shamwari dzedu kuti dzakadini , kana tikaedza kunzwisisa zvinetso zvadzo , uye kana tikaita zvose zvatinogona tisingamiriri kuti dzikumbire . ”\n","2019-12-01 20:26:16,993 \tHypothesis: Tinogona bedzi kupa kupfurikidza nokukumbira nenzira yomuzvarirwo kuti shamwari dzedu , kupfurikidza nokuedza kunzwisisa zvinetso zvavo , uye kupfurikidza nokuita zvose zvatinogona kusavamirira kukumbira . 
”\n","2019-12-01 20:26:16,993 Validation result (greedy) at epoch 1, step 26000: bleu: 24.76, loss: 42055.9219, ppl: 4.5686, duration: 97.3584s\n","2019-12-01 20:26:49,136 Epoch 1 Step: 26100 Batch Loss: 1.599009 Tokens per Sec: 7807, Lr: 0.000300\n","2019-12-01 20:27:21,761 Epoch 1 Step: 26200 Batch Loss: 1.853921 Tokens per Sec: 7966, Lr: 0.000300\n","2019-12-01 20:27:54,511 Epoch 1 Step: 26300 Batch Loss: 1.667404 Tokens per Sec: 8009, Lr: 0.000300\n","2019-12-01 20:28:27,110 Epoch 1 Step: 26400 Batch Loss: 1.648352 Tokens per Sec: 7967, Lr: 0.000300\n","2019-12-01 20:28:59,934 Epoch 1 Step: 26500 Batch Loss: 1.712892 Tokens per Sec: 8071, Lr: 0.000300\n","2019-12-01 20:29:32,525 Epoch 1 Step: 26600 Batch Loss: 1.702320 Tokens per Sec: 7973, Lr: 0.000300\n","2019-12-01 20:30:05,288 Epoch 1 Step: 26700 Batch Loss: 1.694931 Tokens per Sec: 8017, Lr: 0.000300\n","2019-12-01 20:30:37,966 Epoch 1 Step: 26800 Batch Loss: 1.622802 Tokens per Sec: 8011, Lr: 0.000300\n","2019-12-01 20:31:10,317 Epoch 1 Step: 26900 Batch Loss: 1.734568 Tokens per Sec: 7896, Lr: 0.000300\n","2019-12-01 20:31:43,033 Epoch 1 Step: 27000 Batch Loss: 1.818896 Tokens per Sec: 7977, Lr: 0.000300\n","2019-12-01 20:33:20,006 Hooray! New best validation result [ppl]!\n","2019-12-01 20:33:20,006 Saving new checkpoint.\n","2019-12-01 20:33:20,385 Example #0\n","2019-12-01 20:33:20,386 \tSource: In some cultures , birthstones are associated with the month of one’s birth .\n","2019-12-01 20:33:20,386 \tReference: Mune dzimwe tsika , matombo omwedzi wawakazvarwa anonzi ane chokuita nomwedzi wakazvarwa munhu .\n","2019-12-01 20:33:20,386 \tHypothesis: Mune dzimwe tsika , mashura anosonganirana nemwedzi wokuberekwa kwomunhu .\n","2019-12-01 20:33:20,386 Example #1\n","2019-12-01 20:33:20,386 \tSource: What will help us to appreciate that God watches over us because he loves us ? 
Let us consider how he shows this .\n","2019-12-01 20:33:20,386 \tReference: Chii chichatibatsira kunzwisisa kuti kutariswa kwatinoitwa naMwari kunoratidza kuti anotida ?\n","2019-12-01 20:33:20,386 \tHypothesis: Chii chichatibatsira kunzwisisa kuti Mwari anotirinda nemhaka yokuti anotida ?\n","2019-12-01 20:33:20,386 Example #2\n","2019-12-01 20:33:20,387 \tSource: ; Keyzer , M .\n","2019-12-01 20:33:20,387 \tReference: ; Keyzer , M .\n","2019-12-01 20:33:20,387 \tHypothesis: ; Keyzer , M .\n","2019-12-01 20:33:20,387 Example #3\n","2019-12-01 20:33:20,387 \tSource: We can give simply by sincerely asking how our friends are , by trying to understand their problems , and by doing all we can without waiting for them to ask . ”\n","2019-12-01 20:33:20,387 \tReference: Tinogona kupa kana tikangobvunza nomwoyo wose shamwari dzedu kuti dzakadini , kana tikaedza kunzwisisa zvinetso zvadzo , uye kana tikaita zvose zvatinogona tisingamiriri kuti dzikumbire . ”\n","2019-12-01 20:33:20,387 \tHypothesis: Tinogona kungotaura zvakasimba kuti shamwari dzedu dzinenge dzichiedza kunzwisisa zvinetso zvavo , uye nokuita zvose zvatinogona kuti tisazvimirira kuti dzikumbire . 
”\n","2019-12-01 20:33:20,387 Validation result (greedy) at epoch 1, step 27000: bleu: 25.27, loss: 41418.3711, ppl: 4.4645, duration: 97.3541s\n","2019-12-01 20:33:52,860 Epoch 1 Step: 27100 Batch Loss: 1.736278 Tokens per Sec: 7850, Lr: 0.000300\n","2019-12-01 20:34:25,197 Epoch 1 Step: 27200 Batch Loss: 1.717129 Tokens per Sec: 7897, Lr: 0.000300\n","2019-12-01 20:34:58,036 Epoch 1 Step: 27300 Batch Loss: 1.532034 Tokens per Sec: 7946, Lr: 0.000300\n","2019-12-01 20:35:30,855 Epoch 1 Step: 27400 Batch Loss: 1.969366 Tokens per Sec: 8052, Lr: 0.000300\n","2019-12-01 20:36:03,486 Epoch 1 Step: 27500 Batch Loss: 1.808777 Tokens per Sec: 7920, Lr: 0.000300\n","2019-12-01 20:36:36,222 Epoch 1 Step: 27600 Batch Loss: 1.665854 Tokens per Sec: 8132, Lr: 0.000300\n","2019-12-01 20:37:08,594 Epoch 1 Step: 27700 Batch Loss: 1.821612 Tokens per Sec: 7965, Lr: 0.000300\n","2019-12-01 20:37:41,081 Epoch 1 Step: 27800 Batch Loss: 1.855966 Tokens per Sec: 7914, Lr: 0.000300\n","2019-12-01 20:38:13,863 Epoch 1 Step: 27900 Batch Loss: 1.364342 Tokens per Sec: 7900, Lr: 0.000300\n","2019-12-01 20:38:46,116 Epoch 1 Step: 28000 Batch Loss: 1.931425 Tokens per Sec: 7869, Lr: 0.000300\n","2019-12-01 20:40:22,995 Hooray! New best validation result [ppl]!\n","2019-12-01 20:40:22,995 Saving new checkpoint.\n","2019-12-01 20:40:23,375 Example #0\n","2019-12-01 20:40:23,375 \tSource: In some cultures , birthstones are associated with the month of one’s birth .\n","2019-12-01 20:40:23,375 \tReference: Mune dzimwe tsika , matombo omwedzi wawakazvarwa anonzi ane chokuita nomwedzi wakazvarwa munhu .\n","2019-12-01 20:40:23,375 \tHypothesis: Mune dzimwe tsika , matombo anoberekwa ari mwedzi wokuberekwa kwomunhu .\n","2019-12-01 20:40:23,375 Example #1\n","2019-12-01 20:40:23,376 \tSource: What will help us to appreciate that God watches over us because he loves us ? 
Let us consider how he shows this .\n","2019-12-01 20:40:23,376 \tReference: Chii chichatibatsira kunzwisisa kuti kutariswa kwatinoitwa naMwari kunoratidza kuti anotida ?\n","2019-12-01 20:40:23,376 \tHypothesis: Chii chichatibatsira kunzwisisa kuti Mwari anotiona nemhaka yokuti anotida ?\n","2019-12-01 20:40:23,376 Example #2\n","2019-12-01 20:40:23,376 \tSource: ; Keyzer , M .\n","2019-12-01 20:40:23,376 \tReference: ; Keyzer , M .\n","2019-12-01 20:40:23,376 \tHypothesis: ; Keyzer , M .\n","2019-12-01 20:40:23,376 Example #3\n","2019-12-01 20:40:23,376 \tSource: We can give simply by sincerely asking how our friends are , by trying to understand their problems , and by doing all we can without waiting for them to ask . ”\n","2019-12-01 20:40:23,376 \tReference: Tinogona kupa kana tikangobvunza nomwoyo wose shamwari dzedu kuti dzakadini , kana tikaedza kunzwisisa zvinetso zvadzo , uye kana tikaita zvose zvatinogona tisingamiriri kuti dzikumbire . ”\n","2019-12-01 20:40:23,376 \tHypothesis: Tinogona kungopiwa kupfurikidza nokukumbira nenzira yapachokwadi kuti shamwari dzedu , kupfurikidza nokuedza kunzwisisa zvinetso zvavo , uye kupfurikidza nokuita zvose zvatinogona kusakamirira nokuda kwavo kukumbira . 
”\n","2019-12-01 20:40:23,376 Validation result (greedy) at epoch 1, step 28000: bleu: 25.53, loss: 41171.5195, ppl: 4.4249, duration: 97.2602s\n","2019-12-01 20:40:55,877 Epoch 1 Step: 28100 Batch Loss: 1.784812 Tokens per Sec: 8026, Lr: 0.000300\n","2019-12-01 20:41:28,621 Epoch 1 Step: 28200 Batch Loss: 1.710928 Tokens per Sec: 7998, Lr: 0.000300\n","2019-12-01 20:42:01,499 Epoch 1 Step: 28300 Batch Loss: 1.586245 Tokens per Sec: 8000, Lr: 0.000300\n","2019-12-01 20:42:34,046 Epoch 1 Step: 28400 Batch Loss: 1.907929 Tokens per Sec: 8005, Lr: 0.000300\n","Traceback (most recent call last):\n"," File \"/usr/lib/python3.6/runpy.py\", line 193, in _run_module_as_main\n"," \"__main__\", mod_spec)\n"," File \"/usr/lib/python3.6/runpy.py\", line 85, in _run_code\n"," exec(code, run_globals)\n"," File \"/content/joeynmt/joeynmt/__main__.py\", line 41, in \n"," main()\n"," File \"/content/joeynmt/joeynmt/__main__.py\", line 29, in main\n"," train(cfg_file=args.config_path)\n"," File \"/content/joeynmt/joeynmt/training.py\", line 596, in train\n"," trainer.train_and_validate(train_data=train_data, valid_data=dev_data)\n"," File \"/content/joeynmt/joeynmt/training.py\", line 296, in train_and_validate\n"," batch_loss = self._train_batch(batch, update=update)\n"," File \"/content/joeynmt/joeynmt/training.py\", line 451, in _train_batch\n"," norm_batch_multiply.backward()\n"," File \"/usr/local/lib/python3.6/dist-packages/torch/tensor.py\", line 166, in backward\n"," torch.autograd.backward(self, gradient, retain_graph, create_graph)\n"," File \"/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py\", line 99, in backward\n"," allow_unreachable=True) # allow_unreachable flag\n","KeyboardInterrupt\n","^C\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"MBoDS09JM807","outputId":"c30390ab-aeca-43a2-8692-6c55a3e1d36f","executionInfo":{"status":"ok","timestamp":1575232974974,"user_tz":-120,"elapsed":3585886,"user":{"displayName":"B 
Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":34}},"source":["# Copy the created models from the notebook storage to Google Drive for persistent storage\n","!cp -r joeynmt/models/${src}${tgt}_transformer/* \"$gdrive_path/models/${src}${tgt}_transformer/\""],"execution_count":0,"outputs":[{"output_type":"stream","text":["cp: cannot create symbolic link '/content/drive/My Drive/masakhane/en-sn-baseline/models/ensn_transformer/best.ckpt': Function not implemented\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"n94wlrCjVc17","outputId":"d913b3bf-a2d5-4e2d-99a4-b6ed48fe5efb","executionInfo":{"status":"ok","timestamp":1575232978329,"user_tz":-120,"elapsed":3589228,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":119}},"source":["# Output our validation scores (loss, perplexity, BLEU)\n","! 
cat \"$gdrive_path/models/${src}${tgt}_transformer/validations.txt\""],"execution_count":0,"outputs":[{"output_type":"stream","text":["Steps: 23000\tLoss: 43281.35938\tPPL: 4.77533\tbleu: 24.14470\tLR: 0.00030000\t*\n","Steps: 24000\tLoss: 42665.78906\tPPL: 4.67032\tbleu: 24.49079\tLR: 0.00030000\t*\n","Steps: 25000\tLoss: 42263.35156\tPPL: 4.60291\tbleu: 24.86899\tLR: 0.00030000\t*\n","Steps: 26000\tLoss: 42055.92188\tPPL: 4.56855\tbleu: 24.76270\tLR: 0.00030000\t*\n","Steps: 27000\tLoss: 41418.37109\tPPL: 4.46454\tbleu: 25.27372\tLR: 0.00030000\t*\n","Steps: 28000\tLoss: 41171.51953\tPPL: 4.42491\tbleu: 25.52533\tLR: 0.00030000\t*\n"],"name":"stdout"}]},{"cell_type":"code","metadata":{"colab_type":"code","id":"66WhRE9lIhoD","outputId":"be15380b-a6e3-4743-d679-1f1ad8388ca7","executionInfo":{"status":"ok","timestamp":1575233206321,"user_tz":-120,"elapsed":187046,"user":{"displayName":"B Sibanda","photoUrl":"https://lh3.googleusercontent.com/a-/AAuE7mCSV7K2cBGXbchXb_GZQ0AvZIZXLNRMlYW0o98G1A=s64","userId":"07124125854206307932"}},"colab":{"base_uri":"https://localhost:8080/","height":68}},"source":["# Test our model\n","! cd joeynmt; python3 -m joeynmt test \"$gdrive_path/models/${src}${tgt}_transformer/config.yaml\"\n","\n","#python3 -m joeynmt test reverse_model/config.yaml --output_path reverse_model/predictions\n"],"execution_count":0,"outputs":[{"output_type":"stream","text":["2019-12-01 20:43:44,537 Hello! This is Joey-NMT.\n","2019-12-01 20:45:06,790 dev bleu: 25.04 [Beam search decoding with beam size = 5 and alpha = 1.0]\n","2019-12-01 20:46:42,191 test bleu: 30.84 [Beam search decoding with beam size = 5 and alpha = 1.0]\n"],"name":"stdout"}]}]}