[2021-03-28 08:28:45] [marian] Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800
[2021-03-28 08:28:45] [marian] Running on r15g05.bullx as process 52357 with command line:
[2021-03-28 08:28:45] [marian] /projappl/project_2001194/marian/build/marian --guided-alignment /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz --early-stopping 15 --valid-freq 10000 --valid-sets /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k --valid-metrics perplexity --valid-mini-batch 16 --valid-log /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log --beam-size 12 --normalize 1 --allow-unk --overwrite --keep-best --model /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz --type transformer --train-sets /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz --max-length 500 --vocabs /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml --mini-batch-fit -w 24000 --maxi-batch 500 --save-freq 10000 --disp-freq 10000 --log /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --fp16 --tied-embeddings-all --devices 0 1 2 3 --sync-sgd --seed 1111 --sqlite --tempdir /run/nvme/job_5337491/data --exponential-smoothing
[2021-03-28 08:28:45] [config] after: 0e
[2021-03-28 08:28:45] [config] after-batches: 0
[2021-03-28 08:28:45] [config] after-epochs: 0
[2021-03-28 08:28:45] [config] all-caps-every: 0
[2021-03-28 08:28:45] [config] allow-unk: true
[2021-03-28 08:28:45] [config] authors: false
[2021-03-28 08:28:45] [config] beam-size: 12
[2021-03-28 08:28:45] [config] bert-class-symbol: "[CLS]"
[2021-03-28 08:28:45] [config] bert-mask-symbol: "[MASK]"
[2021-03-28 08:28:45] [config] bert-masking-fraction: 0.15
[2021-03-28 08:28:45] [config] bert-sep-symbol: "[SEP]"
[2021-03-28 08:28:45] [config] bert-train-type-embeddings: true
[2021-03-28 08:28:45] [config] bert-type-vocab-size: 2
[2021-03-28 08:28:45] [config] build-info: ""
[2021-03-28 08:28:45] [config] cite: false
[2021-03-28 08:28:45] [config] clip-norm: 5
[2021-03-28 08:28:45] [config] cost-scaling:
[2021-03-28 08:28:45] [config] - 7
[2021-03-28 08:28:45] [config] - 2000
[2021-03-28 08:28:45] [config] - 2
[2021-03-28 08:28:45] [config] - 0.05
[2021-03-28 08:28:45] [config] - 10
[2021-03-28 08:28:45] [config] - 1
[2021-03-28 08:28:45] [config] cost-type: ce-sum
[2021-03-28 08:28:45] [config] cpu-threads: 0
[2021-03-28 08:28:45] [config] data-weighting: ""
[2021-03-28 08:28:45] [config] data-weighting-type: sentence
[2021-03-28 08:28:45] [config] dec-cell: gru
[2021-03-28 08:28:45] [config] dec-cell-base-depth: 2
[2021-03-28 08:28:45] [config] dec-cell-high-depth: 1
[2021-03-28 08:28:45] [config] dec-depth: 6
[2021-03-28 08:28:45] [config] devices:
[2021-03-28 08:28:45] [config] - 0
[2021-03-28 08:28:45] [config] - 1
[2021-03-28 08:28:45] [config] - 2
[2021-03-28 08:28:45] [config] - 3
[2021-03-28 08:28:45] [config] dim-emb: 512
[2021-03-28 08:28:45] [config] dim-rnn: 1024
[2021-03-28 08:28:45] [config] dim-vocabs:
[2021-03-28 08:28:45] [config] - 56521
[2021-03-28 08:28:45] [config] - 56521
[2021-03-28 08:28:45] [config] disp-first: 0
[2021-03-28 08:28:45] [config] disp-freq: 10000
[2021-03-28 08:28:45] [config] disp-label-counts: true
[2021-03-28 08:28:45] [config] dropout-rnn: 0
[2021-03-28 08:28:45] [config] dropout-src: 0
[2021-03-28 08:28:45] [config] dropout-trg: 0
[2021-03-28 08:28:45] [config] dump-config: ""
[2021-03-28 08:28:45] [config] early-stopping: 15
[2021-03-28 08:28:45] [config] embedding-fix-src: false
[2021-03-28 08:28:45] [config] embedding-fix-trg: false
[2021-03-28 08:28:45] [config] embedding-normalization: false
[2021-03-28 08:28:45] [config] embedding-vectors:
[2021-03-28 08:28:45] [config] []
[2021-03-28 08:28:45] [config] enc-cell: gru
[2021-03-28 08:28:45] [config] enc-cell-depth: 1
[2021-03-28 08:28:45] [config] enc-depth: 6
[2021-03-28 08:28:45] [config] enc-type: bidirectional
[2021-03-28 08:28:45] [config] english-title-case-every: 0
[2021-03-28 08:28:45] [config] exponential-smoothing: 0.0001
[2021-03-28 08:28:45] [config] factor-weight: 1
[2021-03-28 08:28:45] [config] grad-dropping-momentum: 0
[2021-03-28 08:28:45] [config] grad-dropping-rate: 0
[2021-03-28 08:28:45] [config] grad-dropping-warmup: 100
[2021-03-28 08:28:45] [config] gradient-checkpointing: false
[2021-03-28 08:28:45] [config] guided-alignment: /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz
[2021-03-28 08:28:45] [config] guided-alignment-cost: mse
[2021-03-28 08:28:45] [config] guided-alignment-weight: 0.1
[2021-03-28 08:28:45] [config] ignore-model-config: false
[2021-03-28 08:28:45] [config] input-types:
[2021-03-28 08:28:45] [config] []
[2021-03-28 08:28:45] [config] interpolate-env-vars: false
[2021-03-28 08:28:45] [config] keep-best: true
[2021-03-28 08:28:45] [config] label-smoothing: 0.1
[2021-03-28 08:28:45] [config] layer-normalization: false
[2021-03-28 08:28:45] [config] learn-rate: 0.0003
[2021-03-28 08:28:45] [config] lemma-dim-emb: 0
[2021-03-28 08:28:45] [config] log: /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log
[2021-03-28 08:28:45] [config] log-level: info
[2021-03-28 08:28:45] [config] log-time-zone: ""
[2021-03-28 08:28:45] [config] logical-epoch:
[2021-03-28 08:28:45] [config] - 1e
[2021-03-28 08:28:45] [config] - 0
[2021-03-28 08:28:45] [config] lr-decay: 0
[2021-03-28 08:28:45] [config] lr-decay-freq: 50000
[2021-03-28 08:28:45] [config] lr-decay-inv-sqrt:
[2021-03-28 08:28:45] [config] - 16000
[2021-03-28 08:28:45] [config] lr-decay-repeat-warmup: false
[2021-03-28 08:28:45] [config] lr-decay-reset-optimizer: false
[2021-03-28 08:28:45] [config] lr-decay-start:
[2021-03-28 08:28:45] [config] - 10
[2021-03-28 08:28:45] [config] - 1
[2021-03-28 08:28:45] [config] lr-decay-strategy: epoch+stalled
[2021-03-28 08:28:45] [config] lr-report: true
[2021-03-28 08:28:45] [config] lr-warmup: 16000
[2021-03-28 08:28:45] [config] lr-warmup-at-reload: false
[2021-03-28 08:28:45] [config] lr-warmup-cycle: false
[2021-03-28 08:28:45] [config] lr-warmup-start-rate: 0
[2021-03-28 08:28:45] [config] max-length: 500
[2021-03-28 08:28:45] [config] max-length-crop: false
[2021-03-28 08:28:45] [config] max-length-factor: 3
[2021-03-28 08:28:45] [config] maxi-batch: 500
[2021-03-28 08:28:45] [config] maxi-batch-sort: trg
[2021-03-28 08:28:45] [config] mini-batch: 64
[2021-03-28 08:28:45] [config] mini-batch-fit: true
[2021-03-28 08:28:45] [config] mini-batch-fit-step: 10
[2021-03-28 08:28:45] [config] mini-batch-track-lr: false
[2021-03-28 08:28:45] [config] mini-batch-warmup: 0
[2021-03-28 08:28:45] [config] mini-batch-words: 0
[2021-03-28 08:28:45] [config] mini-batch-words-ref: 0
[2021-03-28 08:28:45] [config] model: /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 08:28:45] [config] multi-loss-type: sum
[2021-03-28 08:28:45] [config] multi-node: false
[2021-03-28 08:28:45] [config] multi-node-overlap: true
[2021-03-28 08:28:45] [config] n-best: false
[2021-03-28 08:28:45] [config] no-nccl: false
[2021-03-28 08:28:45] [config] no-reload: false
[2021-03-28 08:28:45] [config] no-restore-corpus: false
[2021-03-28 08:28:45] [config] normalize: 1
[2021-03-28 08:28:45] [config] normalize-gradient: false
[2021-03-28 08:28:45] [config] num-devices: 0
[2021-03-28 08:28:45] [config] optimizer: adam
[2021-03-28 08:28:45] [config] optimizer-delay: 1
[2021-03-28 08:28:45] [config] optimizer-params:
[2021-03-28 08:28:45] [config] - 0.9
[2021-03-28 08:28:45] [config] - 0.98
[2021-03-28 08:28:45] [config] - 1e-09
[2021-03-28 08:28:45] [config] output-omit-bias: false
[2021-03-28 08:28:45] [config] overwrite: true
[2021-03-28 08:28:45] [config] precision:
[2021-03-28 08:28:45] [config] - float16
[2021-03-28 08:28:45] [config] - float32
[2021-03-28 08:28:45] [config] - float32
[2021-03-28 08:28:45] [config] pretrained-model: ""
[2021-03-28 08:28:45] [config] quantize-biases: false
[2021-03-28 08:28:45] [config] quantize-bits: 0
[2021-03-28 08:28:45] [config] quantize-log-based: false
[2021-03-28 08:28:45] [config] quantize-optimization-steps: 0
[2021-03-28 08:28:45] [config] quiet: false
[2021-03-28 08:28:45] [config] quiet-translation: false
[2021-03-28 08:28:45] [config] relative-paths: false
[2021-03-28 08:28:45] [config] right-left: false
[2021-03-28 08:28:45] [config] save-freq: 10000
[2021-03-28 08:28:45] [config] seed: 1111
[2021-03-28 08:28:45] [config] sentencepiece-alphas:
[2021-03-28 08:28:45] [config] []
[2021-03-28 08:28:45] [config] sentencepiece-max-lines: 2000000
[2021-03-28 08:28:45] [config] sentencepiece-options: ""
[2021-03-28 08:28:45] [config] shuffle: data
[2021-03-28 08:28:45] [config] shuffle-in-ram: false
[2021-03-28 08:28:45] [config] sigterm: save-and-exit
[2021-03-28 08:28:45] [config] skip: false
[2021-03-28 08:28:45] [config] sqlite: temporary
[2021-03-28 08:28:45] [config] sqlite-drop: false
[2021-03-28 08:28:45] [config] sync-sgd: true
[2021-03-28 08:28:45] [config] tempdir: /run/nvme/job_5337491/data
[2021-03-28 08:28:45] [config] tied-embeddings: false
[2021-03-28 08:28:45] [config] tied-embeddings-all: true
[2021-03-28 08:28:45] [config] tied-embeddings-src: false
[2021-03-28 08:28:45] [config] train-embedder-rank:
[2021-03-28 08:28:45] [config] []
[2021-03-28 08:28:45] [config] train-sets:
[2021-03-28 08:28:45] [config] - /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz
[2021-03-28 08:28:45] [config] - /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz
[2021-03-28 08:28:45] [config] transformer-aan-activation: swish
[2021-03-28 08:28:45] [config] transformer-aan-depth: 2
[2021-03-28 08:28:45] [config] transformer-aan-nogate: false
[2021-03-28 08:28:45] [config] transformer-decoder-autoreg: self-attention
[2021-03-28 08:28:45] [config] transformer-depth-scaling: false
[2021-03-28 08:28:45] [config] transformer-dim-aan: 2048
[2021-03-28 08:28:45] [config] transformer-dim-ffn: 2048
[2021-03-28 08:28:45] [config] transformer-dropout: 0.1
[2021-03-28 08:28:45] [config] transformer-dropout-attention: 0
[2021-03-28 08:28:45] [config] transformer-dropout-ffn: 0
[2021-03-28 08:28:45] [config] transformer-ffn-activation: swish
[2021-03-28 08:28:45] [config] transformer-ffn-depth: 2
[2021-03-28 08:28:45] [config] transformer-guided-alignment-layer: last
[2021-03-28 08:28:45] [config] transformer-heads: 8
[2021-03-28 08:28:45] [config] transformer-no-projection: false
[2021-03-28 08:28:45] [config] transformer-pool: false
[2021-03-28 08:28:45] [config] transformer-postprocess: dan
[2021-03-28 08:28:45] [config] transformer-postprocess-emb: d
[2021-03-28 08:28:45] [config] transformer-postprocess-top: ""
[2021-03-28 08:28:45] [config] transformer-preprocess: ""
[2021-03-28 08:28:45] [config] transformer-tied-layers:
[2021-03-28 08:28:45] [config] []
[2021-03-28 08:28:45] [config] transformer-train-position-embeddings: false
[2021-03-28 08:28:45] [config] tsv: false
[2021-03-28 08:28:45] [config] tsv-fields: 0
[2021-03-28 08:28:45] [config] type: transformer
[2021-03-28 08:28:45] [config] ulr: false
[2021-03-28 08:28:45] [config] ulr-dim-emb: 0
[2021-03-28 08:28:45] [config] ulr-dropout: 0
[2021-03-28 08:28:45] [config] ulr-keys-vectors: ""
[2021-03-28 08:28:45] [config] ulr-query-vectors: ""
[2021-03-28 08:28:45] [config] ulr-softmax-temperature: 1
[2021-03-28 08:28:45] [config] ulr-trainable-transformation: false
[2021-03-28 08:28:45] [config] unlikelihood-loss: false
[2021-03-28 08:28:45] [config] valid-freq: 10000
[2021-03-28 08:28:45] [config] valid-log: /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log
[2021-03-28 08:28:45] [config] valid-max-length: 1000
[2021-03-28 08:28:45] [config] valid-metrics:
[2021-03-28 08:28:45] [config] - perplexity
[2021-03-28 08:28:45] [config] valid-mini-batch: 16
[2021-03-28 08:28:45] [config] valid-reset-stalled: false
[2021-03-28 08:28:45] [config] valid-script-args:
[2021-03-28 08:28:45] [config] []
[2021-03-28 08:28:45] [config] valid-script-path: ""
[2021-03-28 08:28:45] [config] valid-sets:
[2021-03-28 08:28:45] [config] - /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k
[2021-03-28 08:28:45] [config] - /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k
[2021-03-28 08:28:45] [config] valid-translation-output: ""
[2021-03-28 08:28:45] [config] version: v1.9.25; 50ce630 2020-06-19 10:28:45 -0700
[2021-03-28 08:28:45] [config] vocabs:
[2021-03-28 08:28:45] [config] - /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml
[2021-03-28 08:28:45] [config] - /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml
[2021-03-28 08:28:45] [config] word-penalty: 0
[2021-03-28 08:28:45] [config] word-scores: false
[2021-03-28 08:28:45] [config] workspace: 24000
[2021-03-28 08:28:45] [config] Loaded model has been created with Marian v1.9.25; 50ce630 2020-06-19 10:28:45 -0700, will be overwritten with current version v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 at saving
[2021-03-28 08:28:45] Using synchronous SGD
[2021-03-28 08:28:45] [data] Loading vocabulary from JSON/Yaml file /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml
[2021-03-28 08:28:45] [data] Setting vocabulary size for input 0 to 56,521
[2021-03-28 08:28:45] [data] Loading vocabulary from JSON/Yaml file /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml
[2021-03-28 08:28:45] [data] Setting vocabulary size for input 1 to 56,521
[2021-03-28 08:28:45] [data] Using word alignments from file /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz
[2021-03-28 08:28:45] [sqlite] Creating temporary database in /run/nvme/job_5337491/data
[2021-03-28 08:28:49] [sqlite] Inserted 1000000 lines
[2021-03-28 08:28:51] [sqlite] Inserted 2000000 lines
[2021-03-28 08:28:57] [sqlite] Inserted 4000000 lines
[2021-03-28 08:29:09] [sqlite] Inserted 8000000 lines
[2021-03-28 08:29:33] [sqlite] Inserted 16000000 lines
[2021-03-28 08:30:19] [sqlite] Inserted 32000000 lines
[2021-03-28 08:31:35] [sqlite] Inserted 64000000 lines
[2021-03-28 08:32:27] [sqlite] Inserted 82557593 lines
[2021-03-28 08:32:27] [sqlite] Creating primary index
[2021-03-28 08:33:04] [comm] Compiled without MPI support.
Running as a single process on r15g05.bullx
[2021-03-28 08:33:04] [batching] Collecting statistics for batch fitting with step size 10
[2021-03-28 08:33:18] [memory] Extending reserved space to 24064 MB (device gpu0)
[2021-03-28 08:33:18] [memory] Extending reserved space to 24064 MB (device gpu1)
[2021-03-28 08:33:19] [memory] Extending reserved space to 24064 MB (device gpu2)
[2021-03-28 08:33:19] [memory] Extending reserved space to 24064 MB (device gpu3)
[2021-03-28 08:33:19] [comm] Using NCCL 2.8.3 for GPU communication
[2021-03-28 08:33:20] [comm] NCCLCommunicator constructed successfully
[2021-03-28 08:33:20] [training] Using 4 GPUs
[2021-03-28 08:33:21] [logits] Applying loss function for 1 factor(s)
[2021-03-28 08:33:21] [memory] Reserving 278 MB, device gpu0
[2021-03-28 08:33:22] [gpu] 16-bit TensorCores enabled for float32 matrix operations
[2021-03-28 08:33:22] [memory] Reserving 278 MB, device gpu0
[2021-03-28 08:36:15] [batching] Done. Typical MB size is 58,832 target words
[2021-03-28 08:36:15] [memory] Extending reserved space to 24064 MB (device gpu0)
[2021-03-28 08:36:16] [memory] Extending reserved space to 24064 MB (device gpu1)
[2021-03-28 08:36:16] [memory] Extending reserved space to 24064 MB (device gpu2)
[2021-03-28 08:36:16] [memory] Extending reserved space to 24064 MB (device gpu3)
[2021-03-28 08:36:16] [comm] Using NCCL 2.8.3 for GPU communication
[2021-03-28 08:36:17] [comm] NCCLCommunicator constructed successfully
[2021-03-28 08:36:17] [training] Using 4 GPUs
[2021-03-28 08:36:17] Loading model from /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 08:36:18] Loading model from /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 08:36:18] Loading model from /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 08:36:19] Loading model from /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 08:36:19] [training] Model reloaded from /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 08:36:19] Training started
[2021-03-28 08:36:19] [sqlite] Selecting shuffled data
[2021-03-28 08:37:41] [training] Batches are processed as 1 process(es) x 4 devices/process
[2021-03-28 08:37:41] [memory] Reserving 278 MB, device gpu0
[2021-03-28 08:37:41] [memory] Reserving 278 MB, device gpu1
[2021-03-28 08:37:41] [memory] Reserving 278 MB, device gpu3
[2021-03-28 08:37:41] [memory] Reserving 278 MB, device gpu2
[2021-03-28 08:37:42] [memory] Reserving 278 MB, device gpu0
[2021-03-28 08:37:42] [memory] Reserving 278 MB, device gpu1
[2021-03-28 08:37:42] [memory] Reserving 278 MB, device gpu3
[2021-03-28 08:37:42] [memory] Reserving 278 MB, device gpu2
[2021-03-28 08:37:42] [memory] Reserving 69 MB, device gpu0
[2021-03-28 08:37:42] [memory] Reserving 69 MB, device gpu1
[2021-03-28 08:37:42] [memory] Reserving 69 MB, device gpu2
[2021-03-28 08:37:42] [memory] Reserving 69 MB, device gpu3
[2021-03-28 08:37:43] [memory] Reserving 139 MB, device gpu2
[2021-03-28 08:37:43] [memory] Reserving 139 MB, device gpu3
[2021-03-28 08:37:43] [memory] Reserving 139 MB, device gpu0
[2021-03-28 08:37:43] [memory] Reserving 139 MB, device gpu1
[2021-03-28 10:42:35] Ep. 1 : Up. 10000 : Sen. 9,319,466 : Cost 0.35629028 * 1,366,819,140 @ 13,946 after 1,366,819,140 : Time 7579.21s : 24583.84 words/s : L.r. 1.8750e-04
[2021-03-28 10:42:35] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-03-28 10:42:37] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 10:42:38] Saving Adam parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-03-28 10:42:45] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz
[2021-03-28 10:42:46] [valid] Ep. 1 : Up. 10000 : perplexity : 1.88848 : new best
[2021-03-28 12:48:01] Ep. 1 : Up. 20000 : Sen. 18,649,600 : Cost 0.36928299 * 1,369,675,984 @ 17,845 after 2,736,495,124 : Time 7525.85s : 24810.72 words/s : L.r. 2.6833e-04
[2021-03-28 12:48:01] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-03-28 12:48:02] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 12:48:04] Saving Adam parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-03-28 12:48:11] [valid] Ep. 1 : Up. 20000 : perplexity : 1.91167 : stalled 1 times (last best: 1.88848)
[2021-03-28 14:53:07] Ep. 1 : Up. 30000 : Sen. 27,953,580 : Cost 0.36638483 * 1,367,997,648 @ 8,256 after 4,104,492,772 : Time 7506.14s : 24785.84 words/s : L.r. 2.1909e-04
[2021-03-28 14:53:07] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-03-28 14:53:09] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 14:53:10] Saving Adam parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-03-28 14:53:16] [valid] Ep. 1 : Up. 30000 : perplexity : 1.90984 : stalled 2 times (last best: 1.88848)
[2021-03-28 16:58:12] Ep. 1 : Up. 40000 : Sen. 37,260,234 : Cost 0.36251137 * 1,369,286,966 @ 18,192 after 5,473,779,738 : Time 7504.84s : 24806.19 words/s : L.r. 1.8974e-04
[2021-03-28 16:58:12] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-03-28 16:58:13] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 16:58:15] Saving Adam parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-03-28 16:58:21] [valid] Ep. 1 : Up. 40000 : perplexity : 1.90168 : stalled 3 times (last best: 1.88848)
[2021-03-28 19:03:44] Ep. 1 : Up. 50000 : Sen. 46,612,910 : Cost 0.36028075 * 1,374,594,889 @ 23,886 after 6,848,374,627 : Time 7532.40s : 24829.36 words/s : L.r. 1.6971e-04
[2021-03-28 19:03:44] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-03-28 19:03:46] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 19:03:48] Saving Adam parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-03-28 19:03:54] [valid] Ep. 1 : Up. 50000 : perplexity : 1.90109 : stalled 4 times (last best: 1.88848)
[2021-03-28 21:09:32] Ep. 1 : Up. 60000 : Sen. 55,970,228 : Cost 0.35844511 * 1,376,778,359 @ 10,298 after 8,225,152,986 : Time 7547.40s : 24812.41 words/s : L.r. 1.5492e-04
[2021-03-28 21:09:32] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-03-28 21:09:34] Saving model weights and runtime parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-03-28 21:09:35] Saving Adam parameters to /scratch/project_2001194/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-03-28 21:09:42] [valid] Ep. 1 : Up. 60000 : perplexity : 1.90035 : stalled 5 times (last best: 1.88848)
[2021-04-02 07:35:49] [marian] Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800
[2021-04-02 07:35:49] [marian] Running on r14g04.bullx as process 25191 with command line:
[2021-04-02 07:35:49] [marian] /projappl/project_2001194/marian/build/marian --guided-alignment /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz --early-stopping 15 --valid-freq 10000 --valid-sets /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k --valid-metrics perplexity --valid-mini-batch 16 --valid-log /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log --beam-size 12 --normalize 1 --allow-unk --overwrite --keep-best --model /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz --type transformer --train-sets /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz --max-length 500 --vocabs /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml --mini-batch-fit -w 24000 --maxi-batch 500 --save-freq 10000 --disp-freq 10000 --log /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --fp16 --tied-embeddings-all --devices 0 1 2 3 --sync-sgd --seed 1111
--sqlite --tempdir /run/nvme/job_5389725/data --exponential-smoothing
[2021-04-02 07:35:52] [config] after: 0e
[2021-04-02 07:35:52] [config] after-batches: 0
[2021-04-02 07:35:52] [config] after-epochs: 0
[2021-04-02 07:35:52] [config] all-caps-every: 0
[2021-04-02 07:35:52] [config] allow-unk: true
[2021-04-02 07:35:52] [config] authors: false
[2021-04-02 07:35:52] [config] beam-size: 12
[2021-04-02 07:35:52] [config] bert-class-symbol: "[CLS]"
[2021-04-02 07:35:52] [config] bert-mask-symbol: "[MASK]"
[2021-04-02 07:35:52] [config] bert-masking-fraction: 0.15
[2021-04-02 07:35:52] [config] bert-sep-symbol: "[SEP]"
[2021-04-02 07:35:52] [config] bert-train-type-embeddings: true
[2021-04-02 07:35:52] [config] bert-type-vocab-size: 2
[2021-04-02 07:35:52] [config] build-info: ""
[2021-04-02 07:35:52] [config] cite: false
[2021-04-02 07:35:52] [config] clip-norm: 5
[2021-04-02 07:35:52] [config] cost-scaling:
[2021-04-02 07:35:52] [config] - 7
[2021-04-02 07:35:52] [config] - 2000
[2021-04-02 07:35:52] [config] - 2
[2021-04-02 07:35:52] [config] - 0.05
[2021-04-02 07:35:52] [config] - 10
[2021-04-02 07:35:52] [config] - 1
[2021-04-02 07:35:52] [config] cost-type: ce-sum
[2021-04-02 07:35:52] [config] cpu-threads: 0
[2021-04-02 07:35:52] [config] data-weighting: ""
[2021-04-02 07:35:52] [config] data-weighting-type: sentence
[2021-04-02 07:35:52] [config] dec-cell: gru
[2021-04-02 07:35:52] [config] dec-cell-base-depth: 2
[2021-04-02 07:35:52] [config] dec-cell-high-depth: 1
[2021-04-02 07:35:52] [config] dec-depth: 6
[2021-04-02 07:35:52] [config] devices:
[2021-04-02 07:35:52] [config] - 0
[2021-04-02 07:35:52] [config] - 1
[2021-04-02 07:35:52] [config] - 2
[2021-04-02 07:35:52] [config] - 3
[2021-04-02 07:35:52] [config] dim-emb: 512
[2021-04-02 07:35:52] [config] dim-rnn: 1024
[2021-04-02 07:35:52] [config] dim-vocabs:
[2021-04-02 07:35:52] [config] - 56521
[2021-04-02 07:35:52] [config] - 56521
[2021-04-02 07:35:52] [config] disp-first: 0
[2021-04-02 07:35:52] [config] disp-freq: 10000
[2021-04-02 07:35:52] [config] disp-label-counts: true
[2021-04-02 07:35:52] [config] dropout-rnn: 0
[2021-04-02 07:35:52] [config] dropout-src: 0
[2021-04-02 07:35:52] [config] dropout-trg: 0
[2021-04-02 07:35:52] [config] dump-config: ""
[2021-04-02 07:35:52] [config] early-stopping: 15
[2021-04-02 07:35:52] [config] embedding-fix-src: false
[2021-04-02 07:35:52] [config] embedding-fix-trg: false
[2021-04-02 07:35:52] [config] embedding-normalization: false
[2021-04-02 07:35:52] [config] embedding-vectors:
[2021-04-02 07:35:52] [config] []
[2021-04-02 07:35:52] [config] enc-cell: gru
[2021-04-02 07:35:52] [config] enc-cell-depth: 1
[2021-04-02 07:35:52] [config] enc-depth: 6
[2021-04-02 07:35:52] [config] enc-type: bidirectional
[2021-04-02 07:35:52] [config] english-title-case-every: 0
[2021-04-02 07:35:52] [config] exponential-smoothing: 0.0001
[2021-04-02 07:35:52] [config] factor-weight: 1
[2021-04-02 07:35:52] [config] grad-dropping-momentum: 0
[2021-04-02 07:35:52] [config] grad-dropping-rate: 0
[2021-04-02 07:35:52] [config] grad-dropping-warmup: 100
[2021-04-02 07:35:52] [config] gradient-checkpointing: false
[2021-04-02 07:35:52] [config] guided-alignment: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz
[2021-04-02 07:35:52] [config] guided-alignment-cost: mse
[2021-04-02 07:35:52] [config] guided-alignment-weight: 0.1
[2021-04-02 07:35:52] [config] ignore-model-config: false
[2021-04-02 07:35:52] [config] input-types:
[2021-04-02 07:35:52] [config] []
[2021-04-02 07:35:52] [config] interpolate-env-vars: false
[2021-04-02 07:35:52] [config] keep-best: true
[2021-04-02 07:35:52] [config] label-smoothing: 0.1
[2021-04-02 07:35:52] [config] layer-normalization: false
[2021-04-02 07:35:52] [config] learn-rate: 0.0003
[2021-04-02 07:35:52] [config] lemma-dim-emb: 0
[2021-04-02 07:35:52] [config] log: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log
[2021-04-02 07:35:52] [config] log-level: info
[2021-04-02 07:35:52] [config] log-time-zone: ""
[2021-04-02 07:35:52] [config] logical-epoch:
[2021-04-02 07:35:52] [config] - 1e
[2021-04-02 07:35:52] [config] - 0
[2021-04-02 07:35:52] [config] lr-decay: 0
[2021-04-02 07:35:52] [config] lr-decay-freq: 50000
[2021-04-02 07:35:52] [config] lr-decay-inv-sqrt:
[2021-04-02 07:35:52] [config] - 16000
[2021-04-02 07:35:52] [config] lr-decay-repeat-warmup: false
[2021-04-02 07:35:52] [config] lr-decay-reset-optimizer: false
[2021-04-02 07:35:52] [config] lr-decay-start:
[2021-04-02 07:35:52] [config] - 10
[2021-04-02 07:35:52] [config] - 1
[2021-04-02 07:35:52] [config] lr-decay-strategy: epoch+stalled
[2021-04-02 07:35:52] [config] lr-report: true
[2021-04-02 07:35:52] [config] lr-warmup: 16000
[2021-04-02 07:35:52] [config] lr-warmup-at-reload: false
[2021-04-02 07:35:52] [config] lr-warmup-cycle: false
[2021-04-02 07:35:52] [config] lr-warmup-start-rate: 0
[2021-04-02 07:35:52] [config] max-length: 500
[2021-04-02 07:35:52] [config] max-length-crop: false
[2021-04-02 07:35:52] [config] max-length-factor: 3
[2021-04-02 07:35:52] [config] maxi-batch: 500
[2021-04-02 07:35:52] [config] maxi-batch-sort: trg
[2021-04-02 07:35:52] [config] mini-batch: 64
[2021-04-02 07:35:52] [config] mini-batch-fit: true
[2021-04-02 07:35:52] [config] mini-batch-fit-step: 10
[2021-04-02 07:35:52] [config] mini-batch-track-lr: false
[2021-04-02 07:35:52] [config] mini-batch-warmup: 0
[2021-04-02 07:35:52] [config] mini-batch-words: 0
[2021-04-02 07:35:52] [config] mini-batch-words-ref: 0
[2021-04-02 07:35:52] [config] model: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-04-02 07:35:52] [config] multi-loss-type: sum
[2021-04-02 07:35:52] [config] multi-node: false
[2021-04-02 07:35:52] [config] multi-node-overlap: true
[2021-04-02 07:35:52] [config] n-best: false
[2021-04-02 07:35:52] [config] no-nccl: false
[2021-04-02 07:35:52] [config] no-reload: false
[2021-04-02 07:35:52] [config] no-restore-corpus: false
[2021-04-02 07:35:52] [config] normalize: 1
[2021-04-02 07:35:52] [config] normalize-gradient: false
[2021-04-02 07:35:52] [config] num-devices: 0
[2021-04-02 07:35:52] [config] optimizer: adam
[2021-04-02 07:35:52] [config] optimizer-delay: 1
[2021-04-02 07:35:52] [config] optimizer-params:
[2021-04-02 07:35:52] [config] - 0.9
[2021-04-02 07:35:52] [config] - 0.98
[2021-04-02 07:35:52] [config] - 1e-09
[2021-04-02 07:35:52] [config] output-omit-bias: false
[2021-04-02 07:35:52] [config] overwrite: true
[2021-04-02 07:35:52] [config] precision:
[2021-04-02 07:35:52] [config] - float16
[2021-04-02 07:35:52] [config] - float32
[2021-04-02 07:35:52] [config] - float32
[2021-04-02 07:35:52] [config] pretrained-model: ""
[2021-04-02 07:35:52] [config] quantize-biases: false
[2021-04-02 07:35:52] [config] quantize-bits: 0
[2021-04-02 07:35:52] [config] quantize-log-based: false
[2021-04-02 07:35:52] [config] quantize-optimization-steps: 0
[2021-04-02 07:35:52] [config] quiet: false
[2021-04-02 07:35:52] [config] quiet-translation: false
[2021-04-02 07:35:52] [config] relative-paths: false
[2021-04-02 07:35:52] [config] right-left: false
[2021-04-02 07:35:52] [config] save-freq: 10000
[2021-04-02 07:35:52] [config] seed: 1111
[2021-04-02 07:35:52] [config] sentencepiece-alphas:
[2021-04-02 07:35:52] [config] []
[2021-04-02 07:35:52] [config] sentencepiece-max-lines: 2000000
[2021-04-02 07:35:52] [config] sentencepiece-options: ""
[2021-04-02 07:35:52] [config] shuffle: data
[2021-04-02 07:35:52] [config] shuffle-in-ram: false
[2021-04-02 07:35:52] [config] sigterm: save-and-exit
[2021-04-02 07:35:52] [config] skip: false
[2021-04-02 07:35:52] [config] sqlite: temporary
[2021-04-02 07:35:52] [config] sqlite-drop: false
[2021-04-02 07:35:52] [config] sync-sgd: true
[2021-04-02 07:35:52] [config] tempdir: /run/nvme/job_5389725/data
[2021-04-02 07:35:52] [config] tied-embeddings: false
[2021-04-02 07:35:52] [config] tied-embeddings-all: true
[2021-04-02 07:35:52] [config] tied-embeddings-src: false
[2021-04-02 07:35:52] [config] train-embedder-rank:
[2021-04-02 07:35:52] [config] []
[2021-04-02 07:35:52] [config] train-sets:
[2021-04-02 07:35:52] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz
[2021-04-02 07:35:52] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz
[2021-04-02 07:35:52] [config] transformer-aan-activation: swish
[2021-04-02 07:35:52] [config] transformer-aan-depth: 2
[2021-04-02 07:35:52] [config] transformer-aan-nogate: false
[2021-04-02 07:35:52] [config] transformer-decoder-autoreg: self-attention
[2021-04-02 07:35:52] [config] transformer-depth-scaling: false
[2021-04-02 07:35:52] [config] transformer-dim-aan: 2048
[2021-04-02 07:35:52] [config] transformer-dim-ffn: 2048
[2021-04-02 07:35:52] [config] transformer-dropout: 0.1
[2021-04-02 07:35:52] [config] transformer-dropout-attention: 0
[2021-04-02 07:35:52] [config] transformer-dropout-ffn: 0
[2021-04-02 07:35:52] [config] transformer-ffn-activation: swish
[2021-04-02 07:35:52] [config] transformer-ffn-depth: 2
[2021-04-02 07:35:52] [config] transformer-guided-alignment-layer: last
[2021-04-02 07:35:52] [config] transformer-heads: 8
[2021-04-02 07:35:52] [config] transformer-no-projection: false
[2021-04-02 07:35:52] [config] transformer-pool: false
[2021-04-02 07:35:52] [config] transformer-postprocess: dan
[2021-04-02 07:35:52] [config] transformer-postprocess-emb: d
[2021-04-02 07:35:52] [config] transformer-postprocess-top: ""
[2021-04-02 07:35:52] [config] transformer-preprocess: ""
[2021-04-02 07:35:52] [config] transformer-tied-layers:
[2021-04-02 07:35:52] [config] []
[2021-04-02 07:35:52] [config] transformer-train-position-embeddings: false
[2021-04-02 07:35:52] [config] tsv: false
[2021-04-02 07:35:52] [config] tsv-fields: 0
[2021-04-02 07:35:52] [config] type: transformer
[2021-04-02 07:35:52] [config] ulr: false
[2021-04-02 07:35:52] [config] ulr-dim-emb: 0
[2021-04-02 07:35:52] [config] ulr-dropout: 0
[2021-04-02 07:35:52] [config] ulr-keys-vectors: ""
[2021-04-02 07:35:52] [config] ulr-query-vectors: ""
[2021-04-02 07:35:52] [config] ulr-softmax-temperature: 1
[2021-04-02 07:35:52] [config] ulr-trainable-transformation: false
[2021-04-02 07:35:52] [config] unlikelihood-loss: false
[2021-04-02 07:35:52] [config] valid-freq: 10000
[2021-04-02 07:35:52] [config] valid-log: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log
[2021-04-02 07:35:52] [config] valid-max-length: 1000
[2021-04-02 07:35:52] [config] valid-metrics:
[2021-04-02 07:35:52] [config] - perplexity
[2021-04-02 07:35:52] [config] valid-mini-batch: 16
[2021-04-02 07:35:52] [config] valid-reset-stalled: false
[2021-04-02 07:35:52] [config] valid-script-args:
[2021-04-02 07:35:52] [config] []
[2021-04-02 07:35:52] [config] valid-script-path: ""
[2021-04-02 07:35:52] [config] valid-sets:
[2021-04-02 07:35:52] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k
[2021-04-02 07:35:52] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k
[2021-04-02 07:35:52] [config] valid-translation-output: ""
[2021-04-02 07:35:52] [config] version: v1.10.0 6f6d484 2021-02-06 15:35:16 -0800
[2021-04-02 07:35:52] [config] vocabs:
[2021-04-02 07:35:52] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml
[2021-04-02 07:35:52] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml
[2021-04-02 07:35:52] [config] word-penalty: 0
[2021-04-02 07:35:52] [config] word-scores: false
[2021-04-02 07:35:52] [config] 
workspace: 24000 [2021-04-02 07:35:52] [config] Loaded model has been created with Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-02 07:35:52] Using synchronous SGD [2021-04-02 07:35:52] [data] Loading vocabulary from JSON/Yaml file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-02 07:35:52] [data] Setting vocabulary size for input 0 to 56,521 [2021-04-02 07:35:52] [data] Loading vocabulary from JSON/Yaml file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-02 07:35:52] [data] Setting vocabulary size for input 1 to 56,521 [2021-04-02 07:35:52] [data] Using word alignments from file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz [2021-04-02 07:35:52] [sqlite] Creating temporary database in /run/nvme/job_5389725/data [2021-04-02 07:35:56] [sqlite] Inserted 1000000 lines [2021-04-02 07:35:59] [sqlite] Inserted 2000000 lines [2021-04-02 07:36:04] [sqlite] Inserted 4000000 lines [2021-04-02 07:36:16] [sqlite] Inserted 8000000 lines [2021-04-02 07:36:40] [sqlite] Inserted 16000000 lines [2021-04-02 07:37:26] [sqlite] Inserted 32000000 lines [2021-04-02 07:38:42] [sqlite] Inserted 64000000 lines [2021-04-02 07:39:34] [sqlite] Inserted 82557593 lines [2021-04-02 07:39:34] [sqlite] Creating primary index [2021-04-02 07:40:12] [comm] Compiled without MPI support. 
Running as a single process on r14g04.bullx [2021-04-02 07:40:12] [batching] Collecting statistics for batch fitting with step size 10 [2021-04-02 07:40:21] [memory] Extending reserved space to 24064 MB (device gpu0) [2021-04-02 07:40:21] [memory] Extending reserved space to 24064 MB (device gpu1) [2021-04-02 07:40:22] [memory] Extending reserved space to 24064 MB (device gpu2) [2021-04-02 07:40:22] [memory] Extending reserved space to 24064 MB (device gpu3) [2021-04-02 07:40:22] [comm] Using NCCL 2.8.3 for GPU communication [2021-04-02 07:40:24] [comm] NCCLCommunicator constructed successfully [2021-04-02 07:40:24] [training] Using 4 GPUs [2021-04-02 07:40:24] [logits] Applying loss function for 1 factor(s) [2021-04-02 07:40:24] [memory] Reserving 278 MB, device gpu0 [2021-04-02 07:40:24] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2021-04-02 07:40:24] [memory] Reserving 278 MB, device gpu0 [2021-04-02 07:43:16] [batching] Done. Typical MB size is 58,832 target words [2021-04-02 07:43:17] [memory] Extending reserved space to 24064 MB (device gpu0) [2021-04-02 07:43:17] [memory] Extending reserved space to 24064 MB (device gpu1) [2021-04-02 07:43:17] [memory] Extending reserved space to 24064 MB (device gpu2) [2021-04-02 07:43:17] [memory] Extending reserved space to 24064 MB (device gpu3) [2021-04-02 07:43:17] [comm] Using NCCL 2.8.3 for GPU communication [2021-04-02 07:43:17] [comm] NCCLCommunicator constructed successfully [2021-04-02 07:43:17] [training] Using 4 GPUs [2021-04-02 07:43:17] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-02 07:43:18] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-02 07:43:19] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz 
[2021-04-02 07:43:19] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-02 07:43:20] Loading Adam parameters from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-02 07:43:21] [memory] Reserving 139 MB, device gpu0 [2021-04-02 07:43:21] [memory] Reserving 139 MB, device gpu1 [2021-04-02 07:43:21] [memory] Reserving 139 MB, device gpu2 [2021-04-02 07:43:21] [memory] Reserving 139 MB, device gpu3 [2021-04-02 07:43:21] [training] Model reloaded from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-02 07:43:21] [data] Restoring the corpus state to epoch 1, batch 60000 [2021-04-02 07:43:21] [sqlite] Selecting shuffled data [2021-04-02 08:04:07] Training started [2021-04-02 08:04:07] [training] Batches are processed as 1 process(es) x 4 devices/process [2021-04-02 08:04:07] [memory] Reserving 278 MB, device gpu0 [2021-04-02 08:04:07] [memory] Reserving 278 MB, device gpu1 [2021-04-02 08:04:07] [memory] Reserving 278 MB, device gpu2 [2021-04-02 08:04:07] [memory] Reserving 278 MB, device gpu3 [2021-04-02 08:04:08] [memory] Reserving 278 MB, device gpu0 [2021-04-02 08:04:08] [memory] Reserving 278 MB, device gpu2 [2021-04-02 08:04:08] [memory] Reserving 278 MB, device gpu1 [2021-04-02 08:04:08] [memory] Reserving 278 MB, device gpu3 [2021-04-02 08:04:08] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-02 08:04:10] [memory] Reserving 278 MB, device cpu0 [2021-04-02 08:04:10] [memory] Reserving 69 MB, device gpu0 [2021-04-02 08:04:10] [memory] Reserving 69 MB, device gpu1 [2021-04-02 08:04:10] [memory] Reserving 69 MB, device gpu2 [2021-04-02 08:04:10] [memory] Reserving 69 MB, device gpu3 [2021-04-02 10:08:51] Ep. 1 : Up. 70000 : Sen. 
65,294,576 : Cost 0.35751390 * 1,368,872,489 @ 15,620 after 9,594,025,475 : Time 8734.02s : 21345.11 words/s : L.r. 1.4343e-04 [2021-04-02 10:08:51] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-02 10:08:52] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-02 10:08:54] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-02 10:09:01] [valid] Ep. 1 : Up. 70000 : perplexity : 1.89669 : stalled 5 times (last best: 1.88848) [2021-04-05 16:26:10] [marian] Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-05 16:26:10] [marian] Running on r13g02.bullx as process 126297 with command line: [2021-04-05 16:26:10] [marian] /projappl/project_2001194/marian/build/marian --guided-alignment /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz --early-stopping 15 --valid-freq 10000 --valid-sets /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k --valid-metrics perplexity --valid-mini-batch 16 --valid-log /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log --beam-size 12 --normalize 1 --allow-unk --overwrite --keep-best --model /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz --type transformer --train-sets /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz --max-length 500 --vocabs 
/users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml --mini-batch-fit -w 24000 --maxi-batch 500 --save-freq 10000 --disp-freq 10000 --log /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --fp16 --tied-embeddings-all --devices 0 1 2 3 --sync-sgd --seed 1111 --sqlite --tempdir /run/nvme/job_5404270/data --exponential-smoothing [2021-04-05 16:26:11] [config] after: 0e [2021-04-05 16:26:11] [config] after-batches: 0 [2021-04-05 16:26:11] [config] after-epochs: 0 [2021-04-05 16:26:11] [config] all-caps-every: 0 [2021-04-05 16:26:11] [config] allow-unk: true [2021-04-05 16:26:11] [config] authors: false [2021-04-05 16:26:11] [config] beam-size: 12 [2021-04-05 16:26:11] [config] bert-class-symbol: "[CLS]" [2021-04-05 16:26:11] [config] bert-mask-symbol: "[MASK]" [2021-04-05 16:26:11] [config] bert-masking-fraction: 0.15 [2021-04-05 16:26:11] [config] bert-sep-symbol: "[SEP]" [2021-04-05 16:26:11] [config] bert-train-type-embeddings: true [2021-04-05 16:26:11] [config] bert-type-vocab-size: 2 [2021-04-05 16:26:11] [config] build-info: "" [2021-04-05 16:26:11] [config] cite: false [2021-04-05 16:26:11] [config] clip-norm: 5 [2021-04-05 16:26:11] [config] cost-scaling: [2021-04-05 16:26:11] [config] - 7 [2021-04-05 16:26:11] [config] - 2000 [2021-04-05 16:26:11] [config] - 2 [2021-04-05 16:26:11] [config] - 0.05 [2021-04-05 16:26:11] [config] - 10 [2021-04-05 16:26:11] [config] - 1 [2021-04-05 16:26:11] [config] cost-type: ce-sum [2021-04-05 16:26:11] [config] cpu-threads: 0 [2021-04-05 16:26:11] [config] 
data-weighting: "" [2021-04-05 16:26:11] [config] data-weighting-type: sentence [2021-04-05 16:26:11] [config] dec-cell: gru [2021-04-05 16:26:11] [config] dec-cell-base-depth: 2 [2021-04-05 16:26:11] [config] dec-cell-high-depth: 1 [2021-04-05 16:26:11] [config] dec-depth: 6 [2021-04-05 16:26:11] [config] devices: [2021-04-05 16:26:11] [config] - 0 [2021-04-05 16:26:11] [config] - 1 [2021-04-05 16:26:11] [config] - 2 [2021-04-05 16:26:11] [config] - 3 [2021-04-05 16:26:11] [config] dim-emb: 512 [2021-04-05 16:26:11] [config] dim-rnn: 1024 [2021-04-05 16:26:11] [config] dim-vocabs: [2021-04-05 16:26:11] [config] - 56521 [2021-04-05 16:26:11] [config] - 56521 [2021-04-05 16:26:11] [config] disp-first: 0 [2021-04-05 16:26:11] [config] disp-freq: 10000 [2021-04-05 16:26:11] [config] disp-label-counts: true [2021-04-05 16:26:11] [config] dropout-rnn: 0 [2021-04-05 16:26:11] [config] dropout-src: 0 [2021-04-05 16:26:11] [config] dropout-trg: 0 [2021-04-05 16:26:11] [config] dump-config: "" [2021-04-05 16:26:11] [config] early-stopping: 15 [2021-04-05 16:26:11] [config] embedding-fix-src: false [2021-04-05 16:26:11] [config] embedding-fix-trg: false [2021-04-05 16:26:11] [config] embedding-normalization: false [2021-04-05 16:26:11] [config] embedding-vectors: [2021-04-05 16:26:11] [config] [] [2021-04-05 16:26:11] [config] enc-cell: gru [2021-04-05 16:26:11] [config] enc-cell-depth: 1 [2021-04-05 16:26:11] [config] enc-depth: 6 [2021-04-05 16:26:11] [config] enc-type: bidirectional [2021-04-05 16:26:11] [config] english-title-case-every: 0 [2021-04-05 16:26:11] [config] exponential-smoothing: 0.0001 [2021-04-05 16:26:11] [config] factor-weight: 1 [2021-04-05 16:26:11] [config] grad-dropping-momentum: 0 [2021-04-05 16:26:11] [config] grad-dropping-rate: 0 [2021-04-05 16:26:11] [config] grad-dropping-warmup: 100 [2021-04-05 16:26:11] [config] gradient-checkpointing: false [2021-04-05 16:26:11] [config] guided-alignment: 
/users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz [2021-04-05 16:26:11] [config] guided-alignment-cost: mse [2021-04-05 16:26:11] [config] guided-alignment-weight: 0.1 [2021-04-05 16:26:11] [config] ignore-model-config: false [2021-04-05 16:26:11] [config] input-types: [2021-04-05 16:26:11] [config] [] [2021-04-05 16:26:11] [config] interpolate-env-vars: false [2021-04-05 16:26:11] [config] keep-best: true [2021-04-05 16:26:11] [config] label-smoothing: 0.1 [2021-04-05 16:26:11] [config] layer-normalization: false [2021-04-05 16:26:11] [config] learn-rate: 0.0003 [2021-04-05 16:26:11] [config] lemma-dim-emb: 0 [2021-04-05 16:26:11] [config] log: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log [2021-04-05 16:26:11] [config] log-level: info [2021-04-05 16:26:11] [config] log-time-zone: "" [2021-04-05 16:26:11] [config] logical-epoch: [2021-04-05 16:26:11] [config] - 1e [2021-04-05 16:26:11] [config] - 0 [2021-04-05 16:26:11] [config] lr-decay: 0 [2021-04-05 16:26:11] [config] lr-decay-freq: 50000 [2021-04-05 16:26:11] [config] lr-decay-inv-sqrt: [2021-04-05 16:26:11] [config] - 16000 [2021-04-05 16:26:11] [config] lr-decay-repeat-warmup: false [2021-04-05 16:26:11] [config] lr-decay-reset-optimizer: false [2021-04-05 16:26:11] [config] lr-decay-start: [2021-04-05 16:26:11] [config] - 10 [2021-04-05 16:26:11] [config] - 1 [2021-04-05 16:26:11] [config] lr-decay-strategy: epoch+stalled [2021-04-05 16:26:11] [config] lr-report: true [2021-04-05 16:26:11] [config] lr-warmup: 16000 [2021-04-05 16:26:11] [config] lr-warmup-at-reload: false [2021-04-05 16:26:11] [config] lr-warmup-cycle: false [2021-04-05 16:26:11] [config] lr-warmup-start-rate: 0 [2021-04-05 16:26:11] [config] max-length: 500 [2021-04-05 16:26:11] [config] max-length-crop: false [2021-04-05 16:26:11] [config] max-length-factor: 3 [2021-04-05 16:26:11] [config] maxi-batch: 500 [2021-04-05 
16:26:11] [config] maxi-batch-sort: trg [2021-04-05 16:26:11] [config] mini-batch: 64 [2021-04-05 16:26:11] [config] mini-batch-fit: true [2021-04-05 16:26:11] [config] mini-batch-fit-step: 10 [2021-04-05 16:26:11] [config] mini-batch-track-lr: false [2021-04-05 16:26:11] [config] mini-batch-warmup: 0 [2021-04-05 16:26:11] [config] mini-batch-words: 0 [2021-04-05 16:26:11] [config] mini-batch-words-ref: 0 [2021-04-05 16:26:11] [config] model: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-05 16:26:11] [config] multi-loss-type: sum [2021-04-05 16:26:11] [config] multi-node: false [2021-04-05 16:26:11] [config] multi-node-overlap: true [2021-04-05 16:26:11] [config] n-best: false [2021-04-05 16:26:11] [config] no-nccl: false [2021-04-05 16:26:11] [config] no-reload: false [2021-04-05 16:26:11] [config] no-restore-corpus: false [2021-04-05 16:26:11] [config] normalize: 1 [2021-04-05 16:26:11] [config] normalize-gradient: false [2021-04-05 16:26:11] [config] num-devices: 0 [2021-04-05 16:26:11] [config] optimizer: adam [2021-04-05 16:26:11] [config] optimizer-delay: 1 [2021-04-05 16:26:11] [config] optimizer-params: [2021-04-05 16:26:11] [config] - 0.9 [2021-04-05 16:26:11] [config] - 0.98 [2021-04-05 16:26:11] [config] - 1e-09 [2021-04-05 16:26:11] [config] output-omit-bias: false [2021-04-05 16:26:11] [config] overwrite: true [2021-04-05 16:26:11] [config] precision: [2021-04-05 16:26:11] [config] - float16 [2021-04-05 16:26:11] [config] - float32 [2021-04-05 16:26:11] [config] - float32 [2021-04-05 16:26:11] [config] pretrained-model: "" [2021-04-05 16:26:11] [config] quantize-biases: false [2021-04-05 16:26:11] [config] quantize-bits: 0 [2021-04-05 16:26:11] [config] quantize-log-based: false [2021-04-05 16:26:11] [config] quantize-optimization-steps: 0 [2021-04-05 16:26:11] [config] quiet: false [2021-04-05 16:26:11] [config] quiet-translation: false [2021-04-05 16:26:11] [config] 
relative-paths: false [2021-04-05 16:26:11] [config] right-left: false [2021-04-05 16:26:11] [config] save-freq: 10000 [2021-04-05 16:26:11] [config] seed: 1111 [2021-04-05 16:26:11] [config] sentencepiece-alphas: [2021-04-05 16:26:11] [config] [] [2021-04-05 16:26:11] [config] sentencepiece-max-lines: 2000000 [2021-04-05 16:26:11] [config] sentencepiece-options: "" [2021-04-05 16:26:11] [config] shuffle: data [2021-04-05 16:26:11] [config] shuffle-in-ram: false [2021-04-05 16:26:11] [config] sigterm: save-and-exit [2021-04-05 16:26:11] [config] skip: false [2021-04-05 16:26:11] [config] sqlite: temporary [2021-04-05 16:26:11] [config] sqlite-drop: false [2021-04-05 16:26:11] [config] sync-sgd: true [2021-04-05 16:26:11] [config] tempdir: /run/nvme/job_5404270/data [2021-04-05 16:26:11] [config] tied-embeddings: false [2021-04-05 16:26:11] [config] tied-embeddings-all: true [2021-04-05 16:26:11] [config] tied-embeddings-src: false [2021-04-05 16:26:11] [config] train-embedder-rank: [2021-04-05 16:26:11] [config] [] [2021-04-05 16:26:11] [config] train-sets: [2021-04-05 16:26:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz [2021-04-05 16:26:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz [2021-04-05 16:26:11] [config] transformer-aan-activation: swish [2021-04-05 16:26:11] [config] transformer-aan-depth: 2 [2021-04-05 16:26:11] [config] transformer-aan-nogate: false [2021-04-05 16:26:11] [config] transformer-decoder-autoreg: self-attention [2021-04-05 16:26:11] [config] transformer-depth-scaling: false [2021-04-05 16:26:11] [config] transformer-dim-aan: 2048 [2021-04-05 16:26:11] [config] transformer-dim-ffn: 2048 [2021-04-05 16:26:11] [config] transformer-dropout: 0.1 [2021-04-05 16:26:11] [config] transformer-dropout-attention: 0 [2021-04-05 16:26:11] [config] transformer-dropout-ffn: 0 [2021-04-05 16:26:11] [config] transformer-ffn-activation: 
swish [2021-04-05 16:26:11] [config] transformer-ffn-depth: 2 [2021-04-05 16:26:11] [config] transformer-guided-alignment-layer: last [2021-04-05 16:26:11] [config] transformer-heads: 8 [2021-04-05 16:26:11] [config] transformer-no-projection: false [2021-04-05 16:26:11] [config] transformer-pool: false [2021-04-05 16:26:11] [config] transformer-postprocess: dan [2021-04-05 16:26:11] [config] transformer-postprocess-emb: d [2021-04-05 16:26:11] [config] transformer-postprocess-top: "" [2021-04-05 16:26:11] [config] transformer-preprocess: "" [2021-04-05 16:26:11] [config] transformer-tied-layers: [2021-04-05 16:26:11] [config] [] [2021-04-05 16:26:11] [config] transformer-train-position-embeddings: false [2021-04-05 16:26:11] [config] tsv: false [2021-04-05 16:26:11] [config] tsv-fields: 0 [2021-04-05 16:26:11] [config] type: transformer [2021-04-05 16:26:11] [config] ulr: false [2021-04-05 16:26:11] [config] ulr-dim-emb: 0 [2021-04-05 16:26:11] [config] ulr-dropout: 0 [2021-04-05 16:26:11] [config] ulr-keys-vectors: "" [2021-04-05 16:26:11] [config] ulr-query-vectors: "" [2021-04-05 16:26:11] [config] ulr-softmax-temperature: 1 [2021-04-05 16:26:11] [config] ulr-trainable-transformation: false [2021-04-05 16:26:11] [config] unlikelihood-loss: false [2021-04-05 16:26:11] [config] valid-freq: 10000 [2021-04-05 16:26:11] [config] valid-log: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log [2021-04-05 16:26:11] [config] valid-max-length: 1000 [2021-04-05 16:26:11] [config] valid-metrics: [2021-04-05 16:26:11] [config] - perplexity [2021-04-05 16:26:11] [config] valid-mini-batch: 16 [2021-04-05 16:26:11] [config] valid-reset-stalled: false [2021-04-05 16:26:11] [config] valid-script-args: [2021-04-05 16:26:11] [config] [] [2021-04-05 16:26:11] [config] valid-script-path: "" [2021-04-05 16:26:11] [config] valid-sets: [2021-04-05 16:26:11] [config] - 
/users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k [2021-04-05 16:26:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k [2021-04-05 16:26:11] [config] valid-translation-output: "" [2021-04-05 16:26:11] [config] version: v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-05 16:26:11] [config] vocabs: [2021-04-05 16:26:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-05 16:26:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-05 16:26:11] [config] word-penalty: 0 [2021-04-05 16:26:11] [config] word-scores: false [2021-04-05 16:26:11] [config] workspace: 24000 [2021-04-05 16:26:11] [config] Loaded model has been created with Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-05 16:26:11] Using synchronous SGD [2021-04-05 16:26:11] [data] Loading vocabulary from JSON/Yaml file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-05 16:26:11] [data] Setting vocabulary size for input 0 to 56,521 [2021-04-05 16:26:11] [data] Loading vocabulary from JSON/Yaml file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-05 16:26:12] [data] Setting vocabulary size for input 1 to 56,521 [2021-04-05 16:26:12] [data] Using word alignments from file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz [2021-04-05 16:26:12] [sqlite] Creating temporary database in /run/nvme/job_5404270/data [2021-04-05 16:26:15] [sqlite] Inserted 1000000 lines [2021-04-05 16:26:18] [sqlite] Inserted 2000000 lines [2021-04-05 16:26:24] [sqlite] Inserted 4000000 lines [2021-04-05 16:26:36] [sqlite] Inserted 8000000 lines [2021-04-05 16:26:59] [sqlite] Inserted 16000000 lines [2021-04-05 16:27:46] [sqlite] Inserted 32000000 lines 
[2021-04-05 16:29:02] [sqlite] Inserted 64000000 lines [2021-04-05 16:29:54] [sqlite] Inserted 82557593 lines [2021-04-05 16:29:54] [sqlite] Creating primary index [2021-04-05 16:30:32] [comm] Compiled without MPI support. Running as a single process on r13g02.bullx [2021-04-05 16:30:32] [batching] Collecting statistics for batch fitting with step size 10 [2021-04-05 16:30:40] [memory] Extending reserved space to 24064 MB (device gpu0) [2021-04-05 16:30:41] [memory] Extending reserved space to 24064 MB (device gpu1) [2021-04-05 16:30:41] [memory] Extending reserved space to 24064 MB (device gpu2) [2021-04-05 16:30:42] [memory] Extending reserved space to 24064 MB (device gpu3) [2021-04-05 16:30:42] [comm] Using NCCL 2.8.3 for GPU communication [2021-04-05 16:30:43] [comm] NCCLCommunicator constructed successfully [2021-04-05 16:30:43] [training] Using 4 GPUs [2021-04-05 16:30:43] [logits] Applying loss function for 1 factor(s) [2021-04-05 16:30:43] [memory] Reserving 278 MB, device gpu0 [2021-04-05 16:30:44] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2021-04-05 16:30:44] [memory] Reserving 278 MB, device gpu0 [2021-04-05 16:33:39] [batching] Done. 
Typical MB size is 58,832 target words [2021-04-05 16:33:39] [memory] Extending reserved space to 24064 MB (device gpu0) [2021-04-05 16:33:39] [memory] Extending reserved space to 24064 MB (device gpu1) [2021-04-05 16:33:39] [memory] Extending reserved space to 24064 MB (device gpu2) [2021-04-05 16:33:39] [memory] Extending reserved space to 24064 MB (device gpu3) [2021-04-05 16:33:39] [comm] Using NCCL 2.8.3 for GPU communication [2021-04-05 16:33:41] [comm] NCCLCommunicator constructed successfully [2021-04-05 16:33:41] [training] Using 4 GPUs [2021-04-05 16:33:41] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-05 16:33:42] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-05 16:33:42] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-05 16:33:42] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-05 16:33:43] Loading Adam parameters from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-05 16:33:44] [memory] Reserving 139 MB, device gpu0 [2021-04-05 16:33:44] [memory] Reserving 139 MB, device gpu1 [2021-04-05 16:33:44] [memory] Reserving 139 MB, device gpu2 [2021-04-05 16:33:44] [memory] Reserving 139 MB, device gpu3 [2021-04-05 16:33:44] [training] Model reloaded from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-05 16:33:44] [data] Restoring the corpus state to epoch 1, batch 70000 [2021-04-05 16:33:44] [sqlite] Selecting shuffled data [2021-04-05 16:57:30] Training started [2021-04-05 16:57:30] [training] Batches are 
processed as 1 process(es) x 4 devices/process [2021-04-05 16:57:30] [memory] Reserving 278 MB, device gpu0 [2021-04-05 16:57:30] [memory] Reserving 278 MB, device gpu1 [2021-04-05 16:57:30] [memory] Reserving 278 MB, device gpu3 [2021-04-05 16:57:30] [memory] Reserving 278 MB, device gpu2 [2021-04-05 16:57:30] [memory] Reserving 278 MB, device gpu0 [2021-04-05 16:57:30] [memory] Reserving 278 MB, device gpu1 [2021-04-05 16:57:30] [memory] Reserving 278 MB, device gpu3 [2021-04-05 16:57:30] [memory] Reserving 278 MB, device gpu2 [2021-04-05 16:57:30] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-05 16:57:32] [memory] Reserving 278 MB, device cpu0 [2021-04-05 16:57:32] [memory] Reserving 69 MB, device gpu0 [2021-04-05 16:57:32] [memory] Reserving 69 MB, device gpu1 [2021-04-05 16:57:32] [memory] Reserving 69 MB, device gpu2 [2021-04-05 16:57:32] [memory] Reserving 69 MB, device gpu3 [2021-04-05 19:03:26] Ep. 1 : Up. 80000 : Sen. 74,630,934 : Cost 0.35642993 * 1,371,748,422 @ 19,739 after 10,965,773,897 : Time 8987.29s : 20784.50 words/s : L.r. 1.3416e-04 [2021-04-05 19:03:26] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-05 19:03:28] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-05 19:03:30] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-05 19:03:37] [valid] Ep. 1 : Up. 
80000 : perplexity : 1.89474 : stalled 5 times (last best: 1.88848) [2021-04-05 22:13:04] [marian] Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-05 22:13:04] [marian] Running on r15g03.bullx as process 30651 with command line: [2021-04-05 22:13:04] [marian] /projappl/project_2001194/marian/build/marian --guided-alignment /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz --early-stopping 15 --valid-freq 10000 --valid-sets /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k --valid-metrics perplexity --valid-mini-batch 16 --valid-log /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log --beam-size 12 --normalize 1 --allow-unk --overwrite --keep-best --model /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz --type transformer --train-sets /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz --max-length 500 --vocabs /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml --mini-batch-fit -w 24000 --maxi-batch 500 --save-freq 10000 --disp-freq 10000 --log /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --fp16 --tied-embeddings-all --devices 0 1 2 3 --sync-sgd --seed 1111 
--sqlite --tempdir /run/nvme/job_5405251/data --exponential-smoothing [2021-04-05 22:13:06] [config] after: 0e [2021-04-05 22:13:06] [config] after-batches: 0 [2021-04-05 22:13:06] [config] after-epochs: 0 [2021-04-05 22:13:06] [config] all-caps-every: 0 [2021-04-05 22:13:06] [config] allow-unk: true [2021-04-05 22:13:06] [config] authors: false [2021-04-05 22:13:06] [config] beam-size: 12 [2021-04-05 22:13:06] [config] bert-class-symbol: "[CLS]" [2021-04-05 22:13:06] [config] bert-mask-symbol: "[MASK]" [2021-04-05 22:13:06] [config] bert-masking-fraction: 0.15 [2021-04-05 22:13:06] [config] bert-sep-symbol: "[SEP]" [2021-04-05 22:13:06] [config] bert-train-type-embeddings: true [2021-04-05 22:13:06] [config] bert-type-vocab-size: 2 [2021-04-05 22:13:06] [config] build-info: "" [2021-04-05 22:13:06] [config] cite: false [2021-04-05 22:13:06] [config] clip-norm: 5 [2021-04-05 22:13:06] [config] cost-scaling: [2021-04-05 22:13:06] [config] - 7 [2021-04-05 22:13:06] [config] - 2000 [2021-04-05 22:13:06] [config] - 2 [2021-04-05 22:13:06] [config] - 0.05 [2021-04-05 22:13:06] [config] - 10 [2021-04-05 22:13:06] [config] - 1 [2021-04-05 22:13:06] [config] cost-type: ce-sum [2021-04-05 22:13:06] [config] cpu-threads: 0 [2021-04-05 22:13:06] [config] data-weighting: "" [2021-04-05 22:13:06] [config] data-weighting-type: sentence [2021-04-05 22:13:06] [config] dec-cell: gru [2021-04-05 22:13:06] [config] dec-cell-base-depth: 2 [2021-04-05 22:13:06] [config] dec-cell-high-depth: 1 [2021-04-05 22:13:06] [config] dec-depth: 6 [2021-04-05 22:13:06] [config] devices: [2021-04-05 22:13:06] [config] - 0 [2021-04-05 22:13:06] [config] - 1 [2021-04-05 22:13:06] [config] - 2 [2021-04-05 22:13:06] [config] - 3 [2021-04-05 22:13:06] [config] dim-emb: 512 [2021-04-05 22:13:06] [config] dim-rnn: 1024 [2021-04-05 22:13:06] [config] dim-vocabs: [2021-04-05 22:13:06] [config] - 56521 [2021-04-05 22:13:06] [config] - 56521 [2021-04-05 22:13:06] [config] disp-first: 0 [2021-04-05 22:13:06] 
[config] disp-freq: 10000 [2021-04-05 22:13:06] [config] disp-label-counts: true [2021-04-05 22:13:06] [config] dropout-rnn: 0 [2021-04-05 22:13:06] [config] dropout-src: 0 [2021-04-05 22:13:06] [config] dropout-trg: 0 [2021-04-05 22:13:06] [config] dump-config: "" [2021-04-05 22:13:06] [config] early-stopping: 15 [2021-04-05 22:13:06] [config] embedding-fix-src: false [2021-04-05 22:13:06] [config] embedding-fix-trg: false [2021-04-05 22:13:06] [config] embedding-normalization: false [2021-04-05 22:13:06] [config] embedding-vectors: [2021-04-05 22:13:06] [config] [] [2021-04-05 22:13:06] [config] enc-cell: gru [2021-04-05 22:13:06] [config] enc-cell-depth: 1 [2021-04-05 22:13:06] [config] enc-depth: 6 [2021-04-05 22:13:06] [config] enc-type: bidirectional [2021-04-05 22:13:06] [config] english-title-case-every: 0 [2021-04-05 22:13:06] [config] exponential-smoothing: 0.0001 [2021-04-05 22:13:06] [config] factor-weight: 1 [2021-04-05 22:13:06] [config] grad-dropping-momentum: 0 [2021-04-05 22:13:06] [config] grad-dropping-rate: 0 [2021-04-05 22:13:06] [config] grad-dropping-warmup: 100 [2021-04-05 22:13:06] [config] gradient-checkpointing: false [2021-04-05 22:13:06] [config] guided-alignment: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz [2021-04-05 22:13:06] [config] guided-alignment-cost: mse [2021-04-05 22:13:06] [config] guided-alignment-weight: 0.1 [2021-04-05 22:13:06] [config] ignore-model-config: false [2021-04-05 22:13:06] [config] input-types: [2021-04-05 22:13:06] [config] [] [2021-04-05 22:13:06] [config] interpolate-env-vars: false [2021-04-05 22:13:06] [config] keep-best: true [2021-04-05 22:13:06] [config] label-smoothing: 0.1 [2021-04-05 22:13:06] [config] layer-normalization: false [2021-04-05 22:13:06] [config] learn-rate: 0.0003 [2021-04-05 22:13:06] [config] lemma-dim-emb: 0 [2021-04-05 22:13:06] [config] log: 
/users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log [2021-04-05 22:13:06] [config] log-level: info [2021-04-05 22:13:06] [config] log-time-zone: "" [2021-04-05 22:13:06] [config] logical-epoch: [2021-04-05 22:13:06] [config] - 1e [2021-04-05 22:13:06] [config] - 0 [2021-04-05 22:13:06] [config] lr-decay: 0 [2021-04-05 22:13:06] [config] lr-decay-freq: 50000 [2021-04-05 22:13:06] [config] lr-decay-inv-sqrt: [2021-04-05 22:13:06] [config] - 16000 [2021-04-05 22:13:06] [config] lr-decay-repeat-warmup: false [2021-04-05 22:13:06] [config] lr-decay-reset-optimizer: false [2021-04-05 22:13:06] [config] lr-decay-start: [2021-04-05 22:13:06] [config] - 10 [2021-04-05 22:13:06] [config] - 1 [2021-04-05 22:13:06] [config] lr-decay-strategy: epoch+stalled [2021-04-05 22:13:06] [config] lr-report: true [2021-04-05 22:13:06] [config] lr-warmup: 16000 [2021-04-05 22:13:06] [config] lr-warmup-at-reload: false [2021-04-05 22:13:06] [config] lr-warmup-cycle: false [2021-04-05 22:13:06] [config] lr-warmup-start-rate: 0 [2021-04-05 22:13:06] [config] max-length: 500 [2021-04-05 22:13:06] [config] max-length-crop: false [2021-04-05 22:13:06] [config] max-length-factor: 3 [2021-04-05 22:13:06] [config] maxi-batch: 500 [2021-04-05 22:13:06] [config] maxi-batch-sort: trg [2021-04-05 22:13:06] [config] mini-batch: 64 [2021-04-05 22:13:06] [config] mini-batch-fit: true [2021-04-05 22:13:06] [config] mini-batch-fit-step: 10 [2021-04-05 22:13:06] [config] mini-batch-track-lr: false [2021-04-05 22:13:06] [config] mini-batch-warmup: 0 [2021-04-05 22:13:06] [config] mini-batch-words: 0 [2021-04-05 22:13:06] [config] mini-batch-words-ref: 0 [2021-04-05 22:13:06] [config] model: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-05 22:13:06] [config] multi-loss-type: sum [2021-04-05 22:13:06] [config] multi-node: false [2021-04-05 22:13:06] [config] multi-node-overlap: 
true [2021-04-05 22:13:06] [config] n-best: false [2021-04-05 22:13:06] [config] no-nccl: false [2021-04-05 22:13:06] [config] no-reload: false [2021-04-05 22:13:06] [config] no-restore-corpus: false [2021-04-05 22:13:06] [config] normalize: 1 [2021-04-05 22:13:06] [config] normalize-gradient: false [2021-04-05 22:13:06] [config] num-devices: 0 [2021-04-05 22:13:06] [config] optimizer: adam [2021-04-05 22:13:06] [config] optimizer-delay: 1 [2021-04-05 22:13:06] [config] optimizer-params: [2021-04-05 22:13:06] [config] - 0.9 [2021-04-05 22:13:06] [config] - 0.98 [2021-04-05 22:13:06] [config] - 1e-09 [2021-04-05 22:13:06] [config] output-omit-bias: false [2021-04-05 22:13:06] [config] overwrite: true [2021-04-05 22:13:06] [config] precision: [2021-04-05 22:13:06] [config] - float16 [2021-04-05 22:13:06] [config] - float32 [2021-04-05 22:13:06] [config] - float32 [2021-04-05 22:13:06] [config] pretrained-model: "" [2021-04-05 22:13:06] [config] quantize-biases: false [2021-04-05 22:13:06] [config] quantize-bits: 0 [2021-04-05 22:13:06] [config] quantize-log-based: false [2021-04-05 22:13:06] [config] quantize-optimization-steps: 0 [2021-04-05 22:13:06] [config] quiet: false [2021-04-05 22:13:06] [config] quiet-translation: false [2021-04-05 22:13:06] [config] relative-paths: false [2021-04-05 22:13:06] [config] right-left: false [2021-04-05 22:13:06] [config] save-freq: 10000 [2021-04-05 22:13:06] [config] seed: 1111 [2021-04-05 22:13:06] [config] sentencepiece-alphas: [2021-04-05 22:13:06] [config] [] [2021-04-05 22:13:06] [config] sentencepiece-max-lines: 2000000 [2021-04-05 22:13:06] [config] sentencepiece-options: "" [2021-04-05 22:13:06] [config] shuffle: data [2021-04-05 22:13:06] [config] shuffle-in-ram: false [2021-04-05 22:13:06] [config] sigterm: save-and-exit [2021-04-05 22:13:06] [config] skip: false [2021-04-05 22:13:06] [config] sqlite: temporary [2021-04-05 22:13:06] [config] sqlite-drop: false [2021-04-05 22:13:06] [config] sync-sgd: true [2021-04-05 
22:13:06] [config] tempdir: /run/nvme/job_5405251/data [2021-04-05 22:13:06] [config] tied-embeddings: false [2021-04-05 22:13:06] [config] tied-embeddings-all: true [2021-04-05 22:13:06] [config] tied-embeddings-src: false [2021-04-05 22:13:06] [config] train-embedder-rank: [2021-04-05 22:13:06] [config] [] [2021-04-05 22:13:06] [config] train-sets: [2021-04-05 22:13:06] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz [2021-04-05 22:13:06] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz [2021-04-05 22:13:06] [config] transformer-aan-activation: swish [2021-04-05 22:13:06] [config] transformer-aan-depth: 2 [2021-04-05 22:13:06] [config] transformer-aan-nogate: false [2021-04-05 22:13:06] [config] transformer-decoder-autoreg: self-attention [2021-04-05 22:13:06] [config] transformer-depth-scaling: false [2021-04-05 22:13:06] [config] transformer-dim-aan: 2048 [2021-04-05 22:13:06] [config] transformer-dim-ffn: 2048 [2021-04-05 22:13:06] [config] transformer-dropout: 0.1 [2021-04-05 22:13:06] [config] transformer-dropout-attention: 0 [2021-04-05 22:13:06] [config] transformer-dropout-ffn: 0 [2021-04-05 22:13:06] [config] transformer-ffn-activation: swish [2021-04-05 22:13:06] [config] transformer-ffn-depth: 2 [2021-04-05 22:13:06] [config] transformer-guided-alignment-layer: last [2021-04-05 22:13:06] [config] transformer-heads: 8 [2021-04-05 22:13:06] [config] transformer-no-projection: false [2021-04-05 22:13:06] [config] transformer-pool: false [2021-04-05 22:13:06] [config] transformer-postprocess: dan [2021-04-05 22:13:06] [config] transformer-postprocess-emb: d [2021-04-05 22:13:06] [config] transformer-postprocess-top: "" [2021-04-05 22:13:06] [config] transformer-preprocess: "" [2021-04-05 22:13:06] [config] transformer-tied-layers: [2021-04-05 22:13:06] [config] [] [2021-04-05 22:13:06] [config] transformer-train-position-embeddings: false 
[2021-04-05 22:13:06] [config] tsv: false [2021-04-05 22:13:06] [config] tsv-fields: 0 [2021-04-05 22:13:06] [config] type: transformer [2021-04-05 22:13:06] [config] ulr: false [2021-04-05 22:13:06] [config] ulr-dim-emb: 0 [2021-04-05 22:13:06] [config] ulr-dropout: 0 [2021-04-05 22:13:06] [config] ulr-keys-vectors: "" [2021-04-05 22:13:06] [config] ulr-query-vectors: "" [2021-04-05 22:13:06] [config] ulr-softmax-temperature: 1 [2021-04-05 22:13:06] [config] ulr-trainable-transformation: false [2021-04-05 22:13:06] [config] unlikelihood-loss: false [2021-04-05 22:13:06] [config] valid-freq: 10000 [2021-04-05 22:13:06] [config] valid-log: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log [2021-04-05 22:13:06] [config] valid-max-length: 1000 [2021-04-05 22:13:06] [config] valid-metrics: [2021-04-05 22:13:06] [config] - perplexity [2021-04-05 22:13:06] [config] valid-mini-batch: 16 [2021-04-05 22:13:06] [config] valid-reset-stalled: false [2021-04-05 22:13:06] [config] valid-script-args: [2021-04-05 22:13:06] [config] [] [2021-04-05 22:13:06] [config] valid-script-path: "" [2021-04-05 22:13:06] [config] valid-sets: [2021-04-05 22:13:06] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k [2021-04-05 22:13:06] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k [2021-04-05 22:13:06] [config] valid-translation-output: "" [2021-04-05 22:13:06] [config] version: v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-05 22:13:06] [config] vocabs: [2021-04-05 22:13:06] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-05 22:13:06] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-05 22:13:06] [config] word-penalty: 0 [2021-04-05 22:13:06] [config] word-scores: false [2021-04-05 22:13:06] [config] 
workspace: 24000 [2021-04-05 22:13:06] [config] Loaded model has been created with Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-05 22:13:06] Using synchronous SGD [2021-04-05 22:13:06] [data] Loading vocabulary from JSON/Yaml file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-05 22:13:06] [data] Setting vocabulary size for input 0 to 56,521 [2021-04-05 22:13:06] [data] Loading vocabulary from JSON/Yaml file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-05 22:13:07] [data] Setting vocabulary size for input 1 to 56,521 [2021-04-05 22:13:07] [data] Using word alignments from file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz [2021-04-05 22:13:07] [sqlite] Creating temporary database in /run/nvme/job_5405251/data [2021-04-05 22:13:10] [sqlite] Inserted 1000000 lines [2021-04-05 22:13:13] [sqlite] Inserted 2000000 lines [2021-04-05 22:13:19] [sqlite] Inserted 4000000 lines [2021-04-05 22:13:31] [sqlite] Inserted 8000000 lines [2021-04-05 22:13:55] [sqlite] Inserted 16000000 lines [2021-04-05 22:14:42] [sqlite] Inserted 32000000 lines [2021-04-05 22:15:59] [sqlite] Inserted 64000000 lines [2021-04-05 22:16:51] [sqlite] Inserted 82557593 lines [2021-04-05 22:16:51] [sqlite] Creating primary index [2021-04-05 22:17:29] [comm] Compiled without MPI support. 
Running as a single process on r15g03.bullx [2021-04-05 22:17:29] [batching] Collecting statistics for batch fitting with step size 10 [2021-04-05 22:17:36] [memory] Extending reserved space to 24064 MB (device gpu0) [2021-04-05 22:17:37] [memory] Extending reserved space to 24064 MB (device gpu1) [2021-04-05 22:17:37] [memory] Extending reserved space to 24064 MB (device gpu2) [2021-04-05 22:17:38] [memory] Extending reserved space to 24064 MB (device gpu3) [2021-04-05 22:17:38] [comm] Using NCCL 2.8.3 for GPU communication [2021-04-05 22:17:38] [comm] NCCLCommunicator constructed successfully [2021-04-05 22:17:38] [training] Using 4 GPUs [2021-04-05 22:17:39] [logits] Applying loss function for 1 factor(s) [2021-04-05 22:17:39] [memory] Reserving 278 MB, device gpu0 [2021-04-05 22:17:39] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2021-04-05 22:17:39] [memory] Reserving 278 MB, device gpu0 [2021-04-05 22:20:31] [batching] Done. Typical MB size is 58,832 target words [2021-04-05 22:20:32] [memory] Extending reserved space to 24064 MB (device gpu0) [2021-04-05 22:20:32] [memory] Extending reserved space to 24064 MB (device gpu1) [2021-04-05 22:20:32] [memory] Extending reserved space to 24064 MB (device gpu2) [2021-04-05 22:20:32] [memory] Extending reserved space to 24064 MB (device gpu3) [2021-04-05 22:20:32] [comm] Using NCCL 2.8.3 for GPU communication [2021-04-05 22:20:34] [comm] NCCLCommunicator constructed successfully [2021-04-05 22:20:34] [training] Using 4 GPUs [2021-04-05 22:20:34] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-05 22:20:34] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-05 22:20:35] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz 
[2021-04-05 22:20:35] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-05 22:20:36] Loading Adam parameters from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-05 22:20:37] [memory] Reserving 139 MB, device gpu0 [2021-04-05 22:20:37] [memory] Reserving 139 MB, device gpu1 [2021-04-05 22:20:37] [memory] Reserving 139 MB, device gpu2 [2021-04-05 22:20:37] [memory] Reserving 139 MB, device gpu3 [2021-04-05 22:20:37] [training] Model reloaded from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-05 22:20:37] [data] Restoring the corpus state to epoch 1, batch 80000 [2021-04-05 22:20:37] [sqlite] Selecting shuffled data [2021-04-05 22:48:04] Training started [2021-04-05 22:48:04] [training] Batches are processed as 1 process(es) x 4 devices/process [2021-04-05 22:48:04] [memory] Reserving 278 MB, device gpu0 [2021-04-05 22:48:04] [memory] Reserving 278 MB, device gpu1 [2021-04-05 22:48:04] [memory] Reserving 278 MB, device gpu3 [2021-04-05 22:48:04] [memory] Reserving 278 MB, device gpu2 [2021-04-05 22:48:05] [memory] Reserving 278 MB, device gpu0 [2021-04-05 22:48:05] [memory] Reserving 278 MB, device gpu1 [2021-04-05 22:48:05] [memory] Reserving 278 MB, device gpu3 [2021-04-05 22:48:05] [memory] Reserving 278 MB, device gpu2 [2021-04-05 22:48:05] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-05 22:48:07] [memory] Reserving 278 MB, device cpu0 [2021-04-05 22:48:07] [memory] Reserving 69 MB, device gpu0 [2021-04-05 22:48:07] [memory] Reserving 69 MB, device gpu1 [2021-04-05 22:48:07] [memory] Reserving 69 MB, device gpu2 [2021-04-05 22:48:07] [memory] Reserving 69 MB, device gpu3 [2021-04-06 00:34:20] Seen 82557593 samples 
[2021-04-06 00:34:20] Starting data epoch 2 in logical epoch 2 [2021-04-06 00:34:20] [sqlite] Selecting shuffled data [2021-04-06 00:54:23] Ep. 2 : Up. 90000 : Sen. 1,395,076 : Cost 0.35459033 * 1,372,777,191 @ 21,989 after 12,338,551,088 : Time 9231.02s : 20198.31 words/s : L.r. 1.2649e-04 [2021-04-06 00:54:23] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 00:54:24] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 00:54:26] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 00:54:33] [valid] Ep. 2 : Up. 90000 : perplexity : 1.89394 : stalled 5 times (last best: 1.88848) [2021-04-06 02:59:51] Ep. 2 : Up. 100000 : Sen. 10,747,333 : Cost 0.35472596 * 1,371,741,899 @ 21,833 after 13,710,292,987 : Time 7528.44s : 24841.14 words/s : L.r. 1.2000e-04 [2021-04-06 02:59:51] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 02:59:53] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 02:59:55] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 03:00:01] [valid] Ep. 2 : Up. 100000 : perplexity : 1.8925 : stalled 6 times (last best: 1.88848) [2021-04-06 05:05:08] Ep. 2 : Up. 110000 : Sen. 20,079,143 : Cost 0.35357547 * 1,371,728,991 @ 15,078 after 15,082,021,978 : Time 7516.27s : 24835.99 words/s : L.r. 
1.1442e-04 [2021-04-06 05:05:08] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 05:05:09] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 05:05:11] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 05:05:17] [valid] Ep. 2 : Up. 110000 : perplexity : 1.89135 : stalled 7 times (last best: 1.88848) [2021-04-06 07:10:14] Ep. 2 : Up. 120000 : Sen. 29,409,440 : Cost 0.35357752 * 1,369,404,264 @ 26,466 after 16,451,426,242 : Time 7505.95s : 24861.68 words/s : L.r. 1.0954e-04 [2021-04-06 07:10:14] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 07:10:15] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 07:10:17] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 07:10:23] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-06 07:10:24] [valid] Ep. 2 : Up. 120000 : perplexity : 1.88845 : new best [2021-04-06 09:14:54] Ep. 2 : Up. 130000 : Sen. 38,697,156 : Cost 0.35267699 * 1,364,707,125 @ 17,640 after 17,816,133,367 : Time 7480.50s : 24823.64 words/s : L.r. 
1.0525e-04 [2021-04-06 09:14:54] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 09:14:56] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 09:14:58] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 09:15:04] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-06 09:15:05] [valid] Ep. 2 : Up. 130000 : perplexity : 1.88556 : new best [2021-04-06 11:20:10] Ep. 2 : Up. 140000 : Sen. 48,017,614 : Cost 0.35200080 * 1,371,359,175 @ 16,513 after 19,187,492,542 : Time 7516.15s : 24816.42 words/s : L.r. 1.0142e-04 [2021-04-06 11:20:11] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 11:20:13] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 11:20:14] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 11:20:21] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-06 11:20:22] [valid] Ep. 2 : Up. 140000 : perplexity : 1.88312 : new best [2021-04-06 13:25:24] Ep. 2 : Up. 150000 : Sen. 
57,344,000 : Cost 0.35159644 * 1,371,818,233 @ 22,048 after 20,559,310,775 : Time 7513.17s : 24825.95 words/s : L.r. 9.7980e-05 [2021-04-06 13:25:24] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 13:25:25] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 13:25:27] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 13:25:34] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-06 13:25:34] [valid] Ep. 2 : Up. 150000 : perplexity : 1.88199 : new best [2021-04-06 15:30:48] Ep. 2 : Up. 160000 : Sen. 66,694,488 : Cost 0.35188574 * 1,373,447,363 @ 25,688 after 21,932,758,138 : Time 7524.55s : 24855.92 words/s : L.r. 9.4868e-05 [2021-04-06 15:30:48] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 15:30:50] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 15:30:54] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 15:31:00] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-06 15:31:01] [valid] Ep. 2 : Up. 160000 : perplexity : 1.88169 : new best [2021-04-06 17:35:57] Ep. 2 : Up. 
170000 : Sen. 76,012,754 : Cost 0.35108033 * 1,371,155,929 @ 29,333 after 23,303,914,067 : Time 7509.10s : 24830.02 words/s : L.r. 9.2036e-05 [2021-04-06 17:35:58] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 17:35:59] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 17:36:01] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 17:36:07] [valid] Ep. 2 : Up. 170000 : perplexity : 1.88365 : stalled 1 times (last best: 1.88169) [2021-04-06 19:03:47] Seen 82557593 samples [2021-04-06 19:03:47] Starting data epoch 3 in logical epoch 3 [2021-04-06 19:03:47] [sqlite] Selecting shuffled data [2021-04-06 19:42:45] Ep. 3 : Up. 180000 : Sen. 2,814,540 : Cost 0.35097939 * 1,375,362,860 @ 10,818 after 24,679,276,927 : Time 7607.77s : 24604.31 words/s : L.r. 8.9443e-05 [2021-04-06 19:42:45] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 19:42:47] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 19:42:49] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 19:42:55] [valid] Ep. 3 : Up. 180000 : perplexity : 1.88647 : stalled 2 times (last best: 1.88169) [2021-04-06 21:48:26] Ep. 3 : Up. 190000 : Sen. 12,176,755 : Cost 0.35023901 * 1,375,043,228 @ 33,064 after 26,054,320,155 : Time 7540.19s : 24811.46 words/s : L.r. 
8.7057e-05 [2021-04-06 21:48:26] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 21:48:28] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 21:48:30] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 21:48:36] [valid] Ep. 3 : Up. 190000 : perplexity : 1.88655 : stalled 3 times (last best: 1.88169) [2021-04-06 23:53:21] Ep. 3 : Up. 200000 : Sen. 21,464,672 : Cost 0.34930992 * 1,370,111,171 @ 14,399 after 27,424,431,326 : Time 7495.03s : 24804.91 words/s : L.r. 8.4853e-05 [2021-04-06 23:53:21] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-06 23:53:23] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-06 23:53:25] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-06 23:53:32] [valid] Ep. 3 : Up. 200000 : perplexity : 1.88541 : stalled 4 times (last best: 1.88169) [2021-04-07 01:58:47] Ep. 3 : Up. 210000 : Sen. 30,804,749 : Cost 0.35023621 * 1,372,880,861 @ 26,832 after 28,797,312,187 : Time 7525.66s : 24833.85 words/s : L.r. 
8.2808e-05 [2021-04-07 01:58:47] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 01:58:49] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 01:58:51] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 01:58:58] [valid] Ep. 3 : Up. 210000 : perplexity : 1.88435 : stalled 5 times (last best: 1.88169) [2021-04-07 04:03:47] Ep. 3 : Up. 220000 : Sen. 40,113,921 : Cost 0.35043707 * 1,366,467,356 @ 21,052 after 30,163,779,543 : Time 7499.39s : 24819.44 words/s : L.r. 8.0904e-05 [2021-04-07 04:03:47] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 04:03:48] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 04:03:50] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 04:03:56] [valid] Ep. 3 : Up. 220000 : perplexity : 1.8836 : stalled 6 times (last best: 1.88169) [2021-04-07 06:09:16] Ep. 3 : Up. 230000 : Sen. 49,465,414 : Cost 0.35020769 * 1,374,048,660 @ 14,109 after 31,537,828,203 : Time 7528.88s : 24846.31 words/s : L.r. 
7.9126e-05 [2021-04-07 06:09:16] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 06:09:17] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 06:09:19] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 06:09:25] [valid] Ep. 3 : Up. 230000 : perplexity : 1.88213 : stalled 7 times (last best: 1.88169) [2021-04-07 08:14:28] Ep. 3 : Up. 240000 : Sen. 58,791,228 : Cost 0.34957245 * 1,372,260,912 @ 32,543 after 32,910,089,115 : Time 7511.82s : 24839.05 words/s : L.r. 7.7460e-05 [2021-04-07 08:14:28] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 08:14:30] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 08:14:33] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 08:14:39] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-07 08:14:40] [valid] Ep. 3 : Up. 240000 : perplexity : 1.88113 : new best [2021-04-07 10:19:44] Ep. 3 : Up. 250000 : Sen. 68,125,622 : Cost 0.34948787 * 1,372,332,503 @ 15,848 after 34,282,421,618 : Time 7515.61s : 24837.95 words/s : L.r. 
7.5895e-05 [2021-04-07 10:19:44] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 10:19:46] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 10:19:47] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 10:19:53] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-07 10:19:54] [valid] Ep. 3 : Up. 250000 : perplexity : 1.87992 : new best [2021-04-07 12:25:07] Ep. 3 : Up. 260000 : Sen. 77,475,712 : Cost 0.35004127 * 1,371,624,063 @ 14,363 after 35,654,045,681 : Time 7523.06s : 24857.20 words/s : L.r. 7.4421e-05 [2021-04-07 12:25:07] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 12:25:09] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 12:25:11] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 12:25:18] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-07 12:25:19] [valid] Ep. 3 : Up. 
260000 : perplexity : 1.87875 : new best [2021-04-07 13:33:33] Seen 82557593 samples [2021-04-07 13:33:33] Starting data epoch 4 in logical epoch 4 [2021-04-07 13:33:33] [sqlite] Selecting shuffled data [2021-04-07 14:31:33] Ep. 4 : Up. 270000 : Sen. 4,225,217 : Cost 0.34860352 * 1,370,842,425 @ 14,045 after 37,024,888,106 : Time 7585.67s : 24556.76 words/s : L.r. 7.3030e-05 [2021-04-07 14:31:33] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 14:31:35] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 14:31:36] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 14:31:43] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-07 14:31:43] [valid] Ep. 4 : Up. 270000 : perplexity : 1.87816 : new best [2021-04-07 16:36:34] Ep. 4 : Up. 280000 : Sen. 13,545,285 : Cost 0.34874994 * 1,368,752,594 @ 14,516 after 38,393,640,700 : Time 7501.49s : 24837.93 words/s : L.r. 7.1714e-05 [2021-04-07 16:36:34] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 16:36:36] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 16:36:38] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 16:36:44] [valid] Ep. 4 : Up. 
280000 : perplexity : 1.87835 : stalled 1 times (last best: 1.87816) [2021-04-07 18:41:15] Ep. 4 : Up. 290000 : Sen. 22,830,793 : Cost 0.34806466 * 1,367,458,464 @ 24,842 after 39,761,099,164 : Time 7480.51s : 24824.17 words/s : L.r. 7.0466e-05 [2021-04-07 18:41:15] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 18:41:17] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 18:41:19] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 18:41:25] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-07 18:41:26] [valid] Ep. 4 : Up. 290000 : perplexity : 1.87794 : new best [2021-04-07 20:46:36] Ep. 4 : Up. 300000 : Sen. 32,173,629 : Cost 0.34932822 * 1,371,051,040 @ 12,372 after 41,132,150,204 : Time 7521.32s : 24843.56 words/s : L.r. 
6.9282e-05 [2021-04-07 20:46:36] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 20:46:38] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 20:46:40] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 20:46:46] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-07 20:46:47] [valid] Ep. 4 : Up. 300000 : perplexity : 1.87728 : new best [2021-04-07 22:51:57] Ep. 4 : Up. 310000 : Sen. 41,508,040 : Cost 0.34873691 * 1,372,658,704 @ 14,494 after 42,504,808,908 : Time 7521.09s : 24841.64 words/s : L.r. 6.8155e-05 [2021-04-07 22:51:58] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-07 22:51:59] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-07 22:52:01] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-07 22:52:07] [valid] Ep. 4 : Up. 310000 : perplexity : 1.8775 : stalled 1 times (last best: 1.87728) [2021-04-08 00:57:33] Ep. 4 : Up. 320000 : Sen. 50,866,032 : Cost 0.34785625 * 1,378,815,762 @ 15,257 after 43,883,624,670 : Time 7535.45s : 24840.95 words/s : L.r. 
6.7082e-05 [2021-04-08 00:57:33] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 00:57:35] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 00:57:36] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 00:57:43] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-08 00:57:43] [valid] Ep. 4 : Up. 320000 : perplexity : 1.87683 : new best [2021-04-08 03:03:02] Ep. 4 : Up. 330000 : Sen. 60,219,058 : Cost 0.34975505 * 1,369,574,092 @ 23,525 after 45,253,198,762 : Time 7529.29s : 24843.00 words/s : L.r. 6.6058e-05 [2021-04-08 03:03:03] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 03:03:04] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 03:03:06] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 03:03:13] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-08 03:03:13] [valid] Ep. 4 : Up. 330000 : perplexity : 1.8766 : new best [2021-04-08 05:08:17] Ep. 4 : Up. 340000 : Sen. 
69,550,500 : Cost 0.34815925 * 1,372,940,292 @ 34,122 after 46,626,139,054 : Time 7514.69s : 24826.53 words/s : L.r. 6.5079e-05 [2021-04-08 05:08:17] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 05:08:19] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 05:08:20] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 05:08:27] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-08 05:08:27] [valid] Ep. 4 : Up. 340000 : perplexity : 1.87584 : new best [2021-04-08 07:13:31] Ep. 4 : Up. 350000 : Sen. 78,890,408 : Cost 0.34880802 * 1,371,443,933 @ 10,448 after 47,997,582,987 : Time 7513.37s : 24870.96 words/s : L.r. 6.4143e-05 [2021-04-08 07:13:31] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 07:13:32] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 07:13:34] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 07:13:40] [valid] Ep. 4 : Up. 350000 : perplexity : 1.87712 : stalled 1 times (last best: 1.87584) [2021-04-08 08:02:53] Seen 82557593 samples [2021-04-08 08:02:53] Starting data epoch 5 in logical epoch 5 [2021-04-08 08:02:53] [sqlite] Selecting shuffled data [2021-04-08 09:20:28] Ep. 5 : Up. 360000 : Sen. 
5,694,668 : Cost 0.34845334 * 1,373,926,000 @ 10,524 after 49,371,508,987 : Time 7617.07s : 24585.40 words/s : L.r. 6.3246e-05 [2021-04-08 09:20:28] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 09:20:30] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 09:20:31] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 09:20:38] [valid] Ep. 5 : Up. 360000 : perplexity : 1.87757 : stalled 2 times (last best: 1.87584) [2021-04-08 11:25:32] Ep. 5 : Up. 370000 : Sen. 15,006,980 : Cost 0.34731025 * 1,371,236,911 @ 13,139 after 50,742,745,898 : Time 7504.53s : 24823.58 words/s : L.r. 6.2385e-05 [2021-04-08 11:25:32] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 11:25:34] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 11:25:36] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 11:25:42] [valid] Ep. 5 : Up. 370000 : perplexity : 1.87697 : stalled 3 times (last best: 1.87584) [2021-04-08 13:31:15] Ep. 5 : Up. 380000 : Sen. 24,379,228 : Cost 0.34786743 * 1,377,501,860 @ 19,907 after 52,120,247,758 : Time 7542.15s : 24853.78 words/s : L.r. 
6.1559e-05 [2021-04-08 13:31:15] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 13:31:16] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 13:31:18] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 13:31:25] [valid] Ep. 5 : Up. 380000 : perplexity : 1.87656 : stalled 4 times (last best: 1.87584) [2021-04-08 15:36:30] Ep. 5 : Up. 390000 : Sen. 33,712,176 : Cost 0.34753197 * 1,372,888,001 @ 25,216 after 53,493,135,759 : Time 7515.68s : 24830.32 words/s : L.r. 6.0764e-05 [2021-04-08 15:36:30] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 15:36:32] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 15:36:34] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 15:36:41] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-08 15:36:41] [valid] Ep. 5 : Up. 390000 : perplexity : 1.87554 : new best [2021-04-08 17:42:06] Ep. 5 : Up. 400000 : Sen. 43,056,935 : Cost 0.34800106 * 1,372,292,179 @ 13,184 after 54,865,427,938 : Time 7535.12s : 24797.89 words/s : L.r. 
6.0000e-05 [2021-04-08 17:42:06] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 17:42:08] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 17:42:09] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 17:42:16] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-08 17:42:17] [valid] Ep. 5 : Up. 400000 : perplexity : 1.87514 : new best [2021-04-08 19:47:28] Ep. 5 : Up. 410000 : Sen. 52,397,288 : Cost 0.34771517 * 1,374,756,683 @ 23,329 after 56,240,184,621 : Time 7522.66s : 24857.22 words/s : L.r. 5.9264e-05 [2021-04-08 19:47:29] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 19:47:30] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 19:47:32] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 19:47:39] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-08 19:47:39] [valid] Ep. 5 : Up. 410000 : perplexity : 1.87363 : new best [2021-04-08 21:52:42] Ep. 5 : Up. 420000 : Sen. 
61,720,116 : Cost 0.34788588 * 1,370,093,349 @ 38,811 after 57,610,277,970 : Time 7513.78s : 24818.43 words/s : L.r. 5.8554e-05 [2021-04-08 21:52:42] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-08 21:52:44] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-08 21:52:46] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-08 21:52:52] [valid] Ep. 5 : Up. 420000 : perplexity : 1.87405 : stalled 1 times (last best: 1.87363) [2021-04-10 19:07:08] [marian] Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-10 19:07:08] [marian] Running on r04g07.bullx as process 128553 with command line: [2021-04-10 19:07:08] [marian] /projappl/project_2001194/marian/build/marian --guided-alignment /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz --early-stopping 15 --valid-freq 10000 --valid-sets /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k --valid-metrics perplexity --valid-mini-batch 16 --valid-log /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log --beam-size 12 --normalize 1 --allow-unk --overwrite --keep-best --model /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz --type transformer --train-sets /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz --max-length 500 --vocabs 
/users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml --mini-batch-fit -w 24000 --maxi-batch 500 --save-freq 10000 --disp-freq 10000 --log /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log --enc-depth 6 --dec-depth 6 --transformer-heads 8 --transformer-postprocess-emb d --transformer-postprocess dan --transformer-dropout 0.1 --label-smoothing 0.1 --learn-rate 0.0003 --lr-warmup 16000 --lr-decay-inv-sqrt 16000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --fp16 --tied-embeddings-all --devices 0 1 2 3 --sync-sgd --seed 1111 --sqlite --tempdir /run/nvme/job_5481176/data --exponential-smoothing [2021-04-10 19:07:11] [config] after: 0e [2021-04-10 19:07:11] [config] after-batches: 0 [2021-04-10 19:07:11] [config] after-epochs: 0 [2021-04-10 19:07:11] [config] all-caps-every: 0 [2021-04-10 19:07:11] [config] allow-unk: true [2021-04-10 19:07:11] [config] authors: false [2021-04-10 19:07:11] [config] beam-size: 12 [2021-04-10 19:07:11] [config] bert-class-symbol: "[CLS]" [2021-04-10 19:07:11] [config] bert-mask-symbol: "[MASK]" [2021-04-10 19:07:11] [config] bert-masking-fraction: 0.15 [2021-04-10 19:07:11] [config] bert-sep-symbol: "[SEP]" [2021-04-10 19:07:11] [config] bert-train-type-embeddings: true [2021-04-10 19:07:11] [config] bert-type-vocab-size: 2 [2021-04-10 19:07:11] [config] build-info: "" [2021-04-10 19:07:11] [config] cite: false [2021-04-10 19:07:11] [config] clip-norm: 5 [2021-04-10 19:07:11] [config] cost-scaling: [2021-04-10 19:07:11] [config] - 7 [2021-04-10 19:07:11] [config] - 2000 [2021-04-10 19:07:11] [config] - 2 [2021-04-10 19:07:11] [config] - 0.05 [2021-04-10 19:07:11] [config] - 10 [2021-04-10 19:07:11] [config] - 1 [2021-04-10 19:07:11] [config] cost-type: ce-sum [2021-04-10 19:07:11] [config] cpu-threads: 0 [2021-04-10 19:07:11] [config] 
data-weighting: "" [2021-04-10 19:07:11] [config] data-weighting-type: sentence [2021-04-10 19:07:11] [config] dec-cell: gru [2021-04-10 19:07:11] [config] dec-cell-base-depth: 2 [2021-04-10 19:07:11] [config] dec-cell-high-depth: 1 [2021-04-10 19:07:11] [config] dec-depth: 6 [2021-04-10 19:07:11] [config] devices: [2021-04-10 19:07:11] [config] - 0 [2021-04-10 19:07:11] [config] - 1 [2021-04-10 19:07:11] [config] - 2 [2021-04-10 19:07:11] [config] - 3 [2021-04-10 19:07:11] [config] dim-emb: 512 [2021-04-10 19:07:11] [config] dim-rnn: 1024 [2021-04-10 19:07:11] [config] dim-vocabs: [2021-04-10 19:07:11] [config] - 56521 [2021-04-10 19:07:11] [config] - 56521 [2021-04-10 19:07:11] [config] disp-first: 0 [2021-04-10 19:07:11] [config] disp-freq: 10000 [2021-04-10 19:07:11] [config] disp-label-counts: true [2021-04-10 19:07:11] [config] dropout-rnn: 0 [2021-04-10 19:07:11] [config] dropout-src: 0 [2021-04-10 19:07:11] [config] dropout-trg: 0 [2021-04-10 19:07:11] [config] dump-config: "" [2021-04-10 19:07:11] [config] early-stopping: 15 [2021-04-10 19:07:11] [config] embedding-fix-src: false [2021-04-10 19:07:11] [config] embedding-fix-trg: false [2021-04-10 19:07:11] [config] embedding-normalization: false [2021-04-10 19:07:11] [config] embedding-vectors: [2021-04-10 19:07:11] [config] [] [2021-04-10 19:07:11] [config] enc-cell: gru [2021-04-10 19:07:11] [config] enc-cell-depth: 1 [2021-04-10 19:07:11] [config] enc-depth: 6 [2021-04-10 19:07:11] [config] enc-type: bidirectional [2021-04-10 19:07:11] [config] english-title-case-every: 0 [2021-04-10 19:07:11] [config] exponential-smoothing: 0.0001 [2021-04-10 19:07:11] [config] factor-weight: 1 [2021-04-10 19:07:11] [config] grad-dropping-momentum: 0 [2021-04-10 19:07:11] [config] grad-dropping-rate: 0 [2021-04-10 19:07:11] [config] grad-dropping-warmup: 100 [2021-04-10 19:07:11] [config] gradient-checkpointing: false [2021-04-10 19:07:11] [config] guided-alignment: 
/users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz [2021-04-10 19:07:11] [config] guided-alignment-cost: mse [2021-04-10 19:07:11] [config] guided-alignment-weight: 0.1 [2021-04-10 19:07:11] [config] ignore-model-config: false [2021-04-10 19:07:11] [config] input-types: [2021-04-10 19:07:11] [config] [] [2021-04-10 19:07:11] [config] interpolate-env-vars: false [2021-04-10 19:07:11] [config] keep-best: true [2021-04-10 19:07:11] [config] label-smoothing: 0.1 [2021-04-10 19:07:11] [config] layer-normalization: false [2021-04-10 19:07:11] [config] learn-rate: 0.0003 [2021-04-10 19:07:11] [config] lemma-dim-emb: 0 [2021-04-10 19:07:11] [config] log: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.train1.log [2021-04-10 19:07:11] [config] log-level: info [2021-04-10 19:07:11] [config] log-time-zone: "" [2021-04-10 19:07:11] [config] logical-epoch: [2021-04-10 19:07:11] [config] - 1e [2021-04-10 19:07:11] [config] - 0 [2021-04-10 19:07:11] [config] lr-decay: 0 [2021-04-10 19:07:11] [config] lr-decay-freq: 50000 [2021-04-10 19:07:11] [config] lr-decay-inv-sqrt: [2021-04-10 19:07:11] [config] - 16000 [2021-04-10 19:07:11] [config] lr-decay-repeat-warmup: false [2021-04-10 19:07:11] [config] lr-decay-reset-optimizer: false [2021-04-10 19:07:11] [config] lr-decay-start: [2021-04-10 19:07:11] [config] - 10 [2021-04-10 19:07:11] [config] - 1 [2021-04-10 19:07:11] [config] lr-decay-strategy: epoch+stalled [2021-04-10 19:07:11] [config] lr-report: true [2021-04-10 19:07:11] [config] lr-warmup: 16000 [2021-04-10 19:07:11] [config] lr-warmup-at-reload: false [2021-04-10 19:07:11] [config] lr-warmup-cycle: false [2021-04-10 19:07:11] [config] lr-warmup-start-rate: 0 [2021-04-10 19:07:11] [config] max-length: 500 [2021-04-10 19:07:11] [config] max-length-crop: false [2021-04-10 19:07:11] [config] max-length-factor: 3 [2021-04-10 19:07:11] [config] maxi-batch: 500 [2021-04-10 
19:07:11] [config] maxi-batch-sort: trg [2021-04-10 19:07:11] [config] mini-batch: 64 [2021-04-10 19:07:11] [config] mini-batch-fit: true [2021-04-10 19:07:11] [config] mini-batch-fit-step: 10 [2021-04-10 19:07:11] [config] mini-batch-track-lr: false [2021-04-10 19:07:11] [config] mini-batch-warmup: 0 [2021-04-10 19:07:11] [config] mini-batch-words: 0 [2021-04-10 19:07:11] [config] mini-batch-words-ref: 0 [2021-04-10 19:07:11] [config] model: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-10 19:07:11] [config] multi-loss-type: sum [2021-04-10 19:07:11] [config] multi-node: false [2021-04-10 19:07:11] [config] multi-node-overlap: true [2021-04-10 19:07:11] [config] n-best: false [2021-04-10 19:07:11] [config] no-nccl: false [2021-04-10 19:07:11] [config] no-reload: false [2021-04-10 19:07:11] [config] no-restore-corpus: false [2021-04-10 19:07:11] [config] normalize: 1 [2021-04-10 19:07:11] [config] normalize-gradient: false [2021-04-10 19:07:11] [config] num-devices: 0 [2021-04-10 19:07:11] [config] optimizer: adam [2021-04-10 19:07:11] [config] optimizer-delay: 1 [2021-04-10 19:07:11] [config] optimizer-params: [2021-04-10 19:07:11] [config] - 0.9 [2021-04-10 19:07:11] [config] - 0.98 [2021-04-10 19:07:11] [config] - 1e-09 [2021-04-10 19:07:11] [config] output-omit-bias: false [2021-04-10 19:07:11] [config] overwrite: true [2021-04-10 19:07:11] [config] precision: [2021-04-10 19:07:11] [config] - float16 [2021-04-10 19:07:11] [config] - float32 [2021-04-10 19:07:11] [config] - float32 [2021-04-10 19:07:11] [config] pretrained-model: "" [2021-04-10 19:07:11] [config] quantize-biases: false [2021-04-10 19:07:11] [config] quantize-bits: 0 [2021-04-10 19:07:11] [config] quantize-log-based: false [2021-04-10 19:07:11] [config] quantize-optimization-steps: 0 [2021-04-10 19:07:11] [config] quiet: false [2021-04-10 19:07:11] [config] quiet-translation: false [2021-04-10 19:07:11] [config] 
relative-paths: false [2021-04-10 19:07:11] [config] right-left: false [2021-04-10 19:07:11] [config] save-freq: 10000 [2021-04-10 19:07:11] [config] seed: 1111 [2021-04-10 19:07:11] [config] sentencepiece-alphas: [2021-04-10 19:07:11] [config] [] [2021-04-10 19:07:11] [config] sentencepiece-max-lines: 2000000 [2021-04-10 19:07:11] [config] sentencepiece-options: "" [2021-04-10 19:07:11] [config] shuffle: data [2021-04-10 19:07:11] [config] shuffle-in-ram: false [2021-04-10 19:07:11] [config] sigterm: save-and-exit [2021-04-10 19:07:11] [config] skip: false [2021-04-10 19:07:11] [config] sqlite: temporary [2021-04-10 19:07:11] [config] sqlite-drop: false [2021-04-10 19:07:11] [config] sync-sgd: true [2021-04-10 19:07:11] [config] tempdir: /run/nvme/job_5481176/data [2021-04-10 19:07:11] [config] tied-embeddings: false [2021-04-10 19:07:11] [config] tied-embeddings-all: true [2021-04-10 19:07:11] [config] tied-embeddings-src: false [2021-04-10 19:07:11] [config] train-embedder-rank: [2021-04-10 19:07:11] [config] [] [2021-04-10 19:07:11] [config] train-sets: [2021-04-10 19:07:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.src.clean.spm32k.gz [2021-04-10 19:07:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.trg.clean.spm32k.gz [2021-04-10 19:07:11] [config] transformer-aan-activation: swish [2021-04-10 19:07:11] [config] transformer-aan-depth: 2 [2021-04-10 19:07:11] [config] transformer-aan-nogate: false [2021-04-10 19:07:11] [config] transformer-decoder-autoreg: self-attention [2021-04-10 19:07:11] [config] transformer-depth-scaling: false [2021-04-10 19:07:11] [config] transformer-dim-aan: 2048 [2021-04-10 19:07:11] [config] transformer-dim-ffn: 2048 [2021-04-10 19:07:11] [config] transformer-dropout: 0.1 [2021-04-10 19:07:11] [config] transformer-dropout-attention: 0 [2021-04-10 19:07:11] [config] transformer-dropout-ffn: 0 [2021-04-10 19:07:11] [config] transformer-ffn-activation: 
swish [2021-04-10 19:07:11] [config] transformer-ffn-depth: 2 [2021-04-10 19:07:11] [config] transformer-guided-alignment-layer: last [2021-04-10 19:07:11] [config] transformer-heads: 8 [2021-04-10 19:07:11] [config] transformer-no-projection: false [2021-04-10 19:07:11] [config] transformer-pool: false [2021-04-10 19:07:11] [config] transformer-postprocess: dan [2021-04-10 19:07:11] [config] transformer-postprocess-emb: d [2021-04-10 19:07:11] [config] transformer-postprocess-top: "" [2021-04-10 19:07:11] [config] transformer-preprocess: "" [2021-04-10 19:07:11] [config] transformer-tied-layers: [2021-04-10 19:07:11] [config] [] [2021-04-10 19:07:11] [config] transformer-train-position-embeddings: false [2021-04-10 19:07:11] [config] tsv: false [2021-04-10 19:07:11] [config] tsv-fields: 0 [2021-04-10 19:07:11] [config] type: transformer [2021-04-10 19:07:11] [config] ulr: false [2021-04-10 19:07:11] [config] ulr-dim-emb: 0 [2021-04-10 19:07:11] [config] ulr-dropout: 0 [2021-04-10 19:07:11] [config] ulr-keys-vectors: "" [2021-04-10 19:07:11] [config] ulr-query-vectors: "" [2021-04-10 19:07:11] [config] ulr-softmax-temperature: 1 [2021-04-10 19:07:11] [config] ulr-trainable-transformation: false [2021-04-10 19:07:11] [config] unlikelihood-loss: false [2021-04-10 19:07:11] [config] valid-freq: 10000 [2021-04-10 19:07:11] [config] valid-log: /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.valid1.log [2021-04-10 19:07:11] [config] valid-max-length: 1000 [2021-04-10 19:07:11] [config] valid-metrics: [2021-04-10 19:07:11] [config] - perplexity [2021-04-10 19:07:11] [config] valid-mini-batch: 16 [2021-04-10 19:07:11] [config] valid-reset-stalled: false [2021-04-10 19:07:11] [config] valid-script-args: [2021-04-10 19:07:11] [config] [] [2021-04-10 19:07:11] [config] valid-script-path: "" [2021-04-10 19:07:11] [config] valid-sets: [2021-04-10 19:07:11] [config] - 
/users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.src.spm32k [2021-04-10 19:07:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/val/Tatoeba-dev.trg.spm32k [2021-04-10 19:07:11] [config] valid-translation-output: "" [2021-04-10 19:07:11] [config] version: v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-10 19:07:11] [config] vocabs: [2021-04-10 19:07:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-10 19:07:11] [config] - /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-10 19:07:11] [config] word-penalty: 0 [2021-04-10 19:07:11] [config] word-scores: false [2021-04-10 19:07:11] [config] workspace: 24000 [2021-04-10 19:07:11] [config] Loaded model has been created with Marian v1.10.0 6f6d484 2021-02-06 15:35:16 -0800 [2021-04-10 19:07:11] Using synchronous SGD [2021-04-10 19:07:11] [data] Loading vocabulary from JSON/Yaml file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-10 19:07:12] [data] Setting vocabulary size for input 0 to 56,521 [2021-04-10 19:07:12] [data] Loading vocabulary from JSON/Yaml file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.vocab.yml [2021-04-10 19:07:12] [data] Setting vocabulary size for input 1 to 56,521 [2021-04-10 19:07:12] [data] Using word alignments from file /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/train/opus+bt.spm32k-spm32k.src-trg.alg.gz [2021-04-10 19:07:12] [sqlite] Creating temporary database in /run/nvme/job_5481176/data [2021-04-10 19:07:15] [sqlite] Inserted 1000000 lines [2021-04-10 19:07:18] [sqlite] Inserted 2000000 lines [2021-04-10 19:07:24] [sqlite] Inserted 4000000 lines [2021-04-10 19:07:36] [sqlite] Inserted 8000000 lines [2021-04-10 19:08:00] [sqlite] Inserted 16000000 lines [2021-04-10 19:08:47] [sqlite] Inserted 32000000 lines 
[2021-04-10 19:10:03] [sqlite] Inserted 64000000 lines [2021-04-10 19:10:55] [sqlite] Inserted 82557593 lines [2021-04-10 19:10:55] [sqlite] Creating primary index [2021-04-10 19:11:33] [comm] Compiled without MPI support. Running as a single process on r04g07.bullx [2021-04-10 19:11:33] [batching] Collecting statistics for batch fitting with step size 10 [2021-04-10 19:11:44] [memory] Extending reserved space to 24064 MB (device gpu0) [2021-04-10 19:11:45] [memory] Extending reserved space to 24064 MB (device gpu1) [2021-04-10 19:11:46] [memory] Extending reserved space to 24064 MB (device gpu2) [2021-04-10 19:11:46] [memory] Extending reserved space to 24064 MB (device gpu3) [2021-04-10 19:11:46] [comm] Using NCCL 2.8.3 for GPU communication [2021-04-10 19:11:48] [comm] NCCLCommunicator constructed successfully [2021-04-10 19:11:48] [training] Using 4 GPUs [2021-04-10 19:11:48] [logits] Applying loss function for 1 factor(s) [2021-04-10 19:11:48] [memory] Reserving 278 MB, device gpu0 [2021-04-10 19:11:48] [gpu] 16-bit TensorCores enabled for float32 matrix operations [2021-04-10 19:11:50] [memory] Reserving 278 MB, device gpu0 [2021-04-10 19:14:43] [batching] Done. 
Typical MB size is 58,832 target words [2021-04-10 19:14:43] [memory] Extending reserved space to 24064 MB (device gpu0) [2021-04-10 19:14:44] [memory] Extending reserved space to 24064 MB (device gpu1) [2021-04-10 19:14:44] [memory] Extending reserved space to 24064 MB (device gpu2) [2021-04-10 19:14:44] [memory] Extending reserved space to 24064 MB (device gpu3) [2021-04-10 19:14:44] [comm] Using NCCL 2.8.3 for GPU communication [2021-04-10 19:14:45] [comm] NCCLCommunicator constructed successfully [2021-04-10 19:14:45] [training] Using 4 GPUs [2021-04-10 19:14:45] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-10 19:14:46] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-10 19:14:46] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-10 19:14:47] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-10 19:14:47] Loading Adam parameters from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-10 19:14:49] [memory] Reserving 139 MB, device gpu0 [2021-04-10 19:14:49] [memory] Reserving 139 MB, device gpu1 [2021-04-10 19:14:49] [memory] Reserving 139 MB, device gpu2 [2021-04-10 19:14:49] [memory] Reserving 139 MB, device gpu3 [2021-04-10 19:14:49] [training] Model reloaded from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-10 19:14:49] [data] Restoring the corpus state to epoch 5, batch 420000 [2021-04-10 19:16:39] [sqlite] Selecting shuffled data [2021-04-10 19:39:13] Training started [2021-04-10 19:39:13] [training] Batches are 
processed as 1 process(es) x 4 devices/process [2021-04-10 19:39:13] [memory] Reserving 278 MB, device gpu0 [2021-04-10 19:39:13] [memory] Reserving 278 MB, device gpu1 [2021-04-10 19:39:13] [memory] Reserving 278 MB, device gpu3 [2021-04-10 19:39:13] [memory] Reserving 278 MB, device gpu2 [2021-04-10 19:39:13] [memory] Reserving 278 MB, device gpu0 [2021-04-10 19:39:14] [memory] Reserving 278 MB, device gpu1 [2021-04-10 19:39:14] [memory] Reserving 278 MB, device gpu3 [2021-04-10 19:39:14] [memory] Reserving 278 MB, device gpu2 [2021-04-10 19:39:14] Loading model from /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-10 19:39:16] [memory] Reserving 278 MB, device cpu0 [2021-04-10 19:39:16] [memory] Reserving 69 MB, device gpu0 [2021-04-10 19:39:16] [memory] Reserving 69 MB, device gpu1 [2021-04-10 19:39:16] [memory] Reserving 69 MB, device gpu2 [2021-04-10 19:39:16] [memory] Reserving 69 MB, device gpu3 [2021-04-10 21:43:57] Ep. 5 : Up. 430000 : Sen. 71,033,988 : Cost 0.34751657 * 1,369,940,309 @ 33,792 after 58,980,218,279 : Time 8953.08s : 20805.39 words/s : L.r. 5.7869e-05 [2021-04-10 21:43:57] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-10 21:43:58] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-10 21:44:00] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-10 21:44:07] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-10 21:44:08] [valid] Ep. 5 : Up. 
430000 : perplexity : 1.8733 : new best [2021-04-10 23:48:43] Ep. 5 : Up. 440000 : Sen. 80,329,884 : Cost 0.34763905 * 1,367,525,118 @ 14,514 after 60,347,743,397 : Time 7485.95s : 24840.47 words/s : L.r. 5.7208e-05 [2021-04-10 23:48:43] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-10 23:48:44] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-10 23:48:46] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-10 23:48:52] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-10 23:48:53] [valid] Ep. 5 : Up. 440000 : perplexity : 1.87317 : new best [2021-04-11 00:18:49] Seen 82557593 samples [2021-04-11 00:18:49] Starting data epoch 6 in logical epoch 6 [2021-04-11 00:18:49] [sqlite] Selecting shuffled data [2021-04-11 01:55:44] Ep. 6 : Up. 450000 : Sen. 7,139,228 : Cost 0.34739068 * 1,376,791,898 @ 34,536 after 61,724,535,295 : Time 7621.80s : 24591.22 words/s : L.r. 5.6569e-05 [2021-04-11 01:55:45] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 01:55:46] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 01:55:48] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 01:55:54] [valid] Ep. 6 : Up. 
450000 : perplexity : 1.87329 : stalled 1 times (last best: 1.87317) [2021-04-11 04:00:53] Ep. 6 : Up. 460000 : Sen. 16,482,993 : Cost 0.34798861 * 1,370,699,894 @ 15,333 after 63,095,235,189 : Time 7508.53s : 24890.98 words/s : L.r. 5.5950e-05 [2021-04-11 04:00:53] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 04:00:55] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 04:00:56] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 04:01:02] [valid] Ep. 6 : Up. 460000 : perplexity : 1.87395 : stalled 2 times (last best: 1.87317) [2021-04-11 06:06:00] Ep. 6 : Up. 470000 : Sen. 25,799,450 : Cost 0.34730944 * 1,370,411,585 @ 17,977 after 64,465,646,774 : Time 7506.73s : 24836.31 words/s : L.r. 5.5352e-05 [2021-04-11 06:06:00] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 06:06:01] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 06:06:03] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 06:06:09] [valid] Ep. 6 : Up. 470000 : perplexity : 1.87372 : stalled 3 times (last best: 1.87317) [2021-04-11 08:11:23] Ep. 6 : Up. 480000 : Sen. 35,150,435 : Cost 0.34758595 * 1,373,420,684 @ 26,957 after 65,839,067,458 : Time 7522.66s : 24854.42 words/s : L.r. 
5.4772e-05 [2021-04-11 08:11:23] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 08:11:24] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 08:11:26] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 08:11:32] [valid] Ep. 6 : Up. 480000 : perplexity : 1.87407 : stalled 4 times (last best: 1.87317) [2021-04-11 10:16:32] Ep. 6 : Up. 490000 : Sen. 44,485,840 : Cost 0.34771168 * 1,370,478,187 @ 659 after 67,209,545,645 : Time 7508.89s : 24857.51 words/s : L.r. 5.4210e-05 [2021-04-11 10:16:32] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 10:16:33] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 10:16:35] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 10:16:41] [valid] Ep. 6 : Up. 490000 : perplexity : 1.87471 : stalled 5 times (last best: 1.87317) [2021-04-11 12:21:46] Ep. 6 : Up. 500000 : Sen. 53,822,456 : Cost 0.34730166 * 1,373,514,903 @ 13,124 after 68,583,060,548 : Time 7514.50s : 24852.71 words/s : L.r. 
5.3666e-05 [2021-04-11 12:21:46] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 12:21:48] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 12:21:50] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 12:21:56] [valid] Ep. 6 : Up. 500000 : perplexity : 1.87397 : stalled 6 times (last best: 1.87317) [2021-04-11 14:27:16] Ep. 6 : Up. 510000 : Sen. 63,169,208 : Cost 0.34725401 * 1,375,161,291 @ 9,664 after 69,958,221,839 : Time 7529.19s : 24833.39 words/s : L.r. 5.3137e-05 [2021-04-11 14:27:16] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 14:27:18] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 14:27:19] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 14:27:26] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-11 14:27:27] [valid] Ep. 6 : Up. 510000 : perplexity : 1.87282 : new best [2021-04-11 16:32:24] Ep. 6 : Up. 520000 : Sen. 72,491,418 : Cost 0.34780228 * 1,369,249,864 @ 17,646 after 71,327,471,703 : Time 7508.23s : 24841.71 words/s : L.r. 
5.2623e-05 [2021-04-11 16:32:24] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 16:32:26] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 16:32:28] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 16:32:34] [valid] Ep. 6 : Up. 520000 : perplexity : 1.8736 : stalled 1 times (last best: 1.87282) [2021-04-11 18:37:25] Ep. 6 : Up. 530000 : Sen. 81,810,410 : Cost 0.34746012 * 1,369,104,626 @ 16,983 after 72,696,576,329 : Time 7500.43s : 24836.37 words/s : L.r. 5.2125e-05 [2021-04-11 18:37:25] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 18:37:26] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 18:37:28] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 18:37:34] [valid] Ep. 6 : Up. 530000 : perplexity : 1.87476 : stalled 2 times (last best: 1.87282) [2021-04-11 18:47:40] Seen 82557593 samples [2021-04-11 18:47:40] Starting data epoch 7 in logical epoch 7 [2021-04-11 18:47:40] [sqlite] Selecting shuffled data [2021-04-11 20:44:13] Ep. 7 : Up. 540000 : Sen. 8,592,300 : Cost 0.34713489 * 1,372,241,563 @ 9,396 after 74,068,817,892 : Time 7607.95s : 24552.82 words/s : L.r. 
5.1640e-05 [2021-04-11 20:44:13] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 20:44:15] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 20:44:16] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 20:44:23] [valid] Ep. 7 : Up. 540000 : perplexity : 1.87514 : stalled 3 times (last best: 1.87282) [2021-04-11 22:49:25] Ep. 7 : Up. 550000 : Sen. 17,925,772 : Cost 0.34683943 * 1,372,409,721 @ 25,831 after 75,441,227,613 : Time 7512.02s : 24850.60 words/s : L.r. 5.1168e-05 [2021-04-11 22:49:25] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-11 22:49:27] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-11 22:49:29] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-11 22:49:35] [valid] Ep. 7 : Up. 550000 : perplexity : 1.87457 : stalled 4 times (last best: 1.87282) [2021-04-12 00:54:25] Ep. 7 : Up. 560000 : Sen. 27,244,379 : Cost 0.34715670 * 1,369,822,708 @ 11,317 after 76,811,050,321 : Time 7500.20s : 24850.03 words/s : L.r. 
5.0709e-05 [2021-04-12 00:54:25] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 00:54:27] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 00:54:29] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 00:54:35] [valid] Ep. 7 : Up. 560000 : perplexity : 1.87461 : stalled 5 times (last best: 1.87282) [2021-04-12 02:59:58] Ep. 7 : Up. 570000 : Sen. 36,604,903 : Cost 0.34729534 * 1,375,143,168 @ 40,535 after 78,186,193,489 : Time 7532.19s : 24852.11 words/s : L.r. 5.0262e-05 [2021-04-12 02:59:58] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 03:00:00] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 03:00:02] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 03:00:08] [valid] Ep. 7 : Up. 570000 : perplexity : 1.87325 : stalled 6 times (last best: 1.87282) [2021-04-12 05:04:58] Ep. 7 : Up. 580000 : Sen. 45,928,249 : Cost 0.34745684 * 1,369,657,670 @ 9,075 after 79,555,851,159 : Time 7499.53s : 24872.55 words/s : L.r. 
4.9827e-05 [2021-04-12 05:04:58] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 05:04:59] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 05:05:01] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 05:05:07] [valid] Ep. 7 : Up. 580000 : perplexity : 1.87299 : stalled 7 times (last best: 1.87282) [2021-04-12 07:10:16] Ep. 7 : Up. 590000 : Sen. 55,278,057 : Cost 0.34726369 * 1,374,581,519 @ 19,910 after 80,930,432,678 : Time 7518.74s : 24876.96 words/s : L.r. 4.9403e-05 [2021-04-12 07:10:16] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 07:10:18] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 07:10:20] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 07:10:26] [valid] Ep. 7 : Up. 590000 : perplexity : 1.87319 : stalled 8 times (last best: 1.87282) [2021-04-12 09:15:17] Ep. 7 : Up. 600000 : Sen. 64,595,114 : Cost 0.34721729 * 1,369,736,691 @ 15,022 after 82,300,169,369 : Time 7500.62s : 24846.06 words/s : L.r. 
4.8990e-05 [2021-04-12 09:15:17] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 09:15:19] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 09:15:21] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 09:15:27] [valid] Ep. 7 : Up. 600000 : perplexity : 1.87386 : stalled 9 times (last best: 1.87282) [2021-04-12 11:20:17] Ep. 7 : Up. 610000 : Sen. 73,907,880 : Cost 0.34705007 * 1,369,871,018 @ 9,601 after 83,670,040,387 : Time 7499.56s : 24841.24 words/s : L.r. 4.8587e-05 [2021-04-12 11:20:17] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 11:20:18] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 11:20:20] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 11:20:27] [valid] Ep. 7 : Up. 610000 : perplexity : 1.873 : stalled 10 times (last best: 1.87282) [2021-04-12 13:16:21] Seen 82557593 samples [2021-04-12 13:16:21] Starting data epoch 8 in logical epoch 8 [2021-04-12 13:16:21] [sqlite] Selecting shuffled data [2021-04-12 13:26:59] Ep. 8 : Up. 620000 : Sen. 688,755 : Cost 0.34734771 * 1,372,190,847 @ 23,673 after 85,042,231,234 : Time 7602.14s : 24572.73 words/s : L.r. 
4.8193e-05 [2021-04-12 13:26:59] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 13:27:01] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 13:27:03] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 13:27:09] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-12 13:27:10] [valid] Ep. 8 : Up. 620000 : perplexity : 1.87256 : new best [2021-04-12 15:31:54] Ep. 8 : Up. 630000 : Sen. 9,996,709 : Cost 0.34698132 * 1,366,794,643 @ 11,902 after 86,409,025,877 : Time 7495.26s : 24840.78 words/s : L.r. 4.7809e-05 [2021-04-12 15:31:54] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 15:31:56] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 15:31:58] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 15:32:04] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-12 15:32:05] [valid] Ep. 8 : Up. 630000 : perplexity : 1.87236 : new best [2021-04-12 17:36:58] Ep. 8 : Up. 640000 : Sen. 
19,310,020 : Cost 0.34669289 * 1,369,084,694 @ 22,478 after 87,778,110,571 : Time 7503.66s : 24820.15 words/s : L.r. 4.7434e-05 [2021-04-12 17:36:58] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 17:37:00] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 17:37:02] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 17:37:08] [valid] Ep. 8 : Up. 640000 : perplexity : 1.87305 : stalled 1 times (last best: 1.87236) [2021-04-12 19:42:40] Ep. 8 : Up. 650000 : Sen. 28,682,021 : Cost 0.34686610 * 1,378,889,785 @ 15,118 after 89,157,000,356 : Time 7541.87s : 24875.77 words/s : L.r. 4.7068e-05 [2021-04-12 19:42:40] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 19:42:43] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 19:42:45] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 19:42:52] [valid] Ep. 8 : Up. 650000 : perplexity : 1.87328 : stalled 2 times (last best: 1.87236) [2021-04-12 21:48:01] Ep. 8 : Up. 660000 : Sen. 38,024,736 : Cost 0.34720036 * 1,372,129,913 @ 22,654 after 90,529,130,269 : Time 7520.53s : 24852.76 words/s : L.r. 
4.6710e-05 [2021-04-12 21:48:01] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 21:48:03] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 21:48:04] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 21:48:11] [valid] Ep. 8 : Up. 660000 : perplexity : 1.8724 : stalled 3 times (last best: 1.87236) [2021-04-12 23:53:07] Ep. 8 : Up. 670000 : Sen. 47,347,731 : Cost 0.34700289 * 1,368,893,477 @ 21,883 after 91,898,023,746 : Time 7506.11s : 24827.42 words/s : L.r. 4.6360e-05 [2021-04-12 23:53:07] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-12 23:53:09] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-12 23:53:11] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-12 23:53:17] [valid] Ep. 8 : Up. 670000 : perplexity : 1.87285 : stalled 4 times (last best: 1.87236) [2021-04-13 01:58:10] Ep. 8 : Up. 680000 : Sen. 56,664,281 : Cost 0.34627724 * 1,372,887,731 @ 32,838 after 93,270,911,477 : Time 7502.42s : 24847.20 words/s : L.r. 
4.6018e-05 [2021-04-13 01:58:10] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-13 01:58:11] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-13 01:58:13] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-13 01:58:19] [valid] Ep. 8 : Up. 680000 : perplexity : 1.87281 : stalled 5 times (last best: 1.87236) [2021-04-13 04:03:16] Ep. 8 : Up. 690000 : Sen. 65,979,349 : Cost 0.34664732 * 1,370,289,074 @ 10,359 after 94,641,200,551 : Time 7506.07s : 24816.22 words/s : L.r. 4.5683e-05 [2021-04-13 04:03:16] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz [2021-04-13 04:03:18] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz [2021-04-13 04:03:20] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz [2021-04-13 04:03:26] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz [2021-04-13 04:03:27] [valid] Ep. 8 : Up. 690000 : perplexity : 1.87234 : new best [2021-04-13 06:08:57] Ep. 8 : Up. 700000 : Sen. 75,348,200 : Cost 0.34710148 * 1,376,737,182 @ 37,012 after 96,017,937,733 : Time 7541.21s : 24853.10 words/s : L.r. 
4.5356e-05
[2021-04-13 06:08:57] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-04-13 06:09:00] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-04-13 06:09:02] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-04-13 06:09:08] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz
[2021-04-13 06:09:09] [valid] Ep. 8 : Up. 700000 : perplexity : 1.87199 : new best
[2021-04-13 07:45:44] Seen 82557593 samples
[2021-04-13 07:45:44] Starting data epoch 9 in logical epoch 9
[2021-04-13 07:45:44] [sqlite] Selecting shuffled data
[2021-04-13 08:15:13] Ep. 9 : Up. 710000 : Sen. 2,098,993 : Cost 0.34647191 * 1,369,525,955 @ 8,184 after 97,387,463,688 : Time 7576.23s : 24573.76 words/s : L.r. 4.5035e-05
[2021-04-13 08:15:13] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-04-13 08:15:15] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-04-13 08:15:17] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-04-13 08:15:23] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz
[2021-04-13 08:15:24] [valid] Ep. 9 : Up. 710000 : perplexity : 1.87139 : new best
[2021-04-13 10:20:35] Ep. 9 : Up. 720000 : Sen. 11,443,913 : Cost 0.34645224 * 1,373,043,408 @ 11,100 after 98,760,507,096 : Time 7521.36s : 24842.06 words/s : L.r. 4.4721e-05
[2021-04-13 10:20:35] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-04-13 10:20:37] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-04-13 10:20:39] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-04-13 10:20:45] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz
[2021-04-13 10:20:46] [valid] Ep. 9 : Up. 720000 : perplexity : 1.87127 : new best
[2021-04-13 12:25:44] Ep. 9 : Up. 730000 : Sen. 20,771,519 : Cost 0.34645385 * 1,371,854,873 @ 16,743 after 100,132,361,969 : Time 7509.21s : 24848.09 words/s : L.r. 4.4414e-05
[2021-04-13 12:25:44] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-04-13 12:25:46] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-04-13 12:25:48] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-04-13 12:25:55] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.best-perplexity.npz
[2021-04-13 12:25:56] [valid] Ep. 9 : Up. 730000 : perplexity : 1.87003 : new best
[2021-04-13 14:31:30] Ep. 9 : Up. 740000 : Sen. 30,141,045 : Cost 0.34671330 * 1,376,777,976 @ 15,303 after 101,509,139,945 : Time 7545.25s : 24839.62 words/s : L.r. 4.4113e-05
[2021-04-13 14:31:30] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-04-13 14:31:32] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-04-13 14:31:34] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-04-13 14:31:40] [valid] Ep. 9 : Up. 740000 : perplexity : 1.87066 : stalled 1 times (last best: 1.87003)
[2021-04-13 16:36:22] Ep. 9 : Up. 750000 : Sen. 39,433,793 : Cost 0.34630421 * 1,368,773,195 @ 24,166 after 102,877,913,140 : Time 7491.89s : 24834.20 words/s : L.r. 4.3818e-05
[2021-04-13 16:36:22] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-04-13 16:36:24] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-04-13 16:36:26] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-04-13 16:36:33] [valid] Ep. 9 : Up. 750000 : perplexity : 1.87086 : stalled 2 times (last best: 1.87003)
[2021-04-13 18:41:31] Ep. 9 : Up. 760000 : Sen. 48,764,304 : Cost 0.34690219 * 1,370,454,043 @ 10,547 after 104,248,367,183 : Time 7508.99s : 24846.00 words/s : L.r. 4.3529e-05
[2021-04-13 18:41:31] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.orig.npz
[2021-04-13 18:41:33] Saving model weights and runtime parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz
[2021-04-13 18:41:35] Saving Adam parameters to /users/tiedeman/research/Opus-MT-train/work-tatoeba/eng-nld/opus+bt.spm32k-spm32k.transformer-align.model1.npz.optimizer.npz
[2021-04-13 18:41:41] [valid] Ep. 9 : Up. 760000 : perplexity : 1.87066 : stalled 3 times (last best: 1.87003)