eubinecto committed on
Commit 1bf3d62
1 Parent(s): cfa482d

first mvp done, removing wandb

config.yaml CHANGED
@@ -0,0 +1,46 @@
+ alpha:
+   eng2eng:
+     bert: bert-base-uncased
+     desc:
+     seed: 410
+     idioms_ver: c
+     idiom2def_ver: c
+     k: 11
+     lr: 0.00001
+     max_epochs: 200
+     batch_size: 64
+     shuffle: true
+   kor2eng:
+     bert: bert-base-multilingual-uncased
+     desc:
+     seed: 410
+     idioms_ver: c
+     idiom2def_ver: d
+     k: 11
+     lr: 0.00001
+     max_epochs: 200
+     batch_size: 64
+     num_workers: 4
+     shuffle: true
+ gamma:
+   eng2eng:
+     bert: bert-base-uncased
+     seed: 410
+     idioms_ver: c
+     idiom2def_ver: c
+     k: 11
+     lr: 0.00001
+     max_epochs: 200
+     batch_size: 64
+     shuffle: true
+   kor2eng:
+     bert: bert-base-multilingual-uncased
+     seed: 410
+     idioms_ver: c
+     idiom2def_ver: d
+     k: 11
+     lr: 0.00001
+     max_epochs: 200
+     batch_size: 64
+     num_workers: 4
+     shuffle: true
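Note: fetch_config() in idiomify/fetchers.py (below) loads this file verbatim, and main_train.py then indexes it by model and version. A minimal sketch of that lookup, assuming it is run from the repository root:

import yaml

# mirrors fetch_config() in idiomify/fetchers.py
with open("config.yaml", "r", encoding="utf-8") as fh:
    config = yaml.safe_load(fh)

# same lookup as fetch_config()[args.model][args.ver] in main_train.py
config = config["alpha"]["eng2eng"]
print(config["bert"], config["k"], config["lr"])  # bert-base-uncased 11 1e-05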
explore/explore_bert_base_multilingual_tokenizer.py ADDED
@@ -0,0 +1,44 @@
+ from idiomify.fetchers import fetch_idiom2def
+ from transformers import AutoTokenizer, BertTokenizer, BertTokenizerFast
+
+
+ def main():
+     tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-uncased")
+     idiom2def = fetch_idiom2def("d")  # eng2kor
+
+     for idiom, definition in idiom2def:
+         print(tokenizer.decode(tokenizer(idiom)['input_ids']),
+               tokenizer.decode(tokenizer(definition)['input_ids']))
+
+     # right, the tokenizer knows Korean, which is great.
+     """
+     /opt/homebrew/Caskroom/miniforge/base/envs/idiomify-demo/bin/python /Users/eubinecto/Desktop/Projects/Toy/idiomify-demo/explore/explore_mbert_tokenizer.py
+     [CLS] beat around the bush [SEP] [CLS] 불쾌하거나 민감한 주제에 대해 직접적으로 이야기하는 것을 피하기 위해 모호하거나 완곡하게 말한다. [SEP]
+     [CLS] beat around the bush [SEP] [CLS] 단어나 태도가 우회적이다 [SEP]
+     [CLS] beat around the bush [SEP] [CLS] 우물쭈물하다 [SEP]
+     [CLS] beat around the bush [SEP] [CLS] 우회적으로 접근하다 [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] 칭찬으로 가장한 모욕적이거나 부정적인 논평 [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] 의도하지 않거나 애매한 칭찬 [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] 누군가를 칭찬하는 것 같지만 비판으로도 이해될 수 있는 말 [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] 남을 기쁘게 하는 말 같지만 모욕이 될 수도 있는 말 [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] 감탄하는 듯 하면서도 모욕으로 이해될 수 있는 말 [SEP]
+     [CLS] steer clear of [SEP] [CLS] 누군가나 뭔가를 피하다 [SEP]
+     [CLS] steer clear of [SEP] [CLS] 떨어져 지내다 [SEP]
+     [CLS] steer clear of [SEP] [CLS] 피하거나 멀리하도록 주의하다 [SEP]
+     [CLS] steer clear of [SEP] [CLS] 불쾌하거나 위험하거나 문제를 일으킬 것 같은 사람이나 물건을 피하다 [SEP]
+     [CLS] steer clear of [SEP] [CLS] 일부러 피하다 [SEP]
+     [CLS] dish it out [SEP] [CLS] 가혹한 생각, 비판, 또는 모욕의 목소리를 내는 것. [SEP]
+     [CLS] dish it out [SEP] [CLS] 누군가 또는 무언가에 대해 험담하는 것 [SEP]
+     [CLS] dish it out [SEP] [CLS] 어떤 것을 주거나 정보나 당신의 의견과 같은 것을 말하는 것 [SEP]
+     [CLS] dish it out [SEP] [CLS] 다른 사람을 쉽게 비판하지만 다른 사람이 자신을 비판할때는 좋아하지 않음 [SEP]
+     [CLS] dish it out [SEP] [CLS] 다른 사람을 비판하다 [SEP]
+     [CLS] make headway [SEP] [CLS] 성취하고자 하는 어떤 것에 진척이 생기다 [SEP]
+     [CLS] make headway [SEP] [CLS] 특히 이것이 느리거나 어려울 때, 진전을 이루다. [SEP]
+     [CLS] make headway [SEP] [CLS] 전진하다 [SEP]
+     [CLS] make headway [SEP] [CLS] 앞으로 나아가거나 진전을 이루다 [SEP]
+     [CLS] make headway [SEP] [CLS] 성공하기 시작하다 [SEP]
+     """
+
+
+ if __name__ == '__main__':
+     main()
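The round-trip decode above shows the Korean survives tokenization; a quick [UNK] count would make the coverage claim concrete. A minimal sketch (the unk_rate helper is illustrative, not part of this commit):

from transformers import BertTokenizer

def unk_rate(text: str, tokenizer: BertTokenizer) -> float:
    # the fraction of subword ids that fell back to [UNK]
    ids = tokenizer(text, add_special_tokens=False)['input_ids']
    return ids.count(tokenizer.unk_token_id) / max(len(ids), 1)

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-uncased")
print(unk_rate("우물쭈물하다", tokenizer))  # expect 0.0 if mBERT truly covers the Korean definitions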
explore/explore_bert_base_tokenizer.py ADDED
@@ -0,0 +1,45 @@
+ from idiomify.fetchers import fetch_idiom2def
+ from transformers import AutoTokenizer, BertTokenizer, BertTokenizerFast
+
+
+ def main():
+
+     tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
+     idiom2def = fetch_idiom2def("c")  # eng2eng
+     for idiom, definition in idiom2def:
+         print(tokenizer.decode(tokenizer(idiom)['input_ids']),
+               tokenizer.decode(tokenizer(definition)['input_ids']))
+
+     """
+     /opt/homebrew/Caskroom/miniforge/base/envs/idiomify-demo/bin/python /Users/eubinecto/Desktop/Projects/Toy/idiomify-demo/explore/explore_bert_base_tokenizer.py
+     Downloading: 100%|██████████| 226k/226k [00:00<00:00, 298kB/s]
+     Downloading: 100%|██████████| 28.0/28.0 [00:00<00:00, 8.27kB/s]
+     Downloading: 100%|██████████| 455k/455k [00:01<00:00, 449kB/s]
+     [CLS] beat around the bush [SEP] [CLS] to speak vaguely or euphemistically so as to avoid talkingdirectly about an unpleasant or sensitive topic [SEP]
+     [CLS] beat around the bush [SEP] [CLS] indirection in word or deed [SEP]
+     [CLS] beat around the bush [SEP] [CLS] to shilly - shally [SEP]
+     [CLS] beat around the bush [SEP] [CLS] to approach something in a roundabout way [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] an insulting or negative comment disguised as praise. [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] an unintended or ambiguous compliment. [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] a remark which seems to be praising someone or something but which could also be understood as criticism [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] a remark that seems to say something pleasant about a person but could also be an insult [SEP]
+     [CLS] backhanded compliment [SEP] [CLS] a remark that seems to express admiration but could also be understood as an insult [SEP]
+     [CLS] steer clear of [SEP] [CLS] to avoid someone or something. [SEP]
+     [CLS] steer clear of [SEP] [CLS] stay away from [SEP]
+     [CLS] steer clear of [SEP] [CLS] take care to avoid or keep away from [SEP]
+     [CLS] steer clear of [SEP] [CLS] to avoid someone or something that seems unpleasant, dangerous, or likely to cause problems [SEP]
+     [CLS] steer clear of [SEP] [CLS] deliberately avoid someone [SEP]
+     [CLS] dish it out [SEP] [CLS] to voice harsh thoughts, criticisms, or insults. [SEP]
+     [CLS] dish it out [SEP] [CLS] to gossip about someone or something [SEP]
+     [CLS] dish it out [SEP] [CLS] to give something, or to tell something such as information or your opinions [SEP]
+     [CLS] dish it out [SEP] [CLS] someone easily criticizes other people but does not like it when other people criticize him or her [SEP]
+     [CLS] dish it out [SEP] [CLS] to criticize other people [SEP]
+     [CLS] make headway [SEP] [CLS] make progress with something that you are trying to achieve. [SEP]
+     [CLS] make headway [SEP] [CLS] make progress, especially when this is slow or difficult [SEP]
+     [CLS] make headway [SEP] [CLS] to advance. [SEP]
+     [CLS] make headway [SEP] [CLS] to move forward or make progress [SEP]
+     [CLS] make headway [SEP] [CLS] to begin to succeed [SEP]
+     """
+
+ if __name__ == '__main__':
+     main()
explore/explore_fetch_idiom2def.py ADDED
@@ -0,0 +1,15 @@
+ from idiomify.fetchers import fetch_idiom2def
+
+
+ def main():
+     idiom2def = fetch_idiom2def("c")
+     for idiom, definition in idiom2def:
+         print(idiom, definition)
+
+     idiom2def = fetch_idiom2def("d")
+     for idiom, definition in idiom2def:
+         print(idiom, definition)
+
+
+ if __name__ == '__main__':
+     main()
explore/explore_fetch_idioms.py ADDED
@@ -0,0 +1,9 @@
+ from idiomify.fetchers import fetch_idioms
+
+
+ def main():
+     print(fetch_idioms("c"))
+
+
+ if __name__ == '__main__':
+     main()
explore/explore_fetch_wisdom2def.py DELETED
@@ -1,15 +0,0 @@
- from idiomify.fetchers import fetch_wisdom2def
-
-
- def main():
-     df = fetch_wisdom2def("c")
-     for idx, row in df.iterrows():
-         print(row[0], row[1])
-
-     df = fetch_wisdom2def("d")
-     for idx, row in df.iterrows():
-         print(row[0], row[1])
-
-
- if __name__ == '__main__':
-     main()
idiomify/datamodules.py ADDED
@@ -0,0 +1,75 @@
+ import torch
+ from typing import Tuple, Optional, List
+ from torch.utils.data import Dataset, DataLoader
+ from pytorch_lightning import LightningDataModule
+ from transformers import BertTokenizer
+ from idiomify.fetchers import fetch_idiom2def
+ from idiomify import tensors as T
+
+
+ class IdiomifyDataset(Dataset):
+     def __init__(self,
+                  X: torch.Tensor,
+                  y: torch.Tensor):
+         self.X = X
+         self.y = y
+
+     def __len__(self) -> int:
+         """
+         Returns the size of the dataset.
+         """
+         return self.y.shape[0]
+
+     def __getitem__(self, idx: int) -> Tuple[torch.Tensor, torch.LongTensor]:
+         """
+         Returns the features & the label for the given index.
+         """
+         return self.X[idx], self.y[idx]
+
+
+ class IdiomifyDataModule(LightningDataModule):
+
+     # boilerplate - just ignore these
+     def test_dataloader(self):
+         pass
+
+     def val_dataloader(self):
+         pass
+
+     def predict_dataloader(self):
+         pass
+
+     def __init__(self,
+                  config: dict,
+                  tokenizer: BertTokenizer,
+                  idioms: List[str]):
+         super().__init__()
+         self.config = config
+         self.tokenizer = tokenizer
+         self.idioms = idioms
+         # --- to be downloaded & built --- #
+         self.idiom2def: Optional[List[Tuple[str, str]]] = None
+         self.dataset: Optional[IdiomifyDataset] = None
+
+     def prepare_data(self):
+         """
+         prepare: download all the data needed for this datamodule from wandb to local.
+         """
+         self.idiom2def = fetch_idiom2def(self.config['idiom2def_ver'])
+
+     def setup(self, stage: Optional[str] = None):
+         """
+         set up the builders: build the dataset from the fetched idiom2def pairs.
+         """
+         X = T.inputs([definition for _, definition in self.idiom2def], self.tokenizer, self.config['k'])
+         y = T.targets(self.idioms)
+         self.dataset = IdiomifyDataset(X, y)
+
+     def train_dataloader(self) -> DataLoader:
+         return DataLoader(self.dataset, batch_size=self.config['batch_size'],
+                           shuffle=self.config['shuffle'], num_workers=self.config['num_workers'])
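For reference, a minimal sketch of how this datamodule is driven outside of a Trainer; the config keys mirror config.yaml, and calling the hooks by hand is purely for illustration:

from transformers import BertTokenizer
from idiomify.datamodules import IdiomifyDataModule
from idiomify.fetchers import fetch_idioms

config = {"idiom2def_ver": "c", "k": 11, "batch_size": 64, "shuffle": True, "num_workers": 4}
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
idioms = fetch_idioms("c")
datamodule = IdiomifyDataModule(config, tokenizer, idioms)
datamodule.prepare_data()  # downloads idiom2def from wandb (normally called by the Trainer)
datamodule.setup()         # builds X = T.inputs(...) and y = T.targets(...)
for X, y in datamodule.train_dataloader():
    print(X.shape, y.shape)  # X: (batch, 5, L), y: (batch,)
    break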
idiomify/fetchers.py CHANGED
@@ -1,18 +1,34 @@
+ from typing import Tuple, List
+ import yaml
  import wandb
  import pandas as pd
- from transformers import BertTokenizer
  from idiomify.models import Alpha, Gamma
- from idiomify.paths import wisdom2def_dir
+ from idiomify.paths import idiom2def_dir, CONFIG_YAML, idioms_dir


  # dataset
- def fetch_wisdom2def(ver: str) -> pd.DataFrame:
-     artifact = wandb.Api().artifact(f"eubinecto/idiomify-demo/wisdom2def:{ver}", type="dataset")
-     artifact_path = wisdom2def_dir(ver)
+ def fetch_idiom2def(ver: str) -> List[Tuple[str, str]]:
+     artifact = wandb.Api().artifact(f"eubinecto/idiomify-demo/idiom2def:{ver}", type="dataset")
+     artifact_path = idiom2def_dir(ver)
      artifact.download(root=str(artifact_path))
      tsv_path = artifact_path / "all.tsv"
      df = pd.read_csv(str(tsv_path), delimiter="\t")
-     return df
+     return [
+         (row[0], row[1])
+         for _, row in df.iterrows()
+     ]
+
+
+ def fetch_idioms(ver: str) -> List[str]:
+     artifact = wandb.Api().artifact(f"eubinecto/idiomify-demo/idioms:{ver}", type="dataset")
+     artifact_path = idioms_dir(ver)
+     artifact.download(root=str(artifact_path))
+     tsv_path = artifact_path / "all.tsv"
+     df = pd.read_csv(str(tsv_path), delimiter="\t")
+     return [
+         row[0]
+         for _, row in df.iterrows()
+     ]


  # models
@@ -25,6 +41,5 @@ def fetch_gamma(ver: str) -> Gamma:


  def fetch_config() -> dict:
-     pass
-
-
+     with open(str(CONFIG_YAML), 'r', encoding="utf-8") as fh:
+         return yaml.safe_load(fh)
idiomify/models.py CHANGED
@@ -1,9 +1,274 @@
-
-
- class Alpha:
-     pass
-
-
- class Gamma:
-     pass
-
+ """
+ The reverse dictionary models below are based off of: https://github.com/yhcc/BertForRD/blob/master/mono/model/bert.py
+ """
+ from typing import Tuple, List, Optional
+ import torch
+ import pytorch_lightning as pl
+ from transformers.models.bert.modeling_bert import BertForMaskedLM
+ from torch.nn import functional as F
+
+
+ class RD(pl.LightningModule):
+     """
+     @eubinecto
+     The superclass of all the reverse-dictionaries. This class houses any methods that are required by
+     whatever reverse-dictionaries we define.
+     """
+
+     # --- boilerplate; the loaders are defined in datamodules, so we don't define them here.
+     # we pass on them only to avoid warnings --- #
+     def train_dataloader(self):
+         pass
+
+     def test_dataloader(self):
+         pass
+
+     def val_dataloader(self):
+         pass
+
+     def predict_dataloader(self):
+         pass
+
+     def __init__(self, mlm: BertForMaskedLM, wisdom2subwords: torch.Tensor, k: int, lr: float):  # noqa
+         """
+         :param mlm: a bert model for masked language modeling
+         :param wisdom2subwords: (|W|, K)
+         """
+         super().__init__()
+         # -- hyper params --- #
+         # should be saved to self.hparams
+         # https://github.com/PyTorchLightning/pytorch-lightning/issues/4390#issue-730493746
+         self.save_hyperparameters(ignore=["mlm", "wisdom2subwords"])
+         # -- the only neural network we need -- #
+         self.mlm = mlm
+         # --- to be used for getting H_k --- #
+         self.wisdom_mask: Optional[torch.Tensor] = None  # (N, L)
+         # --- to be used for getting H_desc --- #
+         self.desc_mask: Optional[torch.Tensor] = None  # (N, L)
+         # -- constant tensors -- #
+         self.register_buffer("wisdom2subwords", wisdom2subwords)  # (|W|, K)
+
+     def forward(self, X: torch.Tensor) -> torch.Tensor:
+         """
+         :param X: (N, 5, L);
+         (num samples, 0=input_ids/1=token_type_ids/2=attention_mask/3=wisdom_mask/4=desc_mask, the maximum length)
+         :return: H_all (N, L, H); (num samples, the maximum length, the hidden size)
+         """
+         input_ids = X[:, 0]  # (N, 5, L) -> (N, L)
+         token_type_ids = X[:, 1]  # (N, 5, L) -> (N, L)
+         attention_mask = X[:, 2]  # (N, 5, L) -> (N, L)
+         self.wisdom_mask = X[:, 3]  # (N, 5, L) -> (N, L)
+         self.desc_mask = X[:, 4]  # (N, 5, L) -> (N, L)
+         H_all = self.mlm.bert.forward(input_ids, attention_mask, token_type_ids)[0]  # -> (N, L, H)
+         return H_all
+
+     def H_k(self, H_all: torch.Tensor) -> torch.Tensor:
+         """
+         You may want to override this. (e.g. RDGamma - the k's could be anywhere)
+         :param H_all: (N, L, H)
+         :return: H_k (N, K, H)
+         """
+         N, _, H = H_all.size()
+         # refer to: wisdomify/examples/explore_masked_select.py
+         wisdom_mask = self.wisdom_mask.unsqueeze(2).expand(H_all.shape)  # (N, L) -> (N, L, 1) -> (N, L, H)
+         H_k = torch.masked_select(H_all, wisdom_mask.bool())  # (N, L, H), (N, L, H) -> (N * K * H)
+         H_k = H_k.reshape(N, self.hparams['k'], H)  # (N * K * H) -> (N, K, H)
+         return H_k
+
+     def H_desc(self, H_all: torch.Tensor) -> torch.Tensor:
+         """
+         :param H_all: (N, L, H)
+         :return: H_desc (N, L - (K + 3), H)
+         """
+         N, L, H = H_all.size()
+         desc_mask = self.desc_mask.unsqueeze(2).expand(H_all.shape)  # (N, L) -> (N, L, H)
+         H_desc = torch.masked_select(H_all, desc_mask.bool())  # (N, L, H), (N, L, H) -> (N * (L - (K + 3)) * H)
+         H_desc = H_desc.reshape(N, L - (self.hparams['k'] + 3), H)  # (N * (L - (K + 3)) * H) -> (N, L - (K + 3), H)
+         return H_desc
+
+     def S_wisdom_literal(self, H_k: torch.Tensor) -> torch.Tensor:
+         """
+         To be used for both RDAlpha & RDBeta.
+         :param H_k: (N, K, H)
+         :return: S_wisdom_literal (N, |W|)
+         """
+         S_vocab = self.mlm.cls(H_k)  # bmm; (N, K, H) * (H, |V|) -> (N, K, |V|)
+         indices = self.wisdom2subwords.T.repeat(S_vocab.shape[0], 1, 1)  # (|W|, K) -> (N, K, |W|)
+         S_wisdom_literal = S_vocab.gather(dim=-1, index=indices)  # (N, K, |V|) -> (N, K, |W|)
+         S_wisdom_literal = S_wisdom_literal.sum(dim=1)  # (N, K, |W|) -> (N, |W|)
+         return S_wisdom_literal
+
+     def S_wisdom(self, H_all: torch.Tensor) -> torch.Tensor:
+         """
+         :param H_all: (N, L, H)
+         :return: S_wisdom (N, |W|)
+         """
+         raise NotImplementedError("An RD class must implement S_wisdom")
+
+     def P_wisdom(self, X: torch.Tensor) -> torch.Tensor:
+         """
+         :param X: (N, 5, L)
+         :return: P_wisdom (N, |W|), normalized over dim 1.
+         """
+         H_all = self.forward(X)  # (N, 5, L) -> (N, L, H)
+         S_wisdom = self.S_wisdom(H_all)  # (N, L, H) -> (N, |W|)
+         P_wisdom = F.softmax(S_wisdom, dim=1)  # (N, |W|) -> (N, |W|)
+         return P_wisdom
+
+     def training_step(self, batch: Tuple[torch.Tensor, torch.Tensor], batch_idx: int) -> dict:
+         X, y = batch
+         H_all = self.forward(X)  # (N, 5, L) -> (N, L, H)
+         S_wisdom = self.S_wisdom(H_all)  # (N, L, H) -> (N, |W|)
+         loss = F.cross_entropy(S_wisdom, y)  # (N, |W|), (N,) -> (N,)
+         loss = loss.sum()  # (N,) -> (1,)
+         # so that the metrics accumulate over the course of this epoch
+         # why a dict? - just a boilerplate
+         return {
+             # you cannot change the keyword for the loss
+             "loss": loss,
+         }
+
+     def on_train_batch_end(self, outputs: dict, *args, **kwargs) -> None:
+         # watch the loss for this batch
+         self.log("Train/Loss", outputs['loss'])
+
+     def training_epoch_end(self, outputs: List[dict]) -> None:
+         # to see an average performance over the batches in this specific epoch
+         avg_loss = torch.stack([output['loss'] for output in outputs]).mean()
+         self.log("Train/Average Loss", avg_loss)
+
+     def validation_step(self, batch: Tuple[torch.Tensor, torch.Tensor], batch_idx: int) -> dict:
+         return self.training_step(batch, batch_idx)
+
+     def on_validation_batch_end(self, outputs: dict, *args, **kwargs) -> None:
+         self.log("Validation/Loss", outputs['loss'])
+
+     def validation_epoch_end(self, outputs: List[dict]) -> None:
+         # to see an average performance over the batches in this specific epoch
+         avg_loss = torch.stack([output['loss'] for output in outputs]).mean()
+         self.log("Validation/Average Loss", avg_loss)
+
+     def configure_optimizers(self) -> torch.optim.Optimizer:
+         """
+         Instantiates and returns the optimizer to be used for this model,
+         e.g. torch.optim.Adam
+         """
+         # The authors used Adam, so we might as well use it too.
+         return torch.optim.AdamW(self.parameters(), lr=self.hparams['lr'])
+
+     @classmethod
+     def name(cls) -> str:
+         return cls.__name__.lower()
+
+
+ class Alpha(RD):
+     """
+     @eubinecto
+     The first prototype.
+     S_wisdom = S_wisdom_literal
+     trained on: wisdom2def only.
+     """
+
+     def S_wisdom(self, H_all: torch.Tensor) -> torch.Tensor:
+         H_k = self.H_k(H_all)  # (N, L, H) -> (N, K, H)
+         S_wisdom = self.S_wisdom_literal(H_k)  # (N, K, H) -> (N, |W|)
+         return S_wisdom
+
+
+ class BiLSTMPooler(torch.nn.Module):
+     def __init__(self, hidden_size: int):
+         super().__init__()
+         self.lstm = torch.nn.LSTM(input_size=hidden_size, hidden_size=hidden_size // 2, batch_first=True,
+                                   num_layers=1, bidirectional=True)
+
+     def forward(self, X: torch.Tensor) -> torch.Tensor:
+         hiddens, _ = self.lstm(X)
+         return hiddens[:, -1]
+
+
+ class Gamma(RD):
+     """
+     @eubinecto
+     S_wisdom = S_wisdom_literal + S_wisdom_figurative,
+     but the way we get S_wisdom_figurative is much simplified, compared with RDBeta.
+     """
+
+     def __init__(self, mlm: BertForMaskedLM, wisdom2subwords: torch.Tensor, k: int, lr: float):
+         super().__init__(mlm, wisdom2subwords, k, lr)
+         # the pooler (a single-layer BiLSTM) pools wisdom_embeddings out of wisdom2subwords_embeddings
+         self.pooler = BiLSTMPooler(self.mlm.config.hidden_size)
+         # --- to be used to compute attentions --- #
+         self.attention_mask: Optional[torch.Tensor] = None
+
+     def forward(self, X: torch.Tensor) -> torch.Tensor:
+         """
+         :param X: (N, 5, L);
+         (num samples, 0=input_ids/1=token_type_ids/2=attention_mask/3=wisdom_mask/4=desc_mask, the maximum length)
+         :return: H_all (N, L, H)
+         """
+         input_ids = X[:, 0]  # (N, 5, L) -> (N, L)
+         token_type_ids = X[:, 1]  # (N, 5, L) -> (N, L)
+         self.attention_mask = X[:, 2]  # (N, 5, L) -> (N, L)
+         self.wisdom_mask = X[:, 3]  # (N, 5, L) -> (N, L)
+         self.desc_mask = X[:, 4]  # (N, 5, L) -> (N, L)
+         H_all = self.mlm.bert.forward(input_ids, self.attention_mask, token_type_ids)[0]  # -> (N, L, H)
+         return H_all
+
+     def H_desc_attention_mask(self, attention_mask: torch.Tensor) -> torch.Tensor:
+         """
+         this is needed to mask the padding tokens.
+         :param attention_mask: (N, L)
+         """
+         N, L = attention_mask.size()
+         H_desc_attention_mask = torch.masked_select(attention_mask, self.desc_mask.bool())
+         H_desc_attention_mask = H_desc_attention_mask.reshape(N, L - (self.hparams['k'] + 3))
+         return H_desc_attention_mask
+
+     def S_wisdom(self, H_all: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
+         S_wisdom_literal = self.S_wisdom_literal(self.H_k(H_all))
+         S_wisdom_figurative = self.S_wisdom_figurative(H_all)
+         S_wisdom = S_wisdom_literal + S_wisdom_figurative
+         return S_wisdom, S_wisdom_literal, S_wisdom_figurative
+
+     def S_wisdom_figurative(self, H_all: torch.Tensor) -> torch.Tensor:
+         # --- draw the embeddings for wisdoms from the embeddings of wisdom2subwords -- #
+         # this is to use as few newly initialised weights as possible
+         wisdom2subwords_embeddings = self.mlm.bert \
+             .embeddings.word_embeddings(self.wisdom2subwords)  # (|W|, K) -> (|W|, K, H)
+         wisdom_embeddings = self.pooler(wisdom2subwords_embeddings).squeeze()  # (|W|, K, H) -> (|W|, H)
+         # --- draw H_wisdom from H_desc with attention --- #
+         H_cls = H_all[:, 0]  # (N, L, H) -> (N, H)
+         H_desc = self.H_desc(H_all)  # (N, L, H) -> (N, D, H)
+         H_desc_attention_mask = self.H_desc_attention_mask(self.attention_mask)  # (N, L) -> (N, D)
+         scores = torch.einsum("...h,...dh->...d", H_cls, H_desc)  # -> (N, D)
+         # ignore the padding tokens
+         scores = torch.masked_fill(scores, H_desc_attention_mask != 1, float("-inf"))  # (N, D)
+         attentions = torch.softmax(scores, dim=1)  # softmax over D
+         H_wisdom = torch.einsum("...d,...dh->...h", attentions, H_desc)  # -> (N, H)
+         # --- now compare H_wisdom with all the wisdoms --- #
+         S_wisdom_figurative = torch.einsum("...h,wh->...w", H_wisdom, wisdom_embeddings)  # (N, H) * (|W|, H) -> (N, |W|)
+         return S_wisdom_figurative
+
+     def training_step(self, batch: Tuple[torch.Tensor, torch.Tensor], batch_idx: int) -> dict:
+         X, y = batch
+         H_all = self.forward(X)  # (N, 5, L) -> (N, L, H)
+         S_wisdom, S_wisdom_literal, S_wisdom_figurative = self.S_wisdom(H_all)  # (N, L, H) -> (N, |W|)
+         loss_all = F.cross_entropy(S_wisdom, y).sum()  # (N, |W|), (N,) -> (N,) -> (1,)
+         loss_literal = F.cross_entropy(S_wisdom_literal, y).sum()  # (N, |W|), (N,) -> (N,) -> (1,)
+         loss_figurative = F.cross_entropy(S_wisdom_figurative, y).sum()  # (N, |W|), (N,) -> (N,) -> (1,)
+         loss = loss_all + loss_literal + loss_figurative  # unweighted multi-task learning
+         return {
+             # you cannot change the keyword for the loss
+             "loss": loss,
+         }
+
+     def P_wisdom(self, X: torch.Tensor) -> torch.Tensor:
+         """
+         :param X: (N, 5, L)
+         :return: P_wisdom (N, |W|), normalized over dim 1.
+         """
+         H_all = self.forward(X)  # (N, 5, L) -> (N, L, H)
+         S_wisdom, _, _ = self.S_wisdom(H_all)  # (N, L, H) -> (N, |W|)
+         P_wisdom = F.softmax(S_wisdom, dim=1)  # (N, |W|) -> (N, |W|)
+         return P_wisdom
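The H_k/H_desc extraction above hinges on one masked_select trick: expand an (N, L) 0/1 mask to (N, L, H), select, then reshape, which only works because every row marks exactly the same number of positions (the k [MASK] slots). A toy shape-check with made-up numbers:

import torch

N, L, H, K = 2, 7, 4, 3
H_all = torch.randn(N, L, H)
# every row must contain exactly K ones, otherwise the reshape below fails
wisdom_mask = torch.tensor([[0, 1, 1, 1, 0, 0, 0],
                            [0, 1, 1, 1, 0, 0, 0]])
mask = wisdom_mask.unsqueeze(2).expand(H_all.shape)  # (N, L) -> (N, L, 1) -> (N, L, H)
H_k = torch.masked_select(H_all, mask.bool())        # flattens to (N * K * H,)
H_k = H_k.reshape(N, K, H)
print(H_k.shape)  # torch.Size([2, 3, 4])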
idiomify/paths.py CHANGED
@@ -2,10 +2,15 @@ from pathlib import Path

  ROOT_DIR = Path(__file__).resolve().parent.parent
  ARTIFACTS_DIR = ROOT_DIR / "artifacts"
+ CONFIG_YAML = ROOT_DIR / "config.yaml"


- def wisdom2def_dir(ver: str) -> Path:
-     return ARTIFACTS_DIR / f"wisdom2def_{ver}"
+ def idiom2def_dir(ver: str) -> Path:
+     return ARTIFACTS_DIR / f"idiom2def_{ver}"
+
+
+ def idioms_dir(ver: str) -> Path:
+     return ARTIFACTS_DIR / f"idioms_{ver}"


  def alpha_dir(ver: str) -> Path:
idiomify/tensors.py ADDED
@@ -0,0 +1,55 @@
+ """
+ all the functions for building tensors are defined here.
+ builders must accept device as one of the parameters.
+ """
+ import torch
+ from typing import List
+ from transformers import BertTokenizer
+
+
+ def wisdom2subwords(idioms: List[str], tokenizer: BertTokenizer, k: int) -> torch.Tensor:
+     mask_id = tokenizer.mask_token_id
+     pad_id = tokenizer.pad_token_id
+     # temporarily disable single-token status of the wisdoms
+     wisdoms = [idiom.split(" ") for idiom in idioms]
+     encodings = tokenizer(text=wisdoms,
+                           add_special_tokens=False,
+                           # should set this to True, as we already have the wisdoms split.
+                           is_split_into_words=True,
+                           padding='max_length',
+                           max_length=k,  # set to k
+                           return_tensors="pt")
+     input_ids = encodings['input_ids']
+     input_ids[input_ids == pad_id] = mask_id  # replace the pads with masks
+     return input_ids
+
+
+ def inputs(definitions: List[str], tokenizer: BertTokenizer, k: int) -> torch.Tensor:
+     lefts = [" ".join(["[MASK]"] * k)] * len(definitions)
+     encodings = tokenizer(text=lefts,
+                           text_pair=definitions,
+                           return_tensors="pt",
+                           add_special_tokens=True,
+                           truncation=True,
+                           padding=True,
+                           verbose=True)
+     input_ids: torch.Tensor = encodings['input_ids']
+     cls_id: int = tokenizer.cls_token_id
+     sep_id: int = tokenizer.sep_token_id
+     mask_id: int = tokenizer.mask_token_id
+
+     wisdom_mask = torch.where(input_ids == mask_id, 1, 0)
+     desc_mask = torch.where(((input_ids != cls_id) & (input_ids != sep_id) & (input_ids != mask_id)), 1, 0)
+     return torch.stack([input_ids,
+                         encodings['token_type_ids'],
+                         encodings['attention_mask'],
+                         wisdom_mask,
+                         desc_mask], dim=1)
+
+
+ def targets(idioms: List[str]) -> torch.Tensor:
+     return torch.LongTensor([
+         idioms.index(idiom)
+         for idiom in idioms
+     ])
+
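To see what inputs stacks: each row encodes [CLS] [MASK] x k [SEP] definition [SEP] plus padding, and the five channels line up as input_ids / token_type_ids / attention_mask / wisdom_mask / desc_mask, which is also where the K + 3 special-token offset in models.py comes from. A quick sketch:

from transformers import BertTokenizer
from idiomify import tensors as T

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
X = T.inputs(["to avoid someone or something"], tokenizer, 11)
print(X.shape)                    # (1, 5, L): five channels per sample
print(X[0, 3].sum())              # the wisdom_mask marks exactly k=11 positions
print(tokenizer.decode(X[0, 0]))  # [CLS] [MASK] x 11 [SEP] to avoid someone or something [SEP]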
main_train.py CHANGED
@@ -0,0 +1,67 @@
+ import os
+ import torch.cuda
+ import wandb
+ import argparse
+ import pytorch_lightning as pl
+ from pytorch_lightning.loggers import WandbLogger
+ from termcolor import colored
+ from transformers import BertForMaskedLM, BertTokenizer
+ from idiomify.datamodules import IdiomifyDataModule
+ from idiomify.fetchers import fetch_config, fetch_idioms
+ from idiomify.models import Alpha, Gamma
+ from idiomify.paths import ROOT_DIR
+ from idiomify import tensors as T
+
+
+ def main():
+     parser = argparse.ArgumentParser()
+     parser.add_argument("entity", type=str)
+     parser.add_argument("--model", type=str, default="alpha")
+     parser.add_argument("--ver", type=str, default="eng2eng")
+     parser.add_argument("--num_workers", type=int, default=os.cpu_count())
+     parser.add_argument("--log_every_n_steps", type=int, default=1)
+     parser.add_argument("--fast_dev_run", action="store_true", default=False)
+     parser.add_argument("--upload", dest='upload', action='store_true', default=False)
+     args = parser.parse_args()
+     config = fetch_config()[args.model][args.ver]
+     config.update(vars(args))
+     if not config['upload']:
+         print(colored("WARNING: YOU CHOSE NOT TO UPLOAD. NOTHING BUT LOGS WILL BE SAVED TO WANDB", color="red"))
+
+     # prepare the arguments
+     mlm = BertForMaskedLM.from_pretrained(config['bert'])
+     tokenizer = BertTokenizer.from_pretrained(config['bert'])
+     idioms = fetch_idioms(config['idioms_ver'])
+     wisdom2subwords = T.wisdom2subwords(idioms, tokenizer, config['k'])
+     # choose the model to train
+     if config['model'] == Alpha.name():
+         rd = Alpha(mlm, wisdom2subwords, config['k'], config['lr'])
+     elif config['model'] == Gamma.name():
+         rd = Gamma(mlm, wisdom2subwords, config['k'], config['lr'])
+     else:
+         raise ValueError
+     # prepare the datamodule
+     datamodule = IdiomifyDataModule(config, tokenizer, idioms)
+
+     with wandb.init(entity=config['entity'], project="idiomify_demo", config=config) as run:
+         logger = WandbLogger(log_model=False)
+         trainer = pl.Trainer(max_epochs=config['max_epochs'],
+                              fast_dev_run=config['fast_dev_run'],
+                              log_every_n_steps=config['log_every_n_steps'],
+                              gpus=torch.cuda.device_count(),
+                              default_root_dir=str(ROOT_DIR),
+                              logger=logger)
+         # start training
+         trainer.fit(model=rd, datamodule=datamodule)
+         # upload the model to wandb only if the training is properly done #
+         if not config['fast_dev_run'] and trainer.current_epoch == config['max_epochs'] - 1:
+             ckpt_path = ROOT_DIR / "rd.ckpt"
+             trainer.save_checkpoint(str(ckpt_path))
+             artifact = wandb.Artifact(name=config['model'], type="model", metadata=config)
+             artifact.add_file(str(ckpt_path))
+             run.log_artifact(artifact, aliases=["latest", config['ver']])
+             os.remove(str(ckpt_path))  # make sure to remove it after the upload is done
+
+
+ if __name__ == '__main__':
+     main()
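Since save_hyperparameters ignored mlm and wisdom2subwords, re-loading the uploaded checkpoint presumably has to supply those two again as keyword arguments. A sketch of what that could look like; this is an untested assumption, not part of this commit:

from transformers import BertForMaskedLM, BertTokenizer
from idiomify.fetchers import fetch_config, fetch_idioms
from idiomify.models import Alpha
from idiomify import tensors as T

config = fetch_config()["alpha"]["eng2eng"]
mlm = BertForMaskedLM.from_pretrained(config['bert'])
tokenizer = BertTokenizer.from_pretrained(config['bert'])
idioms = fetch_idioms(config['idioms_ver'])
wisdom2subwords = T.wisdom2subwords(idioms, tokenizer, config['k'])
# extra kwargs are forwarded to Alpha.__init__; "rd.ckpt" is the file name used above
rd = Alpha.load_from_checkpoint("rd.ckpt", mlm=mlm, wisdom2subwords=wisdom2subwords)
X = T.inputs(["to avoid someone or something"], tokenizer, config['k'])
print(idioms[rd.P_wisdom(X).argmax(dim=1).item()])  # hopefully: steer clear of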
requirements.txt DELETED
@@ -1,68 +0,0 @@
- absl-py==1.0.0
- aiohttp==3.8.1
- aiosignal==1.2.0
- async-timeout==4.0.2
- attrs==21.4.0
- cachetools==4.2.4
- certifi==2021.10.8
- charset-normalizer==2.0.10
- click==8.0.3
- configparser==5.2.0
- docker-pycreds==0.4.0
- filelock==3.4.2
- frozenlist==1.3.0
- fsspec==2022.1.0
- future==0.18.2
- gitdb==4.0.9
- GitPython==3.1.26
- google-auth==2.3.3
- google-auth-oauthlib==0.4.6
- grpcio==1.43.0
- huggingface-hub==0.4.0
- idna==3.3
- importlib-metadata==4.10.1
- joblib==1.1.0
- Markdown==3.3.6
- multidict==5.2.0
- numpy==1.22.1
- oauthlib==3.1.1
- packaging==21.3
- pathtools==0.1.2
- promise==2.3
- protobuf==3.19.3
- psutil==5.9.0
- pyasn1==0.4.8
- pyasn1-modules==0.2.8
- pyDeprecate==0.3.1
- pyparsing==3.0.6
- python-dateutil==2.8.2
- pytorch-lightning==1.5.8
- PyYAML==6.0
- regex==2022.1.18
- requests==2.27.1
- requests-oauthlib==1.3.0
- rsa==4.8
- sacremoses==0.0.47
- sentry-sdk==1.5.2
- shortuuid==1.0.8
- six==1.16.0
- smmap==5.0.0
- subprocess32==3.5.4
- tensorboard==2.7.0
- tensorboard-data-server==0.6.1
- tensorboard-plugin-wit==1.8.1
- termcolor==1.1.0
- tokenizers==0.10.3
- torch==1.10.1
- torchmetrics==0.7.0
- tqdm==4.62.3
- transformers==4.15.0
- typing_extensions==4.0.1
- urllib3==1.26.8
- wandb==0.12.9
- Werkzeug==2.0.2
- yarl==1.7.2
- yaspin==2.1.0
- zipp==3.7.0
-
- pandas~=1.3.5
wandb/latest-run DELETED
@@ -1 +0,0 @@
- run-20220120_133013-zhqz22ma
wandb/run-20220120_131057-39a70no5/files/conda-environment.yaml DELETED
@@ -1,82 +0,0 @@
- name: idiomify-demo
- channels:
-   - conda-forge
- dependencies:
-   - bzip2=1.0.8=h3422bc3_4
-   - ca-certificates=2021.10.8=h4653dfc_0
-   - libffi=3.4.2=h3422bc3_5
-   - libzlib=1.2.11=hee7b306_1013
-   - ncurses=6.3=hc470f4d_0
-   - openssl=3.0.0=h3422bc3_2
-   - pip=21.3.1=pyhd8ed1ab_0
-   - python=3.9.9=h43b31ca_0_cpython
-   - python_abi=3.9=2_cp39
-   - readline=8.1=hedafd6a_0
-   - setuptools=60.5.0=py39h2804cbe_0
-   - sqlite=3.37.0=h72a2b83_0
-   - tk=8.6.11=he1e0b03_1
-   - tzdata=2021e=he74cb21_0
-   - wheel=0.37.1=pyhd8ed1ab_0
-   - xz=5.2.5=h642e427_1
-   - zlib=1.2.11=hee7b306_1013
-   - pip:
-     - absl-py==1.0.0
-     - aiohttp==3.8.1
-     - aiosignal==1.2.0
-     - async-timeout==4.0.2
-     - attrs==21.4.0
-     - cachetools==4.2.4
-     - certifi==2021.10.8
-     - charset-normalizer==2.0.10
-     - click==8.0.3
-     - configparser==5.2.0
-     - docker-pycreds==0.4.0
-     - frozenlist==1.3.0
-     - fsspec==2022.1.0
-     - future==0.18.2
-     - gitdb==4.0.9
-     - gitpython==3.1.26
-     - google-auth==2.3.3
-     - google-auth-oauthlib==0.4.6
-     - grpcio==1.43.0
-     - idna==3.3
-     - importlib-metadata==4.10.1
-     - markdown==3.3.6
-     - multidict==5.2.0
-     - numpy==1.22.1
-     - oauthlib==3.1.1
-     - packaging==21.3
-     - pathtools==0.1.2
-     - promise==2.3
-     - protobuf==3.19.3
-     - psutil==5.9.0
-     - pyasn1==0.4.8
-     - pyasn1-modules==0.2.8
-     - pydeprecate==0.3.1
-     - pyparsing==3.0.6
-     - python-dateutil==2.8.2
-     - pytorch-lightning==1.5.8
-     - pyyaml==6.0
-     - requests==2.27.1
-     - requests-oauthlib==1.3.0
-     - rsa==4.8
-     - sentry-sdk==1.5.2
-     - shortuuid==1.0.8
-     - six==1.16.0
-     - smmap==5.0.0
-     - subprocess32==3.5.4
-     - tensorboard==2.7.0
-     - tensorboard-data-server==0.6.1
-     - tensorboard-plugin-wit==1.8.1
-     - termcolor==1.1.0
-     - torch==1.10.1
-     - torchmetrics==0.7.0
-     - tqdm==4.62.3
-     - typing-extensions==4.0.1
-     - urllib3==1.26.8
-     - wandb==0.12.9
-     - werkzeug==2.0.2
-     - yarl==1.7.2
-     - yaspin==2.1.0
-     - zipp==3.7.0
- prefix: /opt/homebrew/Caskroom/miniforge/base/envs/idiomify-demo
wandb/run-20220120_131057-39a70no5/files/config.yaml DELETED
@@ -1,21 +0,0 @@
- wandb_version: 1
-
- _wandb:
-   desc: null
-   value:
-     cli_version: 0.12.9
-     is_jupyter_run: false
-     is_kaggle_kernel: false
-     python_version: 3.9.9
-     start_time: 1642651857
-     t:
-       3:
-       - 16
-       4: 3.9.9
-       5: 0.12.9
-       8:
-       - 4
-       - 5
- path:
-   desc: null
-   value: artifacts/wisdom2def_c.tsv
wandb/run-20220120_131057-39a70no5/files/diff.patch DELETED
@@ -1,77 +0,0 @@
- diff --git a/README.md b/README.md
- index f7b5541..167966c 100644
- --- a/README.md
- +++ b/README.md
- @@ -1,2 +1,7 @@
-  # idiomify-demo
-  Cross-lingual reverse dictionary of English idioms
- +
- +
- +## Requirements
- +- wandb
- +- pytorch-lightning
-
- diff --git a/artifacts/wisdom2def_c.tsv b/artifacts/wisdom2def_c.tsv
- new file mode 100644
- index 0000000..324d169
- --- /dev/null
- +++ b/artifacts/wisdom2def_c.tsv
- @@ -0,0 +1,25 @@
- +beat around the bush To fail to come to the important point about something
- +beat around the bush To speak vaguely or euphemistically so as to avoid talkingdirectly about an unpleasant or sensitive topic
- +beat around the bush Indirection in word or deed
- +beat around the bush to shilly-shally
- +beat around the bush to approach something in a roundabout way
- +backhanded compliment An insulting or negative comment disguised as praise.
- +backhanded compliment an unintended or ambiguous compliment.
- +backhanded compliment a remark which seems to be praising someone or something but which could also be understood as criticism
- +backhanded compliment a remark that seems to say something pleasant about a person but could also be an insult
- +backhanded compliment a remark that seems to express admiration but could also be understood as an insult
- +steer clear of To avoid someone or something.
- +steer clear of Stay away from
- +steer clear of take care to avoid or keep away from
- +steer clear of to avoid someone or something that seems unpleasant, dangerous, or likely to cause problems
- +steer clear of deliberately avoid someone
- +dish it out To voice harsh thoughts, criticisms, or insults.
- +dish it out To gossip about someone or something
- +dish it out To give something, or to tell something such as information or your opinions
- +dish it out someone easily criticizes other people but does not like it when other people criticize him or her
- +dish it out to criticize other people
- +make headway make progress with something that you are trying to achieve.
- +make headway make progress, especially when this is slow or difficult
- +make headway To advance.
- +make headway to move forward or make progress
- +make headway to begin to succeed
-
- diff --git a/artifacts/wisdom2def_d.tsv b/artifacts/wisdom2def_d.tsv
- new file mode 100644
- index 0000000..74549d8
- --- /dev/null
- +++ b/artifacts/wisdom2def_d.tsv
- @@ -0,0 +1,25 @@
- +beat around the bush 어떤 것에 대해 중요한 요점을 찾지 못하는 것
- +beat around the bush 불쾌하거나 민감한 주제에 대해 직접적으로 이야기하는 것을 피하기 위해 모호하거나 완곡하게 말한다.
- +beat around the bush 단어나 태도가 우회적이다
- +beat around the bush 우물쭈물하다
- +beat around the bush 우회적으로 접근하다
- +backhanded compliment 칭찬으로 가장한 모욕적이거나 부정적인 논평
- +backhanded compliment 의도하지 않거나 애매한 칭찬
- +backhanded compliment 누군가를 칭찬하는 것 같지만 비판으로도 이해될 수 있는 말
- +backhanded compliment 남을 기쁘게 하는 말 같지만 모욕이 될 수도 있는 말
- +backhanded compliment 감탄하는 듯 하면서도 모욕으로 이해될 수 있는 말
- +steer clear of 누군가나 뭔가를 피하다
- +steer clear of 떨어져 지내다
- +steer clear of 피하거나 멀리하도록 주의하다
- +steer clear of 불쾌하거나 위험하거나 문제를 일으킬 것 같은 사람이나 물건을 피하다
- +steer clear of 일부러 피하다
- +dish it out 가혹한 생각, 비판, 또는 모욕의 목소리를 내는 것.
- +dish it out 누군가 또는 무언가에 대해 험담하는 것
- +dish it out 어떤 것을 주거나 정보나 당신의 의견과 같은 것을 말하는 것
- +dish it out 다른 사람을 쉽게 비판하지만 다른 사람이 자신을 비판할때는 좋아하지 않음
- +dish it out 다른 사람을 비판하다
- +make headway 성취하고자 하는 어떤 것에 진척이 생기다
- +make headway 특히 이것이 느리거나 어려울 때, 진전을 이루다.
- +make headway 전진하다
- +make headway 앞으로 나아가거나 진전을 이루다
- +make headway 성공하기 시작하다
-
wandb/run-20220120_131057-39a70no5/files/requirements.txt DELETED
@@ -1,62 +0,0 @@
- absl-py==1.0.0
- aiohttp==3.8.1
- aiosignal==1.2.0
- async-timeout==4.0.2
- attrs==21.4.0
- cachetools==4.2.4
- certifi==2021.10.8
- charset-normalizer==2.0.10
- click==8.0.3
- configparser==5.2.0
- docker-pycreds==0.4.0
- frozenlist==1.3.0
- fsspec==2022.1.0
- future==0.18.2
- gitdb==4.0.9
- gitpython==3.1.26
- google-auth-oauthlib==0.4.6
- google-auth==2.3.3
- grpcio==1.43.0
- idna==3.3
- importlib-metadata==4.10.1
- markdown==3.3.6
- multidict==5.2.0
- numpy==1.22.1
- oauthlib==3.1.1
- packaging==21.3
- pathtools==0.1.2
- pip==21.3.1
- promise==2.3
- protobuf==3.19.3
- psutil==5.9.0
- pyasn1-modules==0.2.8
- pyasn1==0.4.8
- pydeprecate==0.3.1
- pyparsing==3.0.6
- python-dateutil==2.8.2
- pytorch-lightning==1.5.8
- pyyaml==6.0
- requests-oauthlib==1.3.0
- requests==2.27.1
- rsa==4.8
- sentry-sdk==1.5.2
- setuptools==60.5.0
- shortuuid==1.0.8
- six==1.16.0
- smmap==5.0.0
- subprocess32==3.5.4
- tensorboard-data-server==0.6.1
- tensorboard-plugin-wit==1.8.1
- tensorboard==2.7.0
- termcolor==1.1.0
- torch==1.10.1
- torchmetrics==0.7.0
- tqdm==4.62.3
- typing-extensions==4.0.1
- urllib3==1.26.8
- wandb==0.12.9
- werkzeug==2.0.2
- wheel==0.37.1
- yarl==1.7.2
- yaspin==2.1.0
- zipp==3.7.0
wandb/run-20220120_131057-39a70no5/files/wandb-metadata.json DELETED
@@ -1,31 +0,0 @@
- {
-     "os": "macOS-12.1-arm64-arm-64bit",
-     "python": "3.9.9",
-     "heartbeatAt": "2022-01-20T04:10:57.955060",
-     "startedAt": "2022-01-20T04:10:57.174003",
-     "docker": null,
-     "cpu_count": 8,
-     "cuda": null,
-     "args": [
-         "artifact",
-         "put",
-         "artifacts/wisdom2def_c.tsv",
-         "-n",
-         "wisdom2def",
-         "-t",
-         "dataset",
-         "-a",
-         "c"
-     ],
-     "state": "running",
-     "program": "/opt/homebrew/Caskroom/miniforge/base/envs/idiomify-demo/bin/wandb",
-     "git": {
-         "remote": "https://github.com/eubinecto/idiomify-demo.git",
-         "commit": "db5933850fd03c3e44c527c7aa110880a26d8499"
-     },
-     "email": "eubinecto",
-     "root": "/Users/eubinecto/Desktop/Projects/Toy/idiomify-demo",
-     "host": "Eu-Bins-MacBook-Air.local",
-     "username": "eubinecto",
-     "executable": "/opt/homebrew/Caskroom/miniforge/base/envs/idiomify-demo/bin/python3.9"
- }
wandb/run-20220120_131057-39a70no5/files/wandb-summary.json DELETED
@@ -1 +0,0 @@
- {"_wandb": {"runtime": 4}}
wandb/run-20220120_131057-39a70no5/run-39a70no5.wandb DELETED
Binary file (991 Bytes)
 
wandb/run-20220120_131124-isjyx9fs/files/conda-environment.yaml DELETED
@@ -1,82 +0,0 @@
- name: idiomify-demo
- channels:
-   - conda-forge
- dependencies:
-   - bzip2=1.0.8=h3422bc3_4
-   - ca-certificates=2021.10.8=h4653dfc_0
-   - libffi=3.4.2=h3422bc3_5
-   - libzlib=1.2.11=hee7b306_1013
-   - ncurses=6.3=hc470f4d_0
-   - openssl=3.0.0=h3422bc3_2
-   - pip=21.3.1=pyhd8ed1ab_0
-   - python=3.9.9=h43b31ca_0_cpython
-   - python_abi=3.9=2_cp39
-   - readline=8.1=hedafd6a_0
-   - setuptools=60.5.0=py39h2804cbe_0
-   - sqlite=3.37.0=h72a2b83_0
-   - tk=8.6.11=he1e0b03_1
-   - tzdata=2021e=he74cb21_0
-   - wheel=0.37.1=pyhd8ed1ab_0
-   - xz=5.2.5=h642e427_1
-   - zlib=1.2.11=hee7b306_1013
-   - pip:
-     - absl-py==1.0.0
-     - aiohttp==3.8.1
-     - aiosignal==1.2.0
-     - async-timeout==4.0.2
-     - attrs==21.4.0
-     - cachetools==4.2.4
-     - certifi==2021.10.8
-     - charset-normalizer==2.0.10
-     - click==8.0.3
-     - configparser==5.2.0
-     - docker-pycreds==0.4.0
-     - frozenlist==1.3.0
-     - fsspec==2022.1.0
-     - future==0.18.2
-     - gitdb==4.0.9
-     - gitpython==3.1.26
-     - google-auth==2.3.3
-     - google-auth-oauthlib==0.4.6
-     - grpcio==1.43.0
-     - idna==3.3
-     - importlib-metadata==4.10.1
-     - markdown==3.3.6
-     - multidict==5.2.0
-     - numpy==1.22.1
-     - oauthlib==3.1.1
-     - packaging==21.3
-     - pathtools==0.1.2
-     - promise==2.3
-     - protobuf==3.19.3
-     - psutil==5.9.0
-     - pyasn1==0.4.8
-     - pyasn1-modules==0.2.8
-     - pydeprecate==0.3.1
-     - pyparsing==3.0.6
-     - python-dateutil==2.8.2
-     - pytorch-lightning==1.5.8
-     - pyyaml==6.0
-     - requests==2.27.1
-     - requests-oauthlib==1.3.0
-     - rsa==4.8
-     - sentry-sdk==1.5.2
-     - shortuuid==1.0.8
-     - six==1.16.0
-     - smmap==5.0.0
-     - subprocess32==3.5.4
-     - tensorboard==2.7.0
-     - tensorboard-data-server==0.6.1
-     - tensorboard-plugin-wit==1.8.1
-     - termcolor==1.1.0
-     - torch==1.10.1
-     - torchmetrics==0.7.0
-     - tqdm==4.62.3
-     - typing-extensions==4.0.1
-     - urllib3==1.26.8
-     - wandb==0.12.9
-     - werkzeug==2.0.2
-     - yarl==1.7.2
-     - yaspin==2.1.0
-     - zipp==3.7.0
- prefix: /opt/homebrew/Caskroom/miniforge/base/envs/idiomify-demo
wandb/run-20220120_131124-isjyx9fs/files/config.yaml DELETED
@@ -1,21 +0,0 @@
- wandb_version: 1
-
- _wandb:
-   desc: null
-   value:
-     cli_version: 0.12.9
-     is_jupyter_run: false
-     is_kaggle_kernel: false
-     python_version: 3.9.9
-     start_time: 1642651884
-     t:
-       3:
-       - 16
-       4: 3.9.9
-       5: 0.12.9
-       8:
-       - 4
-       - 5
- path:
-   desc: null
-   value: artifacts/wisdom2def_d.tsv
wandb/run-20220120_131124-isjyx9fs/files/diff.patch DELETED
@@ -1,77 +0,0 @@
- diff --git a/README.md b/README.md
- index f7b5541..167966c 100644
- --- a/README.md
- +++ b/README.md
- @@ -1,2 +1,7 @@
-  # idiomify-demo
-  Cross-lingual reverse dictionary of English idioms
- +
- +
- +## Requirements
- +- wandb
- +- pytorch-lightning
-
- diff --git a/artifacts/wisdom2def_c.tsv b/artifacts/wisdom2def_c.tsv
- new file mode 100644
- index 0000000..324d169
- --- /dev/null
- +++ b/artifacts/wisdom2def_c.tsv
- @@ -0,0 +1,25 @@
- +beat around the bush To fail to come to the important point about something
- +beat around the bush To speak vaguely or euphemistically so as to avoid talkingdirectly about an unpleasant or sensitive topic
- +beat around the bush Indirection in word or deed
- +beat around the bush to shilly-shally
- +beat around the bush to approach something in a roundabout way
- +backhanded compliment An insulting or negative comment disguised as praise.
- +backhanded compliment an unintended or ambiguous compliment.
- +backhanded compliment a remark which seems to be praising someone or something but which could also be understood as criticism
- +backhanded compliment a remark that seems to say something pleasant about a person but could also be an insult
- +backhanded compliment a remark that seems to express admiration but could also be understood as an insult
- +steer clear of To avoid someone or something.
- +steer clear of Stay away from
- +steer clear of take care to avoid or keep away from
- +steer clear of to avoid someone or something that seems unpleasant, dangerous, or likely to cause problems
- +steer clear of deliberately avoid someone
- +dish it out To voice harsh thoughts, criticisms, or insults.
- +dish it out To gossip about someone or something
- +dish it out To give something, or to tell something such as information or your opinions
- +dish it out someone easily criticizes other people but does not like it when other people criticize him or her
- +dish it out to criticize other people
- +make headway make progress with something that you are trying to achieve.
- +make headway make progress, especially when this is slow or difficult
- +make headway To advance.
- +make headway to move forward or make progress
- +make headway to begin to succeed
-
- diff --git a/artifacts/wisdom2def_d.tsv b/artifacts/wisdom2def_d.tsv
- new file mode 100644
- index 0000000..74549d8
- --- /dev/null
- +++ b/artifacts/wisdom2def_d.tsv
- @@ -0,0 +1,25 @@
- +beat around the bush 어떤 것에 대해 중요한 요점을 찾지 못하는 것
- +beat around the bush 불쾌하거나 민감한 주제에 대해 직접적으로 이야기하는 것을 피하기 위해 모호하거나 완곡하게 말한다.
- +beat around the bush 단어나 태도가 우회적이다
- +beat around the bush 우물쭈물하다
- +beat around the bush 우회적으로 접근하다
- +backhanded compliment 칭찬으로 가장한 모욕적이거나 부정적인 논평
- +backhanded compliment 의도하지 않거나 애매한 칭찬
- +backhanded compliment 누군가를 칭찬하는 것 같지만 비판으로도 이해될 수 있는 말
- +backhanded compliment 남을 기쁘게 하는 말 같지만 모욕이 될 수도 있는 말
- +backhanded compliment 감탄하는 듯 하면서도 모욕으로 이해될 수 있는 말
- +steer clear of 누군가나 뭔가를 피하다
- +steer clear of 떨어져 지내다
- +steer clear of 피하거나 멀리하도록 주의하다
- +steer clear of 불쾌하거나 위험하거나 문제를 일으킬 것 같은 사람이나 물건을 피하다
- +steer clear of 일부러 피하다
- +dish it out 가혹한 생각, 비판, 또는 모욕의 목소리를 내는 것.
- +dish it out 누군가 또는 무언가에 대해 험담하는 것
- +dish it out 어떤 것을 주거나 정보나 당신의 의견과 같은 것을 말하는 것
- +dish it out 다른 사람을 쉽게 비판하지만 다른 사람이 자신을 비판할때는 좋아하지 않음
- +dish it out 다른 사람을 비판하다
- +make headway 성취하고자 하는 어떤 것에 진척이 생기다
- +make headway 특히 이것이 느리거나 어려울 때, 진전을 이루다.
- +make headway 전진하다
- +make headway 앞으로 나아가거나 진전을 이루다
- +make headway 성공하기 시작하다
-
wandb/run-20220120_131124-isjyx9fs/files/requirements.txt DELETED
@@ -1,62 +0,0 @@
- absl-py==1.0.0
- aiohttp==3.8.1
- aiosignal==1.2.0
- async-timeout==4.0.2
- attrs==21.4.0
- cachetools==4.2.4
- certifi==2021.10.8
- charset-normalizer==2.0.10
- click==8.0.3
- configparser==5.2.0
- docker-pycreds==0.4.0
- frozenlist==1.3.0
- fsspec==2022.1.0
- future==0.18.2
- gitdb==4.0.9
- gitpython==3.1.26
- google-auth-oauthlib==0.4.6
- google-auth==2.3.3
- grpcio==1.43.0
- idna==3.3
- importlib-metadata==4.10.1
- markdown==3.3.6
- multidict==5.2.0
- numpy==1.22.1
- oauthlib==3.1.1
- packaging==21.3
- pathtools==0.1.2
- pip==21.3.1
- promise==2.3
- protobuf==3.19.3
- psutil==5.9.0
- pyasn1-modules==0.2.8
- pyasn1==0.4.8
- pydeprecate==0.3.1
- pyparsing==3.0.6
- python-dateutil==2.8.2
- pytorch-lightning==1.5.8
- pyyaml==6.0
- requests-oauthlib==1.3.0
- requests==2.27.1
- rsa==4.8
- sentry-sdk==1.5.2
- setuptools==60.5.0
- shortuuid==1.0.8
- six==1.16.0
- smmap==5.0.0
- subprocess32==3.5.4
- tensorboard-data-server==0.6.1
- tensorboard-plugin-wit==1.8.1
- tensorboard==2.7.0
- termcolor==1.1.0
- torch==1.10.1
- torchmetrics==0.7.0
- tqdm==4.62.3
- typing-extensions==4.0.1
- urllib3==1.26.8
- wandb==0.12.9
- werkzeug==2.0.2
- wheel==0.37.1
- yarl==1.7.2
- yaspin==2.1.0
- zipp==3.7.0
wandb/run-20220120_131124-isjyx9fs/files/wandb-metadata.json DELETED
@@ -1,31 +0,0 @@
- {
-     "os": "macOS-12.1-arm64-arm-64bit",
-     "python": "3.9.9",
-     "heartbeatAt": "2022-01-20T04:11:25.393449",
-     "startedAt": "2022-01-20T04:11:24.663767",
-     "docker": null,
-     "cpu_count": 8,
-     "cuda": null,
-     "args": [
-         "artifact",
-         "put",
-         "artifacts/wisdom2def_d.tsv",
-         "-n",
-         "wisdom2def",
-         "-t",
-         "dataset",
-         "-a",
-         "d"
-     ],
-     "state": "running",
-     "program": "/opt/homebrew/Caskroom/miniforge/base/envs/idiomify-demo/bin/wandb",
-     "git": {
-         "remote": "https://github.com/eubinecto/idiomify-demo.git",
-         "commit": "db5933850fd03c3e44c527c7aa110880a26d8499"
-     },
-     "email": "eubinecto",
-     "root": "/Users/eubinecto/Desktop/Projects/Toy/idiomify-demo",
-     "host": "Eu-Bins-MacBook-Air.local",
-     "username": "eubinecto",
-     "executable": "/opt/homebrew/Caskroom/miniforge/base/envs/idiomify-demo/bin/python3.9"
- }
wandb/run-20220120_131124-isjyx9fs/files/wandb-summary.json DELETED
@@ -1 +0,0 @@
- {"_wandb": {"runtime": 4}}
wandb/run-20220120_131124-isjyx9fs/run-isjyx9fs.wandb DELETED
Binary file (990 Bytes)