T-Almeida committed on
Commit f4c1f6e
1 Parent(s): 2552da7

Upload model

README.md ADDED
@@ -0,0 +1,201 @@
---
library_name: transformers
tags: []
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->


## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** [More Information Needed]
- **Finetuned from model [optional]:** [More Information Needed]

### Model Sources [optional]

<!-- Provide the basic links for the model. -->

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

[More Information Needed]

### Downstream Use [optional]

<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->

[More Information Needed]

### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

[More Information Needed]

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

### Recommendations

<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

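No official snippet is provided yet; the following is a minimal, untested sketch of how a checkpoint with this repository's `auto_map` (see `config.json` below) would normally be loaded. The repository id is a placeholder, and the base tokenizer is assumed to be the `_name_or_path` model listed in the config.

```python
from transformers import AutoConfig, AutoModel, AutoTokenizer

repo_id = "<user>/<this-repository>"  # placeholder: replace with the actual Hub repo id

# The custom BioNExtExtractorConfig/Model classes ship with the repository,
# so remote code must be trusted for the auto_map entries to resolve.
config = AutoConfig.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)

# config.json declares extra entity-marker tokens ([s1], [e1], [s2], [e2]);
# the tokenizer below is assumed from the config's _name_or_path.
tokenizer = AutoTokenizer.from_pretrained("michiyasunaga/BioLinkBERT-large")
tokenizer.add_tokens(config.tokenizer_special_tokens)
```
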
## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

[More Information Needed]

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

#### Preprocessing [optional]

[More Information Needed]


#### Training Hyperparameters

- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->

#### Speeds, Sizes, Times [optional]

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Testing Data, Factors & Metrics

#### Testing Data

<!-- This should link to a Dataset Card if possible. -->

[More Information Needed]

#### Factors

<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary


## Model Examination [optional]

<!-- Relevant interpretability work for the model goes here -->

[More Information Needed]

## Environmental Impact

<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** [More Information Needed]
- **Hours used:** [More Information Needed]
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]

config.json ADDED
@@ -0,0 +1,65 @@
{
  "_name_or_path": "michiyasunaga/BioLinkBERT-large",
  "arch_type": "mha",
  "architectures": [
    "BioNExtExtractorModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "auto_map": {
    "AutoConfig": "configuration_bionextextractor.BioNExtExtractorConfig",
    "AutoModel": "modeling_bionextextractor.BioNExtExtractorModel"
  },
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
  "id2label": {
    "0": "Association",
    "1": "Positive_Correlation",
    "2": "Negative_Correlation",
    "3": "Cotreatment",
    "4": "Bind",
    "5": "Comparison",
    "6": "Conversion",
    "7": "Drug_Interaction",
    "8": "Negative_Class"
  },
  "index_type": "both",
  "initializer_range": 0.02,
  "intermediate_size": 4096,
  "label2id": {
    "Association": 0,
    "Bind": 4,
    "Comparison": 5,
    "Conversion": 6,
    "Cotreatment": 3,
    "Drug_Interaction": 7,
    "Negative_Class": 8,
    "Negative_Correlation": 2,
    "Positive_Correlation": 1
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "relation-novelty-extractor",
  "novel": true,
  "num_attention_heads": 16,
  "num_hidden_layers": 24,
  "num_lstm_layers": 1,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "resize_embeddings": true,
  "tokenizer_special_tokens": [
    "[s1]",
    "[e1]",
    "[s2]",
    "[e2]"
  ],
  "torch_dtype": "float32",
  "transformers_version": "4.37.2",
  "type_vocab_size": 2,
  "update_vocab": 28899,
  "use_cache": true,
  "version": "0.1.0",
  "vocab_size": 28899
}
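A side note on how these fields are consumed by the modeling code further down: `arch_type` and `novel` jointly select the classification head from `ARCH_MAPPING`, while `id2label` names the nine relation classes that head predicts. A tiny illustrative snippet (not part of the repository) of that dispatch:

```python
# Mirrors the dispatch in modeling_bionextextractor.BioNExtExtractorModel.__init__
arch_type = "mha"   # from config.json
novel = True        # from config.json

arch_key = f"{arch_type}wNovelty" if novel else arch_type
print(arch_key)     # "mhawNovelty" -> RelationAndNovelClassifierMHAttention
```
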
configuration_bionextextractor.py ADDED
@@ -0,0 +1,24 @@
from transformers import PretrainedConfig


class BioNExtExtractorConfig(PretrainedConfig):
    model_type = "relation-novelty-extractor"

    def __init__(
        self,
        arch_type="mha",
        index_type="both",
        novel=True,
        version="0.1.0",
        **kwargs,
    ):
        self.version = version
        self.arch_type = arch_type
        self.index_type = index_type
        self.novel = novel
        super().__init__(**kwargs)
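For illustration only (not part of the upload), this is how the class above would be instantiated directly: the custom fields are consumed by `__init__`, while everything else stored in `config.json` (`hidden_size`, `num_attention_heads`, `id2label`, ...) flows through `**kwargs` into `PretrainedConfig`.

```python
from configuration_bionextextractor import BioNExtExtractorConfig

config = BioNExtExtractorConfig(
    arch_type="mha",        # multi-head-attention classifier head
    index_type="both",
    novel=True,             # also predict novelty
    hidden_size=1024,
    num_attention_heads=16,
    num_labels=9,           # nine relation classes, see id2label in config.json
)
print(config.model_type)    # "relation-novelty-extractor"
```
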
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:5614db1403a5339630fef087a18cc985693d4b7188cf6c81aa9f15eff71fe520
size 1350790260
modeling_bionextextractor.py ADDED
@@ -0,0 +1,212 @@
import math

import torch

from transformers import BertModel, PreTrainedModel
from transformers.modeling_outputs import SequenceClassifierOutput

from .configuration_bionextextractor import BioNExtExtractorConfig


class RelationLossMixin:

    def model_loss(self, logits, labels, novel=None, reduction=None):
        if reduction is None:
            return torch.nn.functional.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1))
        else:
            return torch.nn.functional.cross_entropy(logits.view(-1, self.num_labels), labels.view(-1), reduction=reduction)


class RelationAndNovelLossMixin(RelationLossMixin):

    def model_loss(self, logits, labels, novel=None):
        relation_logits, novel_logits = logits
        relation_loss = super().model_loss(relation_logits, labels, reduction="none")
        novel_loss = torch.nn.functional.cross_entropy(novel_logits.view(-1, 2), novel.view(-1), reduction="none")
        # The novelty loss only contributes for samples that express a relation (label 8 is the negative class).
        per_sample_loss = relation_loss + (labels != 8).type(logits[0].dtype) * novel_loss

        return per_sample_loss.mean()


class RelationClassifierBase(PreTrainedModel, RelationLossMixin):
    config_class = BioNExtExtractorConfig

    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels

        self.bert = BertModel(config, add_pooling_layer=False)

    def group_embeddings_by_index(self, embeddings, indexes):
        assert len(embeddings.shape) == 3

        batch_size = indexes.shape[0]
        max_tokens = embeddings.shape[1]
        emb_size = embeddings.shape[2]
        # mask padding positions (index -1)
        mask_index = indexes != -1

        # convert indexes to a 1D list of valid positions (ignoring padding)
        indexes = indexes + mask_index * (torch.arange(batch_size).to(self.device) * max_tokens).view(batch_size, 1, 1)
        indexes = indexes.masked_select(mask_index)

        # flatten the batch so the 1D indexes can be used directly
        embeddings = embeddings.view(batch_size * max_tokens, emb_size)

        # gather the embeddings by index
        selected_embeddings_by_index = torch.index_select(embeddings, 0, indexes)

        final_output_shape = (mask_index.shape[0], mask_index.shape[1], emb_size)
        group_embeddings = torch.zeros(final_output_shape, dtype=embeddings.dtype).to(self.device).masked_scatter(mask_index, selected_embeddings_by_index)

        return group_embeddings, mask_index

    def classifier_representation(self, embeddings, mask=None):
        raise NotImplementedError("This is a base class; please extend it and implement classifier_representation")

    def classifier(self, class_representation, relation_mask=None):
        raise NotImplementedError("This is a base class; please extend it and implement classifier")

    def forward(self,
                input_ids,
                indexes=None,
                novel=None,
                labels=None,
                mask=None,
                return_dict=None,
                **model_kwargs
                ):
        # Default `model.config.use_return_dict` is `True`
        return_dict = return_dict if return_dict is not None else self.config.use_return_dict

        outputs = self.bert(input_ids, return_dict=return_dict, **model_kwargs)

        assert indexes is not None

        embeddings = outputs.last_hidden_state

        selected_embeddings, mask_group = self.group_embeddings_by_index(embeddings, indexes)

        class_representation = self.classifier_representation(selected_embeddings, mask_group)

        logits = self.classifier(class_representation, relation_mask=mask)

        loss = None
        if labels is not None:
            loss = self.model_loss(logits, labels, novel)

        return SequenceClassifierOutput(
            loss=loss,
            logits=logits,
            hidden_states=outputs.hidden_states,
            attentions=outputs.attentions,
        )


class RelationClassifierBiLSTM(RelationClassifierBase):

    def __init__(self, config):
        super().__init__(config)
        self.num_lstm_layers = config.num_lstm_layers
        # hidden_size // 2 per direction, so the bidirectional output matches hidden_size
        self.lstm = torch.nn.LSTM(config.hidden_size, config.hidden_size // 2, self.num_lstm_layers, batch_first=True, bidirectional=True)
        self.fc = torch.nn.Linear(config.hidden_size, self.num_labels)

    def classifier_representation(self, embeddings, mask=None):
        out, _ = self.lstm(embeddings)
        return out[:, -1, :]

    def classifier(self, class_representation, relation_mask=None):
        return self.fc(class_representation)


class RelationAndNovelClassifierBiLSTM(RelationClassifierBiLSTM, RelationAndNovelLossMixin):

    def __init__(self, config):
        super().__init__(config)
        self.fc_novel = torch.nn.Linear(config.hidden_size, 2)

    def classifier(self, class_representation, relation_mask=None):
        return super().classifier(class_representation), self.fc_novel(class_representation)


class RelationClassifierMHAttention(RelationClassifierBase):

    def __init__(self, config):
        super().__init__(config)

        # learnable query vector that attends over the selected entity embeddings
        self.weight = torch.nn.Parameter(torch.Tensor(1, 1, config.hidden_size))
        torch.nn.init.kaiming_uniform_(self.weight, a=math.sqrt(5))

        self.MHattention_layer = torch.nn.MultiheadAttention(config.hidden_size, config.num_attention_heads, batch_first=True)
        self.fc1 = torch.nn.Linear(config.hidden_size, config.hidden_size // 2)
        self.fc1_activation = torch.nn.GELU(approximate='none')
        self.fc2 = torch.nn.Linear(config.hidden_size // 2, self.num_labels)

    def classifier_representation(self, embeddings, mask=None):
        batch_size = embeddings.shape[0]
        weight = self.weight.repeat(batch_size, 1, 1)

        if mask is not None:
            # flip: key_padding_mask expects True for positions that should be ignored
            mask = mask.squeeze(-1) == False

        out_tensors, _ = self.MHattention_layer(weight, embeddings, embeddings, key_padding_mask=mask)

        return out_tensors

    def classifier(self, class_representation, relation_mask=None):

        x = self.fc1(class_representation)
        x = self.fc1_activation(x)
        logits = self.fc2(x)
        if relation_mask is not None:
            logits = logits + relation_mask.view(-1, 1, self.num_labels)
        return logits


class RelationAndNovelClassifierMHAttention(RelationClassifierMHAttention, RelationAndNovelLossMixin):

    def __init__(self, config):
        super().__init__(config)

        self.fc1_novel = torch.nn.Linear(config.hidden_size, config.hidden_size // 2)
        self.fc1_novel_activation = torch.nn.GELU(approximate='none')
        self.fc2_novel = torch.nn.Linear(config.hidden_size // 2, 2)

    def classifier(self, class_representation, relation_mask=None):
        x = self.fc1_novel(class_representation)
        x = self.fc1_novel_activation(x)

        return super().classifier(class_representation, relation_mask=relation_mask), self.fc2_novel(x)


ARCH_MAPPING = {"mhawNovelty": RelationAndNovelClassifierMHAttention,
                "mha": RelationClassifierMHAttention,
                "bilstmwNovelty": RelationAndNovelClassifierBiLSTM,
                "bilstm": RelationClassifierBiLSTM}


class BioNExtExtractorModel(PreTrainedModel):
    config_class = BioNExtExtractorConfig

    def __init__(self, config):
        super().__init__(config)

        if config.novel:
            self.model = ARCH_MAPPING[f"{config.arch_type}wNovelty"](config)
        else:
            self.model = ARCH_MAPPING[config.arch_type](config)

    def forward(self, *args, **kwargs):
        return self.model(*args, **kwargs)
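To complement the loading sketch in the README section above, here is an untested sketch of a full forward pass. The shape and semantics of `indexes` are inferred from `group_embeddings_by_index` (positions of the entity-marker tokens, with -1 as padding), and the marker positions below are placeholder values rather than ones computed from the tokenizer.

```python
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "<user>/<this-repository>"  # placeholder Hub id
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("michiyasunaga/BioLinkBERT-large")
tokenizer.add_tokens(model.config.tokenizer_special_tokens)

text = "[s1] naloxone [e1] reverses the effect of [s2] clonidine [e2] ."
enc = tokenizer(text, return_tensors="pt")

# Token positions whose embeddings are grouped by group_embeddings_by_index,
# shape (batch, n_positions, 1); -1 marks padding. The values here are illustrative only.
indexes = torch.tensor([[[1], [3], [9], [11]]])

with torch.no_grad():
    out = model(input_ids=enc["input_ids"], indexes=indexes)

# With config.novel = True, `logits` is a (relation_logits, novelty_logits) pair.
relation_logits, novelty_logits = out.logits
relation = model.config.id2label[relation_logits.squeeze(1).argmax(-1).item()]
is_novel = bool(novelty_logits.squeeze(1).argmax(-1).item())
print(relation, is_novel)
```
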