ccdv/lsg-pegasus-large-4096

text2text-generation


ccdv committed on Dec 17, 2023

Commit 2aa24de (1 parent: 2042dbf)

small fix

Files changed (2)

README.md +1 -1
modeling_lsg_pegasus.py +7 -1

README.md CHANGED

@@ -9,7 +9,7 @@ pipeline_tag: fill-mask
 ---
 # LSG model
-**Transformers >= 4.35.2**\
+**Transformers >= 4.36.1**\
 **This model relies on a custom modeling file, you need to add trust_remote_code=True**\
 **See [\#13467](https://github.com/huggingface/transformers/pull/13467)**
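
The README change raises the minimum Transformers version and repeats that `trust_remote_code=True` is required. A minimal sketch of how a caller might check the version before loading follows. The helper names `version_tuple`, `meets_minimum`, and `load_lsg_pegasus` are hypothetical, and the choice of `AutoModelForSeq2SeqLM` is an assumption based on the model's text2text-generation tag, not something this commit states.

```python
import re

MIN_TRANSFORMERS = "4.36.1"  # minimum stated in the updated README


def version_tuple(version: str) -> tuple:
    """Turn '4.36.1' or '4.37.0.dev0' into a tuple of its leading integer parts."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric segment such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def load_lsg_pegasus(model_id: str = "ccdv/lsg-pegasus-large-4096"):
    """Hypothetical helper: load the tokenizer and model with the custom modeling file."""
    import transformers
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    if not meets_minimum(transformers.__version__):
        raise RuntimeError(
            f"transformers >= {MIN_TRANSFORMERS} is required, found {transformers.__version__}"
        )
    # trust_remote_code=True lets Transformers run modeling_lsg_pegasus.py from the repo.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id, trust_remote_code=True)
    return tokenizer, model
```

Calling `load_lsg_pegasus()` downloads the checkpoint, so the version check alone can be used where network access is not wanted.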

modeling_lsg_pegasus.py CHANGED

@@ -972,6 +972,12 @@ class LSGPegasusModel(LSGPegasusPreTrainedModel, PegasusModel):
         self.encoder = LSGPegasusEncoder(config, self.shared)
         self.decoder = PegasusDecoder(config, self.shared)
+        self._use_flash_attention_2 = config._attn_implementation == "flash_attention_2"
+        if self._use_flash_attention_2:
+            logger.warning(
+                    "[WARNING flash-attention]: LSG doesnt support flash-attention currently"
+                )
         # Initialize weights and apply final processing
         self.post_init()
@@ -1122,4 +1128,4 @@ try:
         str_to_class(value.split(".")[-1]).register_for_auto_class(key)
 except:
     warn("AutoRegister isn't available, you'll have to manually copy modeling.py after .save_pretrained(...).")
-    warn("Update to transformers >= 4.35.2 to fix.")
+    warn("Update to transformers >= 4.36.1 to fix.")
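
The flash-attention check added in the first hunk can be exercised on its own, without loading the model. The sketch below mirrors its logic under stated assumptions: `flash_attention_requested` is a hypothetical function, the configs are plain `SimpleNamespace` stand-ins rather than real Pegasus configs, and `getattr` with a default is used here (the commit reads `config._attn_implementation` directly).

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("lsg_sketch")  # stand-in for the module-level logger


def flash_attention_requested(config) -> bool:
    """Mirror the added check: warn when flash_attention_2 is requested."""
    # Assumption: fall back to None when the attribute is missing;
    # the commit accesses config._attn_implementation without a default.
    use_flash = getattr(config, "_attn_implementation", None) == "flash_attention_2"
    if use_flash:
        logger.warning(
            "[WARNING flash-attention]: LSG doesn't support flash-attention currently"
        )
    return use_flash


# Stand-in configs for illustration only.
eager_cfg = SimpleNamespace(_attn_implementation="eager")
flash_cfg = SimpleNamespace(_attn_implementation="flash_attention_2")
```

As in the commit, the check only logs a warning and records the flag. It does not raise an error or change the requested attention implementation.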