yurakuratov committed
Commit 3a9028d
Parent: c7c93cc

docs: update usage example

Files changed (1): README.md (+30 −6)
README.md CHANGED
@@ -45,21 +45,45 @@ pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp
 
 ## Examples
 
-### Load pre-trained model
+### How to load the pre-trained model for Masked Language Modeling
 ```python
-from transformers import AutoTokenizer, BigBirdForMaskedLM
+from transformers import AutoTokenizer, AutoModel
 
 tokenizer = AutoTokenizer.from_pretrained('AIRI-Institute/gena-lm-bigbird-base-sparse')
-model = BigBirdForMaskedLM.from_pretrained('AIRI-Institute/gena-lm-bigbird-base-sparse')
+model = AutoModel.from_pretrained('AIRI-Institute/gena-lm-bigbird-base-sparse', trust_remote_code=True)
+
 ```
 
-### How to load the model to fine-tune it on classification task
+### How to load the pre-trained model to fine-tune it on a classification task
+Get the model class from the GENA-LM repository:
+```bash
+git clone https://github.com/AIRI-Institute/GENA_LM.git
+```
+
 ```python
-from transformers import AutoTokenizer, BigBirdForSequenceClassification
+from GENA_LM.src.gena_lm.modeling_bert import BertForSequenceClassification
+from transformers import AutoTokenizer
 
 tokenizer = AutoTokenizer.from_pretrained('AIRI-Institute/gena-lm-bigbird-base-sparse')
-model = BigBirdForSequenceClassification.from_pretrained('AIRI-Institute/gena-lm-bigbird-base-sparse')
+model = BertForSequenceClassification.from_pretrained('AIRI-Institute/gena-lm-bigbird-base-sparse')
+```
+or just download [modeling_bert.py](https://github.com/AIRI-Institute/GENA_LM/tree/main/src/gena_lm) and place it next to your code.
+
+Alternatively, you can get the model class via the HuggingFace AutoModel mechanism:
+```python
+from transformers import AutoTokenizer, AutoModel
+model = AutoModel.from_pretrained('AIRI-Institute/gena-lm-bigbird-base-sparse', trust_remote_code=True)
+gena_module_name = model.__class__.__module__
+print(gena_module_name)
+import importlib
+# available class names:
+# - BertModel, BertForPreTraining, BertForMaskedLM, BertForNextSentencePrediction,
+# - BertForSequenceClassification, BertForMultipleChoice, BertForTokenClassification,
+# - BertForQuestionAnswering
+# see https://huggingface.co/docs/transformers/model_doc/bert
+cls = getattr(importlib.import_module(gena_module_name), 'BertForSequenceClassification')
+print(cls)
+model = cls.from_pretrained('AIRI-Institute/gena-lm-bigbird-base-sparse', num_labels=2)
 ```
 
 ## Model description
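The dynamic class lookup in the last added snippet is plain `importlib.import_module` plus `getattr`. As a minimal self-contained sketch of that same pattern — using the stdlib `collections` module as a stand-in for the dynamically loaded GENA-LM module, so it runs without downloading any model:

```python
import importlib


def load_class(module_name: str, class_name: str):
    # Resolve a class by name from a module path, the same way the README's
    # example resolves 'BertForSequenceClassification' from
    # model.__class__.__module__.
    return getattr(importlib.import_module(module_name), class_name)


# Stand-in for the GENA-LM module: look up collections.Counter by name.
cls = load_class('collections', 'Counter')
print(cls.__module__, cls.__name__)  # collections Counter
```

With the real model, `module_name` would be `model.__class__.__module__` obtained from the `AutoModel.from_pretrained(..., trust_remote_code=True)` call, and `class_name` any of the Bert* classes listed in the comment above.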