ltg/norbert3-small

Norwegian Bokmål

Norwegian Nynorsk

davda54 committed on Apr 24, 2023

Commit

46a92e2

·

1 Parent(s): 1e8485b

Create README.md

Files changed (1)

README.md +46 -0

README.md ADDED

	@@ -0,0 +1,46 @@

+---
+language:
+- 'no'
+- nb
+- nn
+inference: false
+tags:
+- BERT
+- NorBERT
+- Norwegian
+- encoder
+license: cc-by-4.0
+---
+# NorBERT 3 small
+## Other sizes:
+- [NorBERT 3 xs (15M)](https://huggingface.co/ltg/norbert3-xs)
+- [NorBERT 3 small (40M)](https://huggingface.co/ltg/norbert3-small)
+- [NorBERT 3 base (123M)](https://huggingface.co/ltg/norbert3-base)
+- [NorBERT 3 large (323M)](https://huggingface.co/ltg/norbert3-large)
+## Example usage
+This model currently requires the custom wrapper defined in `modeling_norbert.py`. With that file importable, you can use the model like this:
+```python
+import torch
+from transformers import AutoTokenizer
+from modeling_norbert import NorbertForMaskedLM
+tokenizer = AutoTokenizer.from_pretrained("path/to/folder")
+bert = NorbertForMaskedLM.from_pretrained("path/to/folder")
+mask_id = tokenizer.convert_tokens_to_ids("[MASK]")
+input_text = tokenizer("Nå ønsker de seg en[MASK] bolig.", return_tensors="pt")
+output_p = bert(**input_text)
+output_text = torch.where(input_text.input_ids == mask_id, output_p.logits.argmax(-1), input_text.input_ids)
+# should output: '[CLS] Nå ønsker de seg en ny bolig.[SEP]'
+print(tokenizer.decode(output_text[0].tolist()))
+```
+The following classes are currently implemented: `NorbertForMaskedLM`, `NorbertForSequenceClassification`, `NorbertForTokenClassification`, `NorbertForQuestionAnswering` and `NorbertForMultipleChoice`.
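
The `torch.where` line in the usage example keeps every input token ID as it is, except at `[MASK]` positions, where it substitutes the highest-scoring vocabulary ID from the logits. A minimal pure-Python sketch of that selection step follows, so the logic can be checked without loading the model. The helper name `fill_masks`, the 5-token vocabulary, and all IDs and scores are hypothetical illustration values, not taken from NorBERT.

```python
def fill_masks(input_ids, logits, mask_id):
    """Return a copy of input_ids where each mask_id is replaced by the
    index of the highest score in the matching row of logits."""
    filled = []
    for token_id, scores in zip(input_ids, logits):
        if token_id == mask_id:
            # argmax over the vocabulary dimension for this position
            predicted = max(range(len(scores)), key=scores.__getitem__)
            filled.append(predicted)
        else:
            # non-masked positions are copied through unchanged
            filled.append(token_id)
    return filled


# Hypothetical toy values: 5-token vocabulary, ID 4 stands in for [MASK].
MASK_ID = 4
input_ids = [0, 2, 4, 1]
logits = [
    [0.9, 0.0, 0.1, 0.0, 0.0],
    [0.1, 0.2, 0.6, 0.1, 0.0],
    [0.0, 0.1, 0.2, 0.7, 0.0],  # masked position: highest score at ID 3
    [0.1, 0.8, 0.1, 0.0, 0.0],
]

result = fill_masks(input_ids, logits, MASK_ID)
print(result)  # [0, 2, 3, 1]
```

Only the masked position changes. The model's predictions at the other positions are ignored, which is why the decoded output reproduces the input sentence around the filled-in word.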