vesteinn committed on
Commit
815b26a
1 Parent(s): 8c7d5d4

Model added

README.md ADDED
@@ -0,0 +1,54 @@
+ ---
+ language:
+ - is
+ thumbnail:
+ tags:
+ - icelandic
+ - qa
+ license:
+ datasets:
+ - ic3
+ - igc
+ metrics:
+ - em
+ - f1
+ widget:
+ - text: "Hverrar trúar var Halldór Laxness ?"
+   context: "Halldór Kiljan Laxness was born in 1902 in Reykjavik , the capital of Iceland , but spent his youth in the country . From the age of seventeen on , he travelled and lived abroad , chiefly on the European continent . He was influenced by expressionism and other modern currents in Germany and France . In the mid-twenties he was converted to Catholicism ; his spiritual experiences are reflected in several books of an autobiographical nature , chiefly Undir Helgahnúk ( Under the Holy Mountain ) , 1924 . In 1927 , he published his first important novel , Vefarinn mikli frá Kasmír ( The Great Weaver from Kashmir ) . Laxness’s religious period did not last long ; during a visit to America he became attracted to socialism . Alþydubókin ( The Book of the People ) , 1929 , is evidence of a change toward a socialist outlook . In 1930 , Laxness settled in Iceland . Laxness’s main achievement consists of three novel cycles written during the thirties , dealing with the people of Iceland . Þú vínviður hreini , 1931 , and Fuglinn í fjörunni , 1932 , ( both translated as Salka Valka ) , tell the story of a poor fisher girl ; Sjálfstætt fólk ( Independent People ) , 1934 - 35 , treats the fortunes of small farmers , whereas the tetralogy Ljós heimsins ( The Light of the World ) , 1937 - 40 , has as its hero an Icelandic folk poet . Laxness’s later works are frequently historical and influenced by the saga tradition : Íslandsklukkan ( The Bell of Iceland ) , 1943 - 46 , Gerpla ( The Happy Warriors ) , 1952 , and Paradísarheimt ( Paradise Reclaimed ) , 1960 . Laxness is also the author of the topical and sharply polemical Atómstöðin ( The Atom Station ) , 1948 ."
+
+ ---
+
+ # XLMr-ENIS-QA-IsQ-EnA
+
+ ## Model description
+
+ This is an Icelandic reading comprehension Q&A model.
+
+ ## Intended uses & limitations
+
+ This model is part of my MSc thesis about Q&A for Icelandic.
+
+ #### How to use
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForQuestionAnswering
+
+ tokenizer = AutoTokenizer.from_pretrained("vesteinn/IceBERT-QA")
+
+ model = AutoModelForQuestionAnswering.from_pretrained("vesteinn/IceBERT-QA")
+ ```
+
+ #### Limitations and bias
+
+ ## Training data
+ Translated English datasets were used along with the Natural Questions in Icelandic dataset.
+
+ ## Training procedure
+
+ ## Eval results
+
+ ### BibTeX entry and citation info
+
+ ```bibtex
+ ```
+
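
Note on the README's "How to use" snippet: it stops after loading the tokenizer and model. As a minimal sketch of the usual next step for an extractive QA head, the helper below picks the highest-scoring answer span from start and end logits. The name `best_span` and the `max_answer_len` default are hypothetical and not part of this commit, and the search skips refinements such as masking out question tokens.

```python
from typing import Sequence, Tuple


def best_span(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    max_answer_len: int = 30,  # hypothetical default, not from the commit
) -> Tuple[int, int, float]:
    """Return (start, end, score) of the best span with start <= end."""
    if len(start_logits) != len(end_logits) or not start_logits:
        raise ValueError("logit sequences must be non-empty and equally long")
    if max_answer_len < 1:
        raise ValueError("max_answer_len must be at least 1")
    best = (0, 0, float("-inf"))
    for start, s_logit in enumerate(start_logits):
        # Only consider ends at or after the start, within the length limit.
        stop = min(start + max_answer_len, len(end_logits))
        for end in range(start, stop):
            score = s_logit + end_logits[end]
            if score > best[2]:
                best = (start, end, score)
    return best


# Toy logits for a four-token input; real ones would come from the model.
print(best_span([0.1, 2.0, 0.3, 0.0], [0.0, 0.5, 3.0, 0.1]))
```

With the objects from the README, the logits would presumably come from `model(**tokenizer(question, context, return_tensors="pt"))`, via its `start_logits` and `end_logits` fields, and the chosen token indices would be decoded back to text with `tokenizer.decode`.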
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "_name_or_path": "/p/project/joaiml/snaebjarnarson1/XLM/chkpt_27_204k",
+   "architectures": [
+     "XLMRobertaForQuestionAnswering"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "xlm-roberta",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "transformers_version": "4.6.0.dev0",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 50005
+ }
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:21844b75209c199684393f1c8ff7039936d75cf7950fc74fd2201571b6f37ae3
+ size 495517367
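
Note on config.json and the weight file: the configuration values can be sanity-checked against the LFS pointer's `size` field. The sketch below estimates the parameter count from the config, assuming a standard BERT-style encoder layout with no pooler and a two-output span head. That layout is an assumption here and is not read from the checkpoint. The function name `count_parameters` is hypothetical.

```python
# Values copied from the config.json in this commit.
config = {
    "hidden_size": 768,
    "intermediate_size": 3072,
    "max_position_embeddings": 514,
    "num_attention_heads": 12,
    "num_hidden_layers": 12,
    "type_vocab_size": 1,
    "vocab_size": 50005,
}
LFS_SIZE_BYTES = 495517367  # "size" field of the pytorch_model.bin pointer


def count_parameters(cfg: dict) -> int:
    """Count weights, assuming a standard BERT-style encoder without a
    pooler plus a 2-output span head (an assumption, not read from the file)."""
    h, ff = cfg["hidden_size"], cfg["intermediate_size"]
    layer_norm = 2 * h  # scale and bias
    embeddings = (
        cfg["vocab_size"] * h
        + cfg["max_position_embeddings"] * h
        + cfg["type_vocab_size"] * h
        + layer_norm
    )
    attention = 4 * (h * h + h) + layer_norm  # Q, K, V, output projections
    feed_forward = (h * ff + ff) + (ff * h + h) + layer_norm
    encoder = cfg["num_hidden_layers"] * (attention + feed_forward)
    qa_head = 2 * h + 2  # start and end logits
    return embeddings + encoder + qa_head


head_dim = config["hidden_size"] // config["num_attention_heads"]
params = count_parameters(config)
fp32_bytes = 4 * params  # 4 bytes per float32 weight
print(head_dim, params, fp32_bytes, LFS_SIZE_BYTES - fp32_bytes)
```

Under these assumptions the estimate is about 123.9 million parameters, or roughly 495.4 MB in float32, which is within 0.02% of the reported file size. The small remainder is plausibly serialization overhead and non-parameter buffers.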
sentencepiece.bpe.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bb65eb67a28974dc307fed538cab36453dc8b0d934755dbdc54af9a9ad019cce
+ size 1051645
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>", "sep_token": "</s>", "pad_token": "<pad>", "cls_token": "<s>", "mask_token": "<mask>"}
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": "<s>", "eos_token": "</s>", "sep_token": "</s>", "cls_token": "<s>", "unk_token": "<unk>", "pad_token": "<pad>", "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "special_tokens_map_file": "/p/project/joaiml/snaebjarnarson1/XLM/chkpt_27_204k/special_tokens_map.json", "name_or_path": "/p/project/joaiml/snaebjarnarson1/XLM/chkpt_27_204k"}