File size: 667 Bytes
9316739
76a4f24
9316739
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
---
license: apache-2.0
language: ga
tags:
- irish
---

## BERTreach

([beirtreach](https://www.teanglann.ie/en/fgb/beirtreach) means 'oyster bed')

**Model size:** 84M

**Training data:** 
* [PARSEME 1.2](https://gitlab.com/parseme/parseme_corpus_ga/-/blob/master/README.md) 
* Newscrawl 300k portion of the [Leipzig Corpora](https://wortschatz.uni-leipzig.de/en/download/irish)
* Private news corpus crawled with [Corpus Crawler](https://github.com/google/corpuscrawler)

(2125804 sentences, 47419062 tokens, as reckoned by wc)

```
from transformers import pipeline
fill_mask = pipeline("fill-mask", model="jimregan/BERTreach", tokenizer="jimregan/BERTreach")
```