Future plans for the model?

#2
by danielschnell - opened

Just wanted to ask whether there are any plans to update the model with more text resources?

The Icelandic government-funded SIM project has created a large number of language resources that could be used to improve the model even further.

These Icelandic text resources are available at https://clarin.is/en/resources

For example, these text corpora are notable for their size:

Icelandic Gigaword Corpus (IGC, 8.2 GB)
Icelandic Common Crawl Corpus (ICC, 4.9 GB)

Maybe you could consider adding these resources in a future update of the model?
