	---
	license: cc-by-4.0
	language:
	- sw
	---


	BERT base (cased) model trained on a 125M-token subset of cc100-Swahili for our work [Scaling Laws for BERT in Low-Resource Settings](https://youtu.be/dQw4w9WgXcQ), published in Findings of ACL 2023.
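Since the model is published for feature extraction, a minimal sketch of how sentence vectors could be obtained with the Hugging Face `transformers` library is shown here. The model identifier `orai-nlp/bert-base-sw` is assumed from the repository path, the helper name `embed` is hypothetical, and mean pooling over the last hidden states is one common choice rather than a method prescribed by the authors. It also assumes the repository ships tokenizer files that `AutoTokenizer` can load.

```python
"""Sketch: mean-pooled sentence embeddings (not the authors' official recipe)."""

MODEL_ID = "orai-nlp/bert-base-sw"  # assumed from the repository path
MAX_LENGTH = 512  # sequence length stated in this model card


def embed(sentences, model_id=MODEL_ID):
    """Return one mean-pooled vector per sentence (hypothetical helper)."""
    # Imported lazily so the module can be loaded without the heavy
    # dependencies; torch and transformers must be installed to call this.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(
        sentences,
        padding=True,
        truncation=True,
        max_length=MAX_LENGTH,
        return_tensors="pt",
    )
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq, hidden)

    # Average over real tokens only; padding positions are masked out.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)
```

Calling `embed(["Habari ya asubuhi"])` downloads the weights on first use and should return a tensor with one row per input sentence.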

	The model has 124M parameters (12 layers) and a vocabulary size of 50K.
	It was trained for 500K steps with a sequence length of 512 tokens.
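The 124M figure can be sanity-checked with a little arithmetic. The sketch below assumes the standard BERT base dimensions (hidden size 768, feed-forward size 3072, 2 token types), a vocabulary of exactly 50,000 entries, and that the pooler is counted while the masked-language-model head is not. None of these details are stated in this card, so treat the result as an estimate.

```python
def bert_param_count(vocab_size=50_000, hidden=768, layers=12,
                     ffn=3072, max_positions=512, type_vocab=2):
    """Estimate the parameter count of a BERT encoder (assumed layout)."""
    layer_norm = 2 * hidden  # scale and bias vectors

    # Token, position and token-type embeddings, then one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm

    # Query, key, value and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)

    # Two dense layers: hidden -> ffn and ffn -> hidden, each with a bias.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)

    # Each encoder layer has one LayerNorm after attention and one after the FFN.
    encoder_layer = attention + feed_forward + 2 * layer_norm

    # Dense pooler applied to the first token.
    pooler = hidden * hidden + hidden

    return embeddings + layers * encoder_layer + pooler


total = bert_param_count()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")
# prints "124,441,344 parameters (~124M)"
```

Under these assumptions the embeddings account for about 38.8M parameters and the 12 encoder layers for about 85.1M, which is consistent with the stated total.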

	Results
	-----------

	TODO


	Authors
	-----------
	Gorka Urbizu [1], Iñaki San Vicente [1], Xabier Saralegi [1],
	Rodrigo Agerri [2] and Aitor Soroa [2]

	Affiliation of the authors:

	[1] Orai NLP Technologies

	[2] HiTZ Center - Ixa, University of the Basque Country UPV/EHU



	Licensing
	-------------

	The model is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

	To view a copy of this license, visit [http://creativecommons.org/licenses/by/4.0/](https://creativecommons.org/licenses/by/4.0/deed.eu).




	Acknowledgements
	-------------------
	If you use this model, please cite the following paper:

	- G. Urbizu, I. San Vicente, X. Saralegi, R. Agerri, A. Soroa. Scaling Laws for BERT in Low-Resource Settings. Findings of the Association for Computational Linguistics: ACL 2023. July 2023. Toronto, Canada.



	Contact information
	-----------------------
	Gorka Urbizu, Iñaki San Vicente: {g.urbizu,i.sanvicente}@orai.eus