--- tags: - adapter-transformers - xlm-roberta - adapterhub:am/wikipedia-amharic-20240320" datasets: - wikipedia pipeline_tag: fill-mask language: - am --- # Adapter `solwol/xml-roberta-amharic` for xlm-roberta-base An [adapter](https://adapterhub.ml) for the `xlm-roberta-base` model that was trained on the [fill-mask/wikipedia-amharic](https://adapterhub.ml/explore/fill-mask/wikipedia-amharic/) dataset and includes a prediction head for masked lm. This adapter was created for usage with the **[Adapters](https://github.com/Adapter-Hub/adapters)** library. ## Usage First, install `transformers`, `adapters`: ``` pip install -U transformers adapters ``` Now, the adapter can be loaded and activated like this: ```python from adapters import AutoAdapterModel model = AutoAdapterModel.from_pretrained("xlm-roberta-base") adapter_name = model.load_adapter("solwol/xml-roberta-base-adapter", source="hf", set_active=True) ``` Next, to perform fill mask tasks: ```python from transformers import AutoTokenizer, FillMaskPipeline tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base") fillmask = FillMaskPipeline(model=model, tokenizer=tokenizer) inputs = ["መልካም አዲስ ይሁን", "የኢትዮጵያ ዋና አዲስ አበባ ነው", "ኬንያ የ ኢትዮጵያ አዋሳኝ አንዷ ናት", "አጼ ምኒሊክ የኢትዮጵያ ነበሩ"] outputs = fillmask(inputs) outputs[0] [{'score': 0.31237369775772095, 'token': 17733, 'token_str': 'ቀን', 'sequence': 'መልካም አዲስ ቀን ይሁን'}, {'score': 0.17704728245735168, 'token': 19202, 'token_str': 'አበባ', 'sequence': 'መልካም አዲስ አበባ ይሁን'}, {'score': 0.17629213631153107, 'token': 98040, 'token_str': 'አመት', 'sequence': 'መልካም አዲስ አመት ይሁን'}, {'score': 0.08915291726589203, 'token': 25186, 'token_str': 'ዓመት', 'sequence': 'መልካም አዲስ ዓመት ይሁን'}, {'score': 0.060819510370492935, 'token': 118502, 'token_str': 'ሳምንት', 'sequence': 'መልካም አዲስ ሳምንት ይሁን'}] ``` ## Fine-tuning data Used some of wikipedia's amharic dataset; snapshot date="20240320"