---
tags:
- adapter-transformers
- adapterhub:am/wikipedia-amharic-20240320
- xlm-roberta-base
datasets:
- wikipedia
pipeline_tag: fill-mask
---

# Adapter `solwol/xml-roberta-base-adapter-amharic` for xlm-roberta-base

An [adapter](https://adapterhub.ml) for the `xlm-roberta-base` model that was trained on the [am/wikipedia-amharic-20240320](https://adapterhub.ml/explore/am/wikipedia-amharic-20240320/) dataset and includes a prediction head for masked language modeling.

This adapter was created for use with the **[Adapters](https://github.com/Adapter-Hub/adapters)** library.

## Usage

First, install `transformers` and `adapters`:

```
pip install -U transformers adapters
```

Now, the adapter can be loaded and activated like this:

```python
from adapters import AutoAdapterModel

# Load the base model, then fetch the adapter from the Hugging Face Hub and activate it
model = AutoAdapterModel.from_pretrained("xlm-roberta-base")
adapter_name = model.load_adapter("solwol/xml-roberta-base-adapter-amharic", source="hf", set_active=True)
```

Next, to run the fill-mask task:

```python
from transformers import AutoTokenizer, FillMaskPipeline

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
fillmask = FillMaskPipeline(model=model, tokenizer=tokenizer)

# Each input contains one <mask> token, which the model fills in
inputs = ["መልካም አዲስ <mask> ይሁን",
          "የኢትዮጵያ ዋና <mask> አዲስ አበባ ነው",
          "ኬንያ የ ኢትዮጵያ አዋሳኝ <mask> አንዷ ናት",
          "አጼ ምኒሊክ የኢትዮጵያ <mask> ነበሩ"]

outputs = fillmask(inputs)

# Top-5 candidates for the first input
outputs[0]

[{'score': 0.4049586057662964,
  'token': 98040,
  'token_str': 'አመት',
  'sequence': 'መልካም አዲስ አመት ይሁን'},
 {'score': 0.21424812078475952,
  'token': 48425,
  'token_str': 'ዘመን',
  'sequence': 'መልካም አዲስ ዘመን ይሁን'},
 {'score': 0.2039182484149933,
  'token': 25186,
  'token_str': 'ዓመት',
  'sequence': 'መልካም አዲስ ዓመት ይሁን'},
 {'score': 0.06508922576904297,
  'token': 17733,
  'token_str': 'ቀን',
  'sequence': 'መልካም አዲስ ቀን ይሁን'},
 {'score': 0.018085109069943428,
  'token': 38455,
  'token_str': 'ዓለም',
  'sequence': 'መልካም አዲስ ዓለም ይሁን'}]
```

## Fine-tuning data

Amharic Wikipedia dataset, snapshot dated "20240320".
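
## Working with the predictions

The fill-mask example in the Usage section returns, for each input, a list of candidate dictionaries with `score`, `token_str`, and `sequence` keys. The following is a minimal sketch of how one might pick the best candidate from such a list. The helper name `top_prediction` and its `min_score` parameter are hypothetical and are not part of `transformers` or `adapters`; the sample data is copied from the first output in the Usage section, with scores rounded to four decimals and the `token` field omitted, so the snippet runs without loading the model.

```python
# Hypothetical helper (not part of transformers or adapters): pick the
# highest-scoring candidate from one fill-mask result list.
def top_prediction(candidates, min_score=0.0):
    """Return (token_str, score) for the best candidate, or None if no
    candidate reaches min_score."""
    kept = [c for c in candidates if c["score"] >= min_score]
    if not kept:
        return None
    best = max(kept, key=lambda c: c["score"])
    return best["token_str"], best["score"]


# Candidates for the first input, copied from the Usage example
# (scores rounded to four decimals).
first_output = [
    {"score": 0.4050, "token_str": "አመት", "sequence": "መልካም አዲስ አመት ይሁን"},
    {"score": 0.2142, "token_str": "ዘመን", "sequence": "መልካም አዲስ ዘመን ይሁን"},
    {"score": 0.2039, "token_str": "ዓመት", "sequence": "መልካም አዲስ ዓመት ይሁን"},
    {"score": 0.0651, "token_str": "ቀን", "sequence": "መልካም አዲስ ቀን ይሁን"},
    {"score": 0.0181, "token_str": "ዓለም", "sequence": "መልካም አዲስ ዓለም ይሁን"},
]

print(top_prediction(first_output))
```

With the real pipeline, the same helper could be applied to every entry of `outputs`, for example `[top_prediction(o) for o in outputs]`. Setting `min_score` is one simple way to discard low-confidence fills; a suitable threshold depends on the application and is not something the adapter prescribes.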