---
language: en
datasets:
- Dizex/InstaFoodSet
widget:
- text: "Today's meal: Fresh olive poké bowl topped with chia seeds. Very delicious!"
  example_title: "Food example 1"
- text: "Tartufo Pasta with garlic flavoured butter and olive oil, egg yolk, parmigiano and pasta water."
  example_title: "Food example 2"
tags:
- Instagram
- NER
- Named Entity Recognition
- Food Entity Extraction
- Social Media
- Informal text
- RoBERTa
license: mit
---

# InstaFoodRoBERTa-NER

## Model description

**InstaFoodRoBERTa-NER** is a fine-tuned RoBERTa model that is ready to use for **Named Entity Recognition** of food entities in informal, social-media-style text (e.g. Instagram, X, Reddit). It has been trained to recognize a single entity type: food (FOOD).

Specifically, this model is a [*roberta-base*](https://huggingface.co/roberta-base) model that was fine-tuned on a dataset of 400 English Instagram posts related to food. The [dataset](https://huggingface.co/datasets/Dizex/InstaFoodSet) is open source.

## Intended uses

#### How to use

You can use this model with the Transformers *pipeline* for NER.

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned tokenizer and token-classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained("Dizex/InstaFoodRoBERTa-NER")
model = AutoModelForTokenClassification.from_pretrained("Dizex/InstaFoodRoBERTa-NER")

pipe = pipeline("ner", model=model, tokenizer=tokenizer)
example = "Today's meal: Fresh olive poké bowl topped with chia seeds. Very delicious!"

# aggregation_strategy="simple" groups sub-word tokens into entity spans
ner_entity_results = pipe(example, aggregation_strategy="simple")
print(ner_entity_results)
```

To get the extracted food entities as strings, you can use the following code:

```python
def convert_entities_to_list(text, entities: list[dict]) -> list[str]:
    ents = []
    for ent in entities:
        e = {"start": ent["start"], "end": ent["end"], "label": ent["entity_group"]}
        # Merge with the previous span if both have the same label and the
        # spans touch, overlap by one character, or are one character apart
        if ents and -1 <= ent["start"] - ents[-1]["end"] <= 1 and ents[-1]["label"] == e["label"]:
            ents[-1]["end"] = e["end"]
            continue
        ents.append(e)

    # Slice the original text so the returned strings keep their exact spelling
    return [text[e["start"]:e["end"]] for e in ents]

print(convert_entities_to_list(example, ner_entity_results))
```

This will result in the following output:

```python
['olive poké bowl', 'chia seeds']
```

## Performance on [InstaFoodSet](https://huggingface.co/datasets/Dizex/InstaFoodSet)

metric|val
-|-
f1 |0.91
precision |0.89
recall |0.93
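
As a quick sanity check on the reported scores, the snippet below recomputes F1 from the precision and recall values in the table, assuming F1 here is the usual harmonic mean of the two. The card does not say whether the metrics are entity-level or token-level, so this only checks the arithmetic and does not reproduce the evaluation. The helper name `f1_from_precision_recall` is hypothetical and not part of the model's code.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the standard F1 score)."""
    if precision + recall == 0:
        # Avoid division by zero when both scores are zero
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the performance table
precision, recall = 0.89, 0.93

f1 = f1_from_precision_recall(precision, recall)
print(round(f1, 2))  # prints 0.91, matching the reported f1
```

The unrounded value is roughly 0.9096, so the reported 0.91 is consistent with the rounded precision and recall under this assumption.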