license: apache-2.0
language:
  - en

ACCORD-NLP

ACCORD-NLP is a Natural Language Processing (NLP) framework developed by the ACCORD project to facilitate Automated Compliance Checking (ACC) within the Architecture, Engineering, and Construction (AEC) sector. It consists of several pre-trained and fine-tuned machine learning models that perform the following information extraction tasks on regulatory text:

  1. Entity Extraction/Classification (ner)
  2. Relation Extraction/Classification (re)

roberta-large-lm is a domain-specific RoBERTa large model pre-trained on a building regulatory text corpus using the Masked Language Modelling (MLM) objective. It needs to be fine-tuned before it can be used for a downstream task such as entity or relation classification.
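
Before fine-tuning, the checkpoint can be sanity-checked on its own MLM objective. The following is a minimal sketch, assuming the model is published on the Hugging Face Hub as `ACCORD-NLP/roberta-large-lm` (an identifier inferred from the naming of the other ACCORD-NLP models, not stated in this README) and that the `transformers` package is installed. The helper names `mask_word` and `predict_masked` are hypothetical.

```python
# Sketch only: MODEL_ID is an assumed Hub identifier, not confirmed by this README.
MODEL_ID = "ACCORD-NLP/roberta-large-lm"
MASK_TOKEN = "<mask>"  # mask token used by RoBERTa tokenizers


def mask_word(sentence: str, word: str, mask_token: str = MASK_TOKEN) -> str:
    """Replace the first occurrence of `word` in `sentence` with the mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} not found in sentence")
    return sentence.replace(word, mask_token, 1)


def predict_masked(masked_sentence: str, model_id: str = MODEL_ID, top_k: int = 5):
    """Return (token, score) pairs proposed for the masked position.

    Requires `transformers`; the model weights are downloaded on first use.
    """
    from transformers import pipeline  # imported lazily because it is a heavy dependency

    fill_mask = pipeline("fill-mask", model=model_id)
    results = fill_mask(masked_sentence, top_k=top_k)
    return [(r["token_str"].strip(), r["score"]) for r in results]
```

For example, `predict_masked(mask_word("The gradient of the passageway should not exceed five per cent.", "gradient"))` would list the model's top candidates for the masked word. The lazy import keeps `mask_word` usable without `transformers` installed.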

Installation

From Source

git clone https://github.com/Accord-Project/accord-nlp.git
cd accord-nlp
pip install -r requirements.txt

From pip

pip install accord-nlp

Using Pre-trained Models

Entity Extraction/Classification (ner)

from accord_nlp.text_classification.ner.ner_model import NERModel

model = NERModel('roberta', 'ACCORD-NLP/ner-roberta-large')
predictions, raw_outputs = model.predict(['The gradient of the passageway should not exceed five per cent.'])
print(predictions)
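
The printed predictions can be easier to read once flattened. The sketch below assumes that `predictions` holds one list per input sentence, each containing single-entry `{word: label}` dictionaries (the convention used by Simple Transformers, which this README does not confirm for ACCORD-NLP). The helper name `to_word_label_pairs` is hypothetical.

```python
from typing import Dict, List, Tuple


def to_word_label_pairs(
    predictions: List[List[Dict[str, str]]],
) -> List[List[Tuple[str, str]]]:
    """Flatten per-token {word: label} dicts into (word, label) tuples.

    Assumption: the input follows the nested list-of-dicts layout
    described above; adjust if the model returns a different structure.
    """
    return [
        [(word, label) for token in sentence for word, label in token.items()]
        for sentence in predictions
    ]
```

Calling `to_word_label_pairs(predictions)` on the output of `model.predict` then gives one list of pairs per sentence, which is convenient for filtering or tabulating labels.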

Relation Extraction/Classification (re)

from accord_nlp.text_classification.relation_extraction.re_model import REModel

model = REModel('roberta', 'ACCORD-NLP/re-roberta-large')
predictions, raw_outputs = model.predict(['The <e1>gradient</e1> of the passageway should not exceed <e2>five per cent</e2>.'])
print(predictions)
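
The relation model takes a sentence in which the two entities of interest are wrapped in `<e1>…</e1>` and `<e2>…</e2>` tags. When the entity strings are already known (for instance from the ner step), the tagged input can be built programmatically. The helper below is a hypothetical convenience function, not part of the accord-nlp package, and it tags only the first occurrence of each entity.

```python
def mark_entity_pair(sentence: str, entity1: str, entity2: str) -> str:
    """Wrap the first occurrence of each entity in <e1>/<e2> tags."""
    if not entity1 or not entity2:
        raise ValueError("entities must be non-empty strings")
    start1 = sentence.find(entity1)
    start2 = sentence.find(entity2)
    if start1 < 0 or start2 < 0:
        raise ValueError("both entities must occur in the sentence")
    end1, end2 = start1 + len(entity1), start2 + len(entity2)
    if start1 < end2 and start2 < end1:
        raise ValueError("entity spans must not overlap")
    # Insert tags from the rightmost span first so earlier offsets stay valid.
    spans = sorted([(start1, end1, "e1"), (start2, end2, "e2")], reverse=True)
    for start, end, tag in spans:
        sentence = (
            f"{sentence[:start]}<{tag}>{sentence[start:end]}</{tag}>{sentence[end:]}"
        )
    return sentence
```

The returned string can be passed to `model.predict` in a list, in the same way as the hand-tagged sentence.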

For more details, please refer to the ACCORD-NLP GitHub repository: https://github.com/Accord-Project/accord-nlp