---
license: apache-2.0
language:
- en
---

# ACCORD-NLP

ACCORD-NLP is a Natural Language Processing (NLP) framework developed by the [ACCORD](https://accordproject.eu/) project to facilitate Automated Compliance Checking (ACC) within the Architecture, Engineering, and Construction (AEC) sector.

It provides several pre-trained and fine-tuned machine learning models for the following information extraction tasks on regulatory text:

1. Entity Extraction/Classification (ner)
2. Relation Extraction/Classification (re)

**roberta-large-lm** is a domain-specific RoBERTa large model pre-trained on a building regulatory text corpus using the Masked Language Modelling (MLM) objective.

This model needs to be fine-tuned before it can be used for a downstream task such as entity or relation classification.
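
To illustrate what the MLM objective means in practice, the sketch below prepares a masked sentence that a masked language model would be asked to complete. The helper `mask_term` is hypothetical and not part of `accord_nlp`. `<mask>` is the standard RoBERTa mask token. The model ID in the commented usage is an assumption based on the naming of the other ACCORD-NLP models.

```python
def mask_term(sentence: str, term: str, mask_token: str = "<mask>") -> str:
    """Replace the first occurrence of `term` with the mask token.

    Hypothetical helper for illustration; not part of accord_nlp.
    """
    if term not in sentence:
        raise ValueError(f"{term!r} does not occur in the sentence")
    return sentence.replace(term, mask_token, 1)


masked = mask_term(
    "The gradient of the passageway should not exceed five per cent.",
    "gradient",
)
print(masked)

# With the transformers library installed, the masked sentence could be
# passed to a fill-mask pipeline. The model ID below is an assumption:
#
#   from transformers import pipeline
#   fill_mask = pipeline("fill-mask", model="ACCORD-NLP/roberta-large-lm")
#   for candidate in fill_mask(masked):
#       print(candidate["token_str"], candidate["score"])
```

Mask filling is only a way to inspect what the language model has learned from the regulatory corpus. Entity and relation predictions still require the fine-tuned models described under Using Pre-trained Models.
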

## Installation

### From Source

```
git clone https://github.com/Accord-Project/accord-nlp.git
cd accord-nlp
pip install -r requirements.txt
```

### From pip

```
pip install accord-nlp
```

## Using Pre-trained Models

### Entity Extraction/Classification (ner)

```python
from accord_nlp.text_classification.ner.ner_model import NERModel

# Load the fine-tuned entity model from the Hugging Face Hub
model = NERModel('roberta', 'ACCORD-NLP/ner-roberta-large')

# predict() takes a list of sentences
predictions, raw_outputs = model.predict(['The gradient of the passageway should not exceed five per cent.'])
print(predictions)
```

### Relation Extraction/Classification (re)

```python
from accord_nlp.text_classification.relation_extraction.re_model import REModel

# Load the fine-tuned relation model from the Hugging Face Hub
model = REModel('roberta', 'ACCORD-NLP/re-roberta-large')

# Mark the two entities with <e1>...</e1> and <e2>...</e2> tags
predictions, raw_outputs = model.predict(['The <e1>gradient</e1> of the passageway should not exceed <e2>five per cent</e2>.'])
print(predictions)
```
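
The relation example above marks the entity pair inline with `<e1>` and `<e2>` tags. Writing those tags by hand is error-prone, so here is a minimal sketch of a helper that inserts them. The name `tag_entities` is hypothetical and not part of `accord_nlp`. It assumes each entity occurs verbatim in the sentence and tags only the first occurrence of each.

```python
def tag_entities(sentence: str, entity1: str, entity2: str) -> str:
    """Wrap the first occurrence of each entity in <e1>/<e2> tags.

    Hypothetical helper for illustration; not part of accord_nlp.
    """
    spans = []
    for entity, tag in ((entity1, "e1"), (entity2, "e2")):
        start = sentence.find(entity)
        if start == -1:
            raise ValueError(f"{entity!r} does not occur in the sentence")
        spans.append((start, start + len(entity), tag))

    (s1, e1, _), (s2, e2, _) = spans
    if s1 < e2 and s2 < e1:
        raise ValueError("entity spans overlap")

    # Insert tags from right to left so earlier offsets stay valid.
    for start, end, tag in sorted(spans, reverse=True):
        sentence = (
            f"{sentence[:start]}<{tag}>{sentence[start:end]}</{tag}>{sentence[end:]}"
        )
    return sentence


tagged = tag_entities(
    "The gradient of the passageway should not exceed five per cent.",
    "gradient",
    "five per cent",
)
print(tagged)
```

The resulting string can be passed to `model.predict([tagged])` in the same way as the hand-tagged sentence.
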

For more details, please refer to the [ACCORD-NLP](https://github.com/Accord-Project/accord-nlp) GitHub repository.