license: apache-2.0
language:
  - en
tags:
  - Token Classification
widget:
  - text: >-
      Monitored Natural Attenuation and, if necessary as a contingency, In Situ
      Chemical Oxidation to address the injection of a strong chemical oxidant
      to chemically treat the before the contingency can be implemented at the
      spill site.
    example_title: example 1
  - text: >-
      Site was identified as a potential source of groundwater contamination
      after the City performed Assessments were investigated further for
      potential contamination.
    example_title: example 2
  - text: >-
      Chromium releases from the UST is probably a major contributor to
      groundwater contamination in this area.
    example_title: example 3

About the Model

An Environmental Named Entity Recognition model, trained on a dataset from the USEPA to recognize 7 environmental due diligence entities in a given text corpus (remediation reports, records of decision, 5-year records, etc.). This model was built on top of distilbert-base-uncased.


The easiest way to use the model is through the Hugging Face Inference API. The second method is through the pipeline object offered by the transformers library.

# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("token-classification", model="d4data/EnviDueDiligence_NER")
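
Once the pipeline is created, it can be called on raw text. The sketch below is one possible way to do that, not the authors' reference usage: it assumes the standard `aggregation_strategy="simple"` option of the transformers token-classification pipeline, which merges word pieces into whole entities. The helper names `summarize_entities` and `tag_text` and the 0.5 score threshold are illustrative choices, not part of the model.

```python
from typing import Dict, List, Tuple

MODEL_ID = "d4data/EnviDueDiligence_NER"


def summarize_entities(
    entities: List[Dict], min_score: float = 0.5
) -> List[Tuple[str, str, float]]:
    """Keep aggregated entities scoring at least min_score as (text, label, score)."""
    return [
        (ent["word"], ent["entity_group"], round(float(ent["score"]), 3))
        for ent in entities
        if ent["score"] >= min_score
    ]


def tag_text(text: str, min_score: float = 0.5) -> List[Tuple[str, str, float]]:
    """Run the model on text; the weights are downloaded on the first call."""
    # Imported lazily so summarize_entities can be used without transformers.
    from transformers import pipeline

    ner = pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="simple",  # merge sub-word pieces into entities
    )
    return summarize_entities(ner(text), min_score)


# Example call (needs network access to fetch the model):
# for word, label, score in tag_text(
#     "Chromium releases from the UST is probably a major contributor to "
#     "groundwater contamination in this area."
# ):
#     print(f"{label:<20} {word} ({score})")
```

The label names returned depend on the model's own configuration, so the example prints whatever `entity_group` values the pipeline reports instead of assuming a fixed list.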

# Load model directly
from transformers import AutoTokenizer, AutoModelForTokenClassification
tokenizer = AutoTokenizer.from_pretrained("d4data/EnviDueDiligence_NER")
model = AutoModelForTokenClassification.from_pretrained("d4data/EnviDueDiligence_NER")
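
When the model is loaded directly, the per-token predictions have to be decoded by hand. The following is a minimal sketch under stated assumptions: it takes the arg-max class for each token and looks the name up in `model.config.id2label`. The function names are hypothetical, and the special-token set assumes the BERT-style tokens used by distilbert-base-uncased.

```python
from typing import Dict, List, Sequence, Tuple

MODEL_ID = "d4data/EnviDueDiligence_NER"
# Assumption: BERT-style special tokens, as in distilbert-base-uncased.
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def pair_tokens_with_labels(
    tokens: Sequence[str], label_ids: Sequence[int], id2label: Dict[int, str]
) -> List[Tuple[str, str]]:
    """Map each predicted class id to its label name, dropping special tokens."""
    return [
        (tok, id2label[int(idx)])
        for tok, idx in zip(tokens, label_ids)
        if tok not in SPECIAL_TOKENS
    ]


def predict_token_labels(text: str) -> List[Tuple[str, str]]:
    """Tokenize text, run one forward pass, and take the arg-max label per token."""
    # Imported lazily so pair_tokens_with_labels works without torch installed.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForTokenClassification.from_pretrained(MODEL_ID)

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, num_labels)

    label_ids = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return pair_tokens_with_labels(tokens, label_ids, model.config.id2label)
```

Unlike the pipeline, this returns one label per word piece, so words split by the tokenizer appear as several entries.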


This model is part of the research topic "Environmental Due Diligence" conducted by Deepak John Reji and Afreen Aman. If you use this work (code, model, or dataset), please cite:

Aman, A. and Reji, D.J., 2022. EnvBert: An NLP model for Environmental Due Diligence data classification. Software Impacts, 14, p.100427.
