---
language: en
license: apache-2.0
datasets:
- ESGBERT/action_500
tags:
- ESG
- environmental
- action
---

# Model Card for EnvironmentalBERT-action

## Model Description

As an extension to [this paper](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4622514), this is the EnvironmentalBERT-action language model, trained to better classify action-related texts in the ESG domain.

Using the [EnvironmentalBERT-base](https://huggingface.co/ESGBERT/EnvironmentalBERT-base) model as a starting point, the EnvironmentalBERT-action language model is additionally fine-tuned on a dataset of 500 environmental text samples to detect action-related text. The underlying dataset is comparatively small, so if you would like to contribute to it, feel free to reach out. For instance, you could find a set of misclassifications and send it to me. :)
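
If you want to inspect the underlying dataset, here is a minimal sketch using the `datasets` library (the split and column layout shown by the print are whatever the dataset repository provides; no specific names are assumed):

```python
from datasets import load_dataset

# Load the underlying fine-tuning dataset from the Hugging Face Hub.
dataset = load_dataset("ESGBERT/action_500")

# Print the available splits and features to see how the samples are labeled.
print(dataset)
```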

## How to Get Started With the Model
You can use the model with a pipeline for text classification:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
 
tokenizer_name = "ESGBERT/EnvironmentalBERT-action"
model_name = "ESGBERT/EnvironmentalBERT-action"
 
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(tokenizer_name, model_max_length=512)
 
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer) # set device=0 to use GPU
 
# See https://huggingface.co/docs/transformers/main_classes/pipelines#transformers.pipeline
print(pipe("We are actively working to reduce our CO2 emissions by planting trees in 25 countries.", padding=True, truncation=True))
```
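
The pipeline also accepts a list of texts, which is convenient for classifying a report sentence by sentence. An illustrative batch example, reusing the `pipe` object from the snippet above (the second sentence is a made-up non-action example):

```python
# Classify several sentences in one call by passing a list to the pipeline.
sentences = [
    "We are actively working to reduce our CO2 emissions by planting trees in 25 countries.",
    "Our company was founded in 1985 and is headquartered in Zurich.",
]
print(pipe(sentences, padding=True, truncation=True))
```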

## More details on the base models can be found in this paper

While this dataset does not originate from the paper, it is an extension of it, and the base models are described there.

```bibtex
@article{Schimanski23ESGBERT,
    title={{Bridging the Gap in ESG Measurement: Using NLP to Quantify Environmental, Social, and Governance Communication}},
    author={Tobias Schimanski and Andrin Reding and Nico Reding and Julia Bingler and Mathias Kraus and Markus Leippold},
    year={2023},
    journal={Available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4622514},
}
```