---
license: mit
---

# Quick Summary

<!-- Provide a quick summary of what the model is/does. -->

IEQ-BERT classifies building occupant feedback concerning indoor environmental quality.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

The IEQ-BERT model is a fine-tuned variant of the BERT (Bidirectional Encoder Representations from Transformers) architecture, adapted for the task of multilabel text classification in the context of Indoor Environmental Quality (IEQ). IEQ refers to the physical characteristics of indoor spaces, such as thermal comfort, acoustic comfort, visual comfort, and indoor air quality (IAQ), which directly impact occupant well-being, productivity, and satisfaction. The IEQ-BERT model is designed to analyze and classify occupant feedback into one or more of the following categories: "Acoustic," "IAQ," "Thermal," "Visual," and "No IEQ." The "No IEQ" category is reserved for feedback that uses language resembling the IEQ domain but does not pertain to indoor environmental quality, ensuring the model can distinguish between relevant and irrelevant content.

- **Developed by:** Researchers at Deakin University (Australia) and Northwestern University (US)
- **Funded by:** Deakin University, School of Architecture and Built Environment
- **Model type:** Multilabel Text Classification
- **Language:** English
- **Finetuned from model:** bert-base-uncased

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** This model repository
- **Paper:** Sadick, A.-M., & Chinazzo, G. (2025). What did the occupant say? Fine-tuning and evaluating a large language model for efficient analysis of multi-domain indoor environmental quality feedback. Building and Environment, 112735. https://doi.org/10.1016/j.buildenv.2025.112735
- **Demo:** https://ieq-ieq-text-classifier-app.hf.space

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

### Direct Use

<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

This model has a wide range of potential use cases, including:

- **Building Design and Architecture**: Analyzing feedback to identify recurring issues related to thermal comfort, lighting, or acoustics, which can inform design improvements to enhance occupant satisfaction.
- **Building Management and Facility Planning**: Monitoring feedback in real time to address specific IEQ concerns, such as HVAC performance or lighting issues, and to prioritize interventions.
- **Post-Occupancy Evaluation (POE)**: Classifying open-ended feedback from occupant surveys to assess the effectiveness of building designs and operational strategies.
- **Integration into Building Automation Systems**: Processing occupant feedback alongside sensor data to provide actionable insights for optimizing indoor environments.


### Out-of-Scope Use

<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->

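Use this model only for the purposes listed under Direct Use. It was fine-tuned on English text to assign the five categories described above ("Acoustic," "IAQ," "Thermal," "Visual," and "No IEQ"); text in other languages and classification tasks outside these categories are out of scope.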


## How to Get Started with the Model

Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("ieq/IEQ-BERT")
model = AutoModelForSequenceClassification.from_pretrained("ieq/IEQ-BERT")
```
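
Because this is a multilabel classifier, a single feedback item can receive more than one category. The sketch below shows one common way to turn the model's raw logits into labels: apply a sigmoid to each logit independently and keep every label at or above a threshold. The label order in `ASSUMED_LABELS`, the helper name `decode_multilabel`, and the 0.5 threshold are illustrative assumptions, not values taken from this repository; check `model.config.id2label` for the actual mapping.

```python
import math

# ASSUMPTION: hypothetical label order for illustration only.
# The real index-to-label mapping should be read from model.config.id2label.
ASSUMED_LABELS = ["Acoustic", "IAQ", "No IEQ", "Thermal", "Visual"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, labels=ASSUMED_LABELS, threshold=0.5):
    """Return every label whose sigmoid probability meets the threshold.

    Each label is scored independently, so zero, one, or several
    labels may be returned for a single text.
    """
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]


# Made-up logits for demonstration (not real model output).
print(decode_multilabel([-2.1, 0.3, -3.0, 1.7, -0.8]))  # ['IAQ', 'Thermal']
```

With the tokenizer and model loaded as above, the logits for one text could be obtained with `model(**tokenizer(text, return_tensors="pt")).logits[0].tolist()` and passed to this helper. The threshold may need tuning for a given application.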

## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The training data consists of 14,622 filtered texts from Glassdoor job reviews and X posts about work environments during the COVID-19 pandemic. Five labelers manually labeled each feedback item in Labelbox, then checked the labels for consistency using Cleanlab Studio.


## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->
- **Accuracy**: 0.93
- **F1**: 0.93


## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

If you use this model, please cite the journal article below:

**APA:** Sadick, A.-M., & Chinazzo, G. (2025). What did the occupant say? Fine-tuning and evaluating a large language model for efficient analysis of multi-domain indoor environmental quality feedback. Building and Environment, 112735. https://doi.org/10.1016/j.buildenv.2025.112735


## Model Card Contact

Dr Abdul-Manan Sadick - s.sadick@deakin.edu.au

Dr Giorgia Chinazzo - giorgia.chinazzo@northwestern.edu