# Multitask Learning for Agent-Action Identification
## Project Overview

This project aims to develop a multitask learning model for identifying agents and actions in text data. The model is trained on a custom dataset of text examples, where each example is annotated with the agents and actions present in the text.
## Project Structure

The project is organized into the following directories and files:

- `dataset/`: the custom dataset class for loading and processing the text data
  - `dataset.py`: defines the dataset class
  - `data_collator.py`: defines the data collator class
- `model/`: the multitask learning model architecture
  - `model.py`: defines the model architecture
- `training/`: the training loop and evaluation code
  - `main.py`: contains the training loop and evaluation code
- `data/`: the dataset files for training, validation, and testing
  - `train.csv`: training set
  - `val.csv`: validation set
  - `test.csv`: test set
- `requirements.txt`: lists the dependencies required to run the project
## Dataset

The dataset consists of text examples, each annotated with the agents and actions present in the text. It is split into training, validation, and test sets:

- **Training set:** 80% of the dataset (10,000 examples)
- **Validation set:** 10% of the dataset (1,250 examples)
- **Test set:** 10% of the dataset (1,250 examples)
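A quick way to inspect the splits with pandas (already a project dependency); the file paths follow the project structure above, but the CSV column names are not documented here, so check the headers before relying on them:

```python
import pandas as pd

# Load the three splits shipped in data/.
train_df = pd.read_csv("data/train.csv")
val_df = pd.read_csv("data/val.csv")
test_df = pd.read_csv("data/test.csv")

# Expected sizes per the split described above: 10000 / 1250 / 1250.
print(len(train_df), len(val_df), len(test_df))
```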
## Model

The model is a multitask learning model based on the BERT architecture, trained to predict agents and actions simultaneously.

**Model architecture:**

- BERT encoder
- Two classification heads, one for agents and one for actions

**Model parameters:**

- BERT encoder: 110M parameters
- Classification heads: 10M parameters
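A minimal sketch of the shared-encoder, two-head design described above, assuming a `bert-base-uncased` encoder (consistent with the 110M-parameter figure) and illustrative label counts; the actual implementation lives in `model/model.py`:

```python
import torch.nn as nn
from transformers import BertModel

class AgentActionModel(nn.Module):
    """Shared BERT encoder with one classification head per task."""

    def __init__(self, num_agent_labels: int, num_action_labels: int):
        super().__init__()
        # Assumed checkpoint; swap in whatever model/model.py actually loads.
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.encoder.config.hidden_size  # 768 for bert-base
        self.agent_head = nn.Linear(hidden, num_agent_labels)
        self.action_head = nn.Linear(hidden, num_action_labels)

    def forward(self, input_ids, attention_mask=None):
        # Pool the [CLS] representation and feed it to both heads.
        pooled = self.encoder(input_ids, attention_mask=attention_mask).pooler_output
        return self.agent_head(pooled), self.action_head(pooled)
```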
## Training

The model is trained using the `Trainer` class from the Hugging Face Transformers library. The training loop is defined in `main.py`.

**Training hyperparameters:**

- Batch size: 16
- Number of epochs: 3
- Learning rate: 1e-5
- Training time: approximately 10 hours on a single NVIDIA V100 GPU
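For reference, a sketch of how these hyperparameters map onto `TrainingArguments`; the output directory is an arbitrary placeholder, and `model`, `train_dataset`, and `val_dataset` stand in for the objects that `main.py` constructs:

```python
from transformers import Trainer, TrainingArguments

# Values mirror the hyperparameters listed above; "./checkpoints" is a
# placeholder output directory, not taken from the repository.
args = TrainingArguments(
    output_dir="./checkpoints",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=1e-5,
    evaluation_strategy="steps",  # evaluate during training (see Evaluation)
    eval_steps=500,
)

# model, train_dataset, and val_dataset are placeholders for the objects
# built from model/ and dataset/ in main.py.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
)
trainer.train()
```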
## Evaluation

The model is evaluated on the validation set during training.

- **Evaluation metric:** accuracy
- **Evaluation frequency:** every 500 steps
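With the `Trainer`, per-task accuracy can be computed through a `compute_metrics` callback. The sketch below assumes the model emits `(agent_logits, action_logits)` and that the labels arrive as a matching pair; both are assumptions about this repository's setup, not confirmed by it:

```python
import numpy as np

def compute_metrics(eval_pred):
    """Per-task accuracy. Assumes predictions are (agent_logits, action_logits)
    and label_ids are (agent_labels, action_labels) -- illustrative only."""
    agent_logits, action_logits = eval_pred.predictions
    agent_labels, action_labels = eval_pred.label_ids
    return {
        "agent_accuracy": float((np.argmax(agent_logits, -1) == agent_labels).mean()),
        "action_accuracy": float((np.argmax(action_logits, -1) == action_labels).mean()),
    }
```

Pass it to the `Trainer` via `compute_metrics=compute_metrics`.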
## Requirements

The project requires the following dependencies:

- Python 3.8+
- Transformers 4.20.1+
- Torch 1.12.0+
- Pandas 1.4.2+
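The pinned versions live in `requirements.txt`; install them with:

```bash
pip install -r requirements.txt
```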
## Usage

To train the model, run:

```bash
python main.py
```

To evaluate the model, run:

```bash
python main.py --mode eval
```
## License

This project is licensed under the MIT License.
## Acknowledgments

This project was inspired by the work of [Dennis Duncan].
## Contributing

Contributions are welcome! Please open an issue or submit a pull request.