|
project: Multitask Learning for Agent-Action Identification |
|
|
|
Project Overview |
|
This project develops a multitask learning model that identifies agents and actions in text. The model is trained on a custom dataset in which each text example is annotated with the agents and actions it contains. |
|
Project Structure |
|
The project is organized into the following directories and files: |
|
dataset/: contains the custom dataset class for loading and processing the text data |
|
    dataset.py: defines the dataset class |
|
    data_collator.py: defines the data collator class |
|
model/: contains the multitask learning model architecture |
|
    model.py: defines the model architecture |
|
training/: contains the training loop and evaluation code |
|
    main.py: contains the training loop and evaluation code |
|
data/: contains the dataset files for training, validation, and testing |
|
    train.csv: training dataset |
|
    val.csv: validation dataset |
|
    test.csv: testing dataset |
|
requirements.txt: lists the dependencies required to run the project |
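The dataset class in dataset.py is not shown in this README, so the following is only a minimal sketch of what such a loader might look like. The class name `AgentActionDataset` and the column names `text`, `agent`, and `action` are assumptions made for illustration; the actual CSV schema is not documented here.

```python
import csv


class AgentActionDataset:
    """Map-style dataset over CSV data with one annotated example per row.

    The column names `text`, `agent`, and `action` are hypothetical; the
    README does not specify the CSV schema.
    """

    def __init__(self, csv_lines, text_col="text", agent_col="agent", action_col="action"):
        # `csv_lines` is any iterable of CSV lines, e.g. an open file handle.
        reader = csv.DictReader(csv_lines)
        self.examples = [
            {
                "text": row[text_col],
                "agent": row[agent_col],
                "action": row[action_col],
            }
            for row in reader
        ]

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, index):
        return self.examples[index]
```

Under these assumptions, opening data/train.csv with `newline=""` and passing the file handle to `AgentActionDataset` would yield one dictionary per training example. Because the class implements `__len__` and `__getitem__`, it follows the map-style dataset protocol that PyTorch data loaders accept.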
|
Dataset |
|
Each example in the dataset is a text annotated with the agents and actions it contains. The dataset is split into training, validation, and testing sets: |
|
Training Set: 80% of the dataset (10,000 examples) |
|
Validation Set: 10% of the dataset (1,250 examples) |
|
Testing Set: 10% of the dataset (1,250 examples) |
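The sizes above add up to 12,500 examples in total (10,000 + 1,250 + 1,250). The README does not say how the split was produced, so the sketch below shows one possible way to obtain an 80/10/10 split; the function name, the shuffling, and the seed are assumptions.

```python
import random


def split_80_10_10(examples, seed=0):
    """Shuffle examples and split them into 80% train, 10% validation, 10% test."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n = len(shuffled)
    n_train = n * 8 // 10  # integer arithmetic avoids float rounding
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test
```

Applied to 12,500 examples, this yields splits of 10,000, 1,250, and 1,250 examples, matching the sizes listed above.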
|
Model |
|
The model is a multitask model built on the BERT architecture and trained to predict agents and actions jointly. |
|
Model Architecture: |
|
BERT encoder |
|
Two classification heads for agents and actions |
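The architecture above, a shared encoder feeding two classification heads, can be sketched in PyTorch as follows. This is not the code in model.py: the class name, the use of the first-token representation, the single linear layer per head, and the summed cross-entropy loss are all assumptions, and the sketch treats both tasks as sequence-level classification, which the README does not confirm.

```python
import torch
from torch import nn


class MultitaskClassifier(nn.Module):
    """Shared encoder with one classification head per task (hypothetical sketch)."""

    def __init__(self, encoder, hidden_size, num_agent_labels, num_action_labels, dropout=0.1):
        super().__init__()
        # `encoder` is any module whose output exposes `last_hidden_state`
        # with shape (batch, sequence, hidden_size).
        self.encoder = encoder
        self.dropout = nn.Dropout(dropout)
        self.agent_head = nn.Linear(hidden_size, num_agent_labels)
        self.action_head = nn.Linear(hidden_size, num_action_labels)
        self.loss_fn = nn.CrossEntropyLoss()

    def forward(self, input_ids, attention_mask=None, agent_labels=None, action_labels=None):
        outputs = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # Use the first token's hidden state as the sequence representation.
        pooled = self.dropout(outputs.last_hidden_state[:, 0])
        agent_logits = self.agent_head(pooled)
        action_logits = self.action_head(pooled)
        result = {"agent_logits": agent_logits, "action_logits": action_logits}
        if agent_labels is not None and action_labels is not None:
            # Assumption: the two task losses are summed with equal weight.
            result["loss"] = (
                self.loss_fn(agent_logits, agent_labels)
                + self.loss_fn(action_logits, action_labels)
            )
        return result
```

With Hugging Face Transformers, `BertModel.from_pretrained("bert-base-uncased")` together with `hidden_size=768` could serve as the encoder; the checkpoint name is an assumption, since the README only says the model is based on BERT.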
|
Model Parameters: |
|
BERT encoder: 110M parameters |
|
Classification heads: 10M parameters |
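As a sanity check on the encoder figure, the parameter count can be worked out from the layer shapes. The sketch below assumes the BERT-base configuration (vocabulary of 30,522, hidden size 768, 12 layers, feed-forward size 3,072, 512 positions, 2 segment types); the README does not state which BERT variant is used, and the function name is hypothetical.

```python
def bert_encoder_params(vocab=30522, hidden=768, layers=12, intermediate=3072,
                        max_positions=512, type_vocab=2):
    """Count the parameters of a BERT encoder from its layer shapes."""
    layer_norm = 2 * hidden  # scale and bias
    # Word, position, and segment embeddings plus their layer norm.
    embeddings = (vocab + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value, and output projections (weights and biases) plus layer norm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm
    # Two feed-forward projections (weights and biases) plus layer norm.
    feed_forward = (
        (hidden * intermediate + intermediate)
        + (intermediate * hidden + hidden)
        + layer_norm
    )
    pooler = hidden * hidden + hidden
    return embeddings + layers * (attention + feed_forward) + pooler


print(bert_encoder_params())  # prints 109482240
```

Under these assumptions the encoder has 109,482,240 parameters, which rounds to the 110M stated above. The 10M figure for the classification heads is not reproduced here, because the README does not describe their layers.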
|
Training |
|
The model is trained using the Trainer class from the Hugging Face Transformers library. The training loop is defined in main.py. |
|
Training Hyperparameters: |
|
Batch size: 16 |
|
Number of epochs: 3 |
|
Learning rate: 1e-5 |
|
Training Time: approximately 10 hours on a single NVIDIA V100 GPU |
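The hyperparameters above, combined with the 10,000 training examples and the evaluation frequency of 500 steps given in the Evaluation section, imply a fixed training schedule. The sketch below works it out, assuming a single device, no gradient accumulation, and that a final partial batch would be kept; the dictionary keys are the matching `TrainingArguments` keyword names, and the helper name is hypothetical.

```python
import math

# README values, keyed by the corresponding `transformers.TrainingArguments` keywords.
TRAINING_KWARGS = {
    "per_device_train_batch_size": 16,
    "num_train_epochs": 3,
    "learning_rate": 1e-5,
}
TRAIN_EXAMPLES = 10_000  # from the Dataset section
EVAL_EVERY = 500         # from the Evaluation section


def training_schedule(num_examples, batch_size, epochs, eval_every):
    """Return steps per epoch, total steps, and the steps at which evaluation runs."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    eval_steps = list(range(eval_every, total_steps + 1, eval_every))
    return steps_per_epoch, total_steps, eval_steps


steps_per_epoch, total_steps, eval_steps = training_schedule(
    TRAIN_EXAMPLES,
    TRAINING_KWARGS["per_device_train_batch_size"],
    TRAINING_KWARGS["num_train_epochs"],
    EVAL_EVERY,
)
print(steps_per_epoch, total_steps, eval_steps)  # prints 625 1875 [500, 1000, 1500]
```

Under these assumptions, training runs for 1,875 optimizer steps and the validation set is evaluated three times, at steps 500, 1,000, and 1,500.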
|
Evaluation |
|
The model is evaluated on the validation set during training. |
|
Evaluation Metric: accuracy |
|
Evaluation Frequency: every 500 steps |
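The README does not say how accuracy is aggregated across the two tasks. As one possible reading, the sketch below reports accuracy separately for agents and actions, plus a joint accuracy that counts an example as correct only when both predictions match; the function names and the joint metric are assumptions.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that equal their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def multitask_accuracy(agent_preds, agent_labels, action_preds, action_labels):
    """Per-task accuracy plus joint accuracy (both tasks correct on an example)."""
    if len(agent_labels) != len(action_labels):
        raise ValueError("both tasks must cover the same examples")
    joint_hits = [
        a == b and c == d
        for a, b, c, d in zip(agent_preds, agent_labels, action_preds, action_labels)
    ]
    return {
        "agent_accuracy": accuracy(agent_preds, agent_labels),
        "action_accuracy": accuracy(action_preds, action_labels),
        "joint_accuracy": sum(joint_hits) / len(joint_hits),
    }
```

Reporting the tasks separately makes it visible when one head lags behind the other, which a single pooled accuracy would hide.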
|
Requirements |
|
The project requires the following dependencies: |
|
Python: 3.8+ |
|
Transformers: 4.20.1+ |
|
Torch: 1.12.0+ |
|
Pandas: 1.4.2+ |
|
Usage |
|
To train the model, run the following command: |
|
Bash |
|
python main.py |
|
To evaluate the model, run the following command: |
|
Bash |
|
python main.py --mode eval |
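The two commands above suggest that main.py selects its behavior from a `--mode` flag and trains when the flag is omitted. A minimal sketch of such argument handling with `argparse` follows; the `train` choice name, the default, and the help text are assumptions, since only `--mode eval` appears in this README.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; with argv=None, argparse reads sys.argv."""
    parser = argparse.ArgumentParser(description="Train or evaluate the multitask model.")
    parser.add_argument(
        "--mode",
        choices=["train", "eval"],  # "train" is an assumed name for the default mode
        default="train",
        help="run training (default) or evaluation only",
    )
    return parser.parse_args(argv)
```

With this sketch, `parse_args([])` selects training and `parse_args(["--mode", "eval"])` selects evaluation, mirroring the two commands above.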
|
License |
|
This project is licensed under the MIT License. |
|
Acknowledgments |
|
This project was inspired by the work of [Dennis Duncan]. |
|
Contributing |
|
Contributions are welcome! Please open an issue or submit a pull request to contribute to the project. |
|
|