Project: Multitask Learning for Agent-Action Identification
Project Overview
This project aims to develop a multitask learning model for identifying agents and actions in text data. The model is trained on a custom dataset of text examples, where each example is annotated with the agents and actions present in the text.
Project Structure
The project is organized into the following directories and files:
dataset/: contains the custom dataset class for loading and processing the text data
  dataset.py: defines the dataset class
  data_collator.py: defines the data collator class
model/: contains the multitask learning model architecture
  model.py: defines the model architecture
training/: contains the training loop and evaluation code
  main.py: contains the training loop and evaluation code
data/: contains the dataset files for training, validation, and testing
  train.csv: training dataset
  val.csv: validation dataset
  test.csv: testing dataset
requirements.txt: lists the dependencies required to run the project
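The contents of dataset.py are not shown in this README. As a rough sketch of what a CSV-backed dataset class might look like, the example below reads rows with the standard library's csv module (the project itself lists pandas) and maps string labels to integer ids. The class name AgentActionDataset, the column names text, agent, and action, and the sample rows are all assumptions for illustration, not the project's actual schema or data.

```python
import csv
import io


class AgentActionDataset:
    """Hypothetical dataset: one text with one agent label and one action label per row."""

    def __init__(self, csv_file):
        # Assumed columns: text, agent, action (not confirmed by the project).
        self.rows = list(csv.DictReader(csv_file))
        # Build a stable label -> id mapping for each task.
        self.agent_to_id = self._build_vocab("agent")
        self.action_to_id = self._build_vocab("action")

    def _build_vocab(self, column):
        labels = sorted({row[column] for row in self.rows})
        return {label: idx for idx, label in enumerate(labels)}

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        row = self.rows[index]
        return {
            "text": row["text"],
            "agent_label": self.agent_to_id[row["agent"]],
            "action_label": self.action_to_id[row["action"]],
        }


# In-memory stand-in for data/train.csv so the sketch runs without files.
sample_csv = io.StringIO(
    "text,agent,action\n"
    "The robot opened the door.,robot,open\n"
    "The user closed the window.,user,close\n"
)
dataset = AgentActionDataset(sample_csv)
first_example = dataset[0]
```

Tokenization is left out on purpose; in a setup like this one it would typically happen in the data collator or inside `__getitem__`.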
Dataset
Each example is a piece of text annotated with the agents and actions it mentions. The dataset is split into training, validation, and testing sets:
Training Set: 80% of the dataset (10,000 examples)
Validation Set: 10% of the dataset (1,250 examples)
Testing Set: 10% of the dataset (1,250 examples)
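The split sizes above add up to 12,500 examples. The README does not describe how the split was produced; the sketch below shows one common approach, a shuffled index split with a fixed seed. The function name split_indices and the seed value are assumptions.

```python
import random


def split_indices(num_examples, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle example indices and cut them into train/val/test parts."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    num_train = round(num_examples * train_frac)
    num_val = round(num_examples * val_frac)
    train = indices[:num_train]
    val = indices[num_train:num_train + num_val]
    test = indices[num_train + num_val:]  # remainder goes to the test set
    return train, val, test


# 10,000 + 1,250 + 1,250 examples, as listed above.
train_idx, val_idx, test_idx = split_indices(12_500)
split_sizes = (len(train_idx), len(val_idx), len(test_idx))
```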
Model
The model is a multitask model built on the BERT architecture: a shared encoder is trained to predict agents and actions simultaneously.
Model Architecture:
BERT encoder
Two classification heads for agents and actions
Model Parameters:
BERT encoder: 110M parameters
Classification heads: 10M parameters
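The architecture above (one shared encoder, two classification heads) can be sketched in PyTorch as follows. This is not the code from model.py. It assumes sequence-level, single-label classification for each task and an unweighted sum of the two cross-entropy losses, and it uses a small ToyEncoder as a stand-in so the example runs without downloading BERT weights. All class names and sizes are hypothetical.

```python
import torch
from torch import nn


class MultiTaskClassifier(nn.Module):
    """Shared encoder feeding two classification heads (agents and actions)."""

    def __init__(self, encoder, hidden_size, num_agents, num_actions):
        super().__init__()
        self.encoder = encoder  # in the project this would be a BERT encoder
        self.agent_head = nn.Linear(hidden_size, num_agents)
        self.action_head = nn.Linear(hidden_size, num_actions)

    def forward(self, input_ids, agent_labels=None, action_labels=None):
        pooled = self.encoder(input_ids)  # shape: (batch, hidden_size)
        agent_logits = self.agent_head(pooled)
        action_logits = self.action_head(pooled)
        loss = None
        if agent_labels is not None and action_labels is not None:
            # Joint objective: unweighted sum of the two cross-entropy losses.
            loss_fn = nn.CrossEntropyLoss()
            loss = loss_fn(agent_logits, agent_labels) + loss_fn(action_logits, action_labels)
        return loss, agent_logits, action_logits


class ToyEncoder(nn.Module):
    """Stand-in for BERT: embeds token ids and mean-pools over the sequence."""

    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids):
        return self.embedding(input_ids).mean(dim=1)


torch.manual_seed(0)
model = MultiTaskClassifier(ToyEncoder(vocab_size=100, hidden_size=32), 32, num_agents=4, num_actions=6)
input_ids = torch.randint(0, 100, (2, 8))  # batch of 2 sequences, 8 tokens each
loss, agent_logits, action_logits = model(
    input_ids,
    agent_labels=torch.tensor([0, 3]),
    action_labels=torch.tensor([5, 1]),
)
```

Because both heads read the same pooled representation, gradients from both tasks update the shared encoder, which is the point of the multitask setup.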
Training
The model is trained using the Trainer class from the Hugging Face Transformers library. The training loop is defined in main.py.
Training Hyperparameters:
Batch size: 16
Number of epochs: 3
Learning rate: 1e-5
Training Time: approximately 10 hours on a single NVIDIA V100 GPU
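With the numbers above, the length of a training run in optimizer steps follows directly. The calculation below assumes a single device, no gradient accumulation, and that a final partial batch (if any) is kept; none of these details are stated in the README.

```python
import math

train_examples = 10_000   # training set size listed in the Dataset section
batch_size = 16
num_epochs = 3

# Assumes one GPU, no gradient accumulation, and that the last partial batch is kept.
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
```

Under these assumptions an epoch is 625 steps and the full run is 1,875 steps.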
Evaluation
The model is evaluated on the validation set during training.
Evaluation Metric: accuracy
Evaluation Frequency: every 500 steps
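The README does not say whether accuracy is reported per task or as a single combined number. The dependency-free sketch below computes both: one accuracy per head, plus a joint accuracy that counts an example as correct only when both predictions match. The function names and the joint metric are assumptions, and the label ids are made up for the example.

```python
def accuracy(predictions, labels):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def multitask_accuracy(agent_preds, agent_labels, action_preds, action_labels):
    """Per-task accuracy plus joint accuracy (both tasks right on the same example)."""
    joint_preds = list(zip(agent_preds, action_preds))
    joint_labels = list(zip(agent_labels, action_labels))
    return {
        "agent_accuracy": accuracy(agent_preds, agent_labels),
        "action_accuracy": accuracy(action_preds, action_labels),
        "joint_accuracy": accuracy(joint_preds, joint_labels),
    }


metrics = multitask_accuracy(
    agent_preds=[0, 1, 1, 2], agent_labels=[0, 1, 2, 2],
    action_preds=[3, 3, 0, 1], action_labels=[3, 0, 0, 1],
)
```

In this toy input each head gets 3 of 4 examples right, but only 2 of 4 examples are right on both tasks, so the joint accuracy is lower than either per-task score.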
Requirements
The project requires the following dependencies:
Python: 3.8+
Transformers: 4.20.1+
Torch: 1.12.0+
Pandas: 1.4.2+
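A quick way to check an environment against these minimums is to compare version numbers numerically rather than as strings. The helper names below are hypothetical, and the parser is deliberately simple: it keeps only the leading numeric components and ignores pre-release tags.

```python
import re
import sys

# Minimum versions listed above.
MINIMUM_VERSIONS = {
    "transformers": "4.20.1",
    "torch": "1.12.0",
    "pandas": "1.4.2",
}


def parse_version(version):
    """Turn '1.12.0+cu113' into (1, 12, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare numerically, so that '4.9.0' is correctly older than '4.20.1'."""
    return parse_version(installed) >= parse_version(minimum)


python_ok = sys.version_info >= (3, 8)
```

The installed version of a package can be looked up with `importlib.metadata.version("transformers")` (available from Python 3.8) and passed to `meets_minimum` together with the matching entry from `MINIMUM_VERSIONS`.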
Usage
To train the model, run the following command:
```bash
python main.py
```
To evaluate the model, run the following command:
```bash
python main.py --mode eval
```
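The two commands above suggest that main.py trains by default and switches to evaluation when `--mode eval` is passed. The actual argument handling is not shown in this README; a minimal sketch with argparse might look like the following. The function names and the explicit "train" value are assumptions.

```python
import argparse


def build_parser():
    """Hypothetical CLI for main.py: training by default, evaluation on request."""
    parser = argparse.ArgumentParser(description="Train or evaluate the multitask model.")
    parser.add_argument(
        "--mode",
        choices=["train", "eval"],  # "train" as an explicit value is an assumption
        default="train",
        help="what to run; defaults to training",
    )
    return parser


def describe(argv):
    """Return the selected mode for a given argument list."""
    args = build_parser().parse_args(argv)
    return args.mode


default_mode = describe([])                 # python main.py
eval_mode = describe(["--mode", "eval"])    # python main.py --mode eval
```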
License
This project is licensed under the MIT License.
Acknowledgments
This project was inspired by the work of Dennis Duncan.
Contributing
Contributions are welcome! Please open an issue or submit a pull request to contribute to the project.