# Multitask Learning for Agent-Action Identification
## Project Overview

This project aims to develop a multitask learning model for identifying agents and actions in text data. The model is trained on a custom dataset of text examples, where each example is annotated with the agents and actions present in the text.
## Project Structure

The project is organized into the following directories and files:

- `dataset/`: the custom dataset class for loading and processing the text data
  - `dataset.py`: defines the dataset class
  - `data_collator.py`: defines the data collator class
- `model/`: the multitask learning model architecture
  - `model.py`: defines the model architecture
- `training/`: the training loop and evaluation code
  - `main.py`: contains the training loop and evaluation code
- `data/`: the dataset files for training, validation, and testing
  - `train.csv`: training set
  - `val.csv`: validation set
  - `test.csv`: test set
- `requirements.txt`: lists the dependencies required to run the project
## Dataset

The dataset consists of text examples, each annotated with the agents and actions present in the text. It is split into training, validation, and test sets:

- **Training set:** 80% of the dataset (10,000 examples)
- **Validation set:** 10% of the dataset (1,250 examples)
- **Test set:** 10% of the dataset (1,250 examples)
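A quick way to inspect the splits with pandas (already a project dependency); the file paths follow the project structure above, but the CSV column names are not documented here, so check the headers before relying on them:

```python
import pandas as pd

# Load the three splits shipped in data/.
train_df = pd.read_csv("data/train.csv")
val_df = pd.read_csv("data/val.csv")
test_df = pd.read_csv("data/test.csv")

# Expected sizes per the split described above: 10000 / 1250 / 1250.
print(len(train_df), len(val_df), len(test_df))
```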
## Model

The model is a multitask learning model based on the BERT architecture, trained to predict agents and actions simultaneously.

**Model architecture:**

- BERT encoder
- Two classification heads, one for agents and one for actions

**Model parameters:**

- BERT encoder: 110M parameters
- Classification heads: 10M parameters
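A minimal sketch of the shared-encoder, two-head design described above, assuming a `bert-base-uncased` encoder (consistent with the 110M-parameter figure) and illustrative label counts; the actual implementation lives in `model/model.py`:

```python
import torch.nn as nn
from transformers import BertModel

class AgentActionModel(nn.Module):
    """Shared BERT encoder with one classification head per task."""

    def __init__(self, num_agent_labels: int, num_action_labels: int):
        super().__init__()
        # Assumed checkpoint; swap in whatever model/model.py actually loads.
        self.encoder = BertModel.from_pretrained("bert-base-uncased")
        hidden = self.encoder.config.hidden_size  # 768 for bert-base
        self.agent_head = nn.Linear(hidden, num_agent_labels)
        self.action_head = nn.Linear(hidden, num_action_labels)

    def forward(self, input_ids, attention_mask=None):
        # Pool the [CLS] representation and feed it to both heads.
        pooled = self.encoder(input_ids, attention_mask=attention_mask).pooler_output
        return self.agent_head(pooled), self.action_head(pooled)
```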
## Training

The model is trained using the `Trainer` class from the Hugging Face Transformers library. The training loop is defined in `main.py`.

**Training hyperparameters:**

- Batch size: 16
- Number of epochs: 3
- Learning rate: 1e-5
- Training time: approximately 10 hours on a single NVIDIA V100 GPU
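For reference, a sketch of how these hyperparameters map onto `TrainingArguments`; the output directory is an arbitrary placeholder, and `model`, `train_dataset`, and `val_dataset` stand in for the objects that `main.py` constructs:

```python
from transformers import Trainer, TrainingArguments

# Values mirror the hyperparameters listed above; "./checkpoints" is a
# placeholder output directory, not taken from the repository.
args = TrainingArguments(
    output_dir="./checkpoints",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=1e-5,
    evaluation_strategy="steps",  # evaluate during training (see Evaluation)
    eval_steps=500,
)

# model, train_dataset, and val_dataset are placeholders for the objects
# built from model/ and dataset/ in main.py.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
)
trainer.train()
```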
## Evaluation

The model is evaluated on the validation set during training.

- **Evaluation metric:** accuracy
- **Evaluation frequency:** every 500 steps
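With the `Trainer`, per-task accuracy can be computed through a `compute_metrics` callback. The sketch below assumes the model emits `(agent_logits, action_logits)` and that the labels arrive as a matching pair; both are assumptions about this repository's setup, not confirmed by it:

```python
import numpy as np

def compute_metrics(eval_pred):
    """Per-task accuracy. Assumes predictions are (agent_logits, action_logits)
    and label_ids are (agent_labels, action_labels) -- illustrative only."""
    agent_logits, action_logits = eval_pred.predictions
    agent_labels, action_labels = eval_pred.label_ids
    return {
        "agent_accuracy": float((np.argmax(agent_logits, -1) == agent_labels).mean()),
        "action_accuracy": float((np.argmax(action_logits, -1) == action_labels).mean()),
    }
```

Pass it to the `Trainer` via `compute_metrics=compute_metrics`.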
## Requirements

The project requires the following dependencies:

- Python 3.8+
- Transformers 4.20.1+
- Torch 1.12.0+
- Pandas 1.4.2+
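The pinned versions live in `requirements.txt`; install them with:

```bash
pip install -r requirements.txt
```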
## Usage

To train the model, run:

```bash
python main.py
```

To evaluate the model, run:

```bash
python main.py --mode eval
```
## License

This project is licensed under the MIT License.
## Acknowledgments

This project was inspired by the work of [Dennis Duncan].
## Contributing

Contributions are welcome! Please open an issue or submit a pull request.