dnnsdunca/agentic-Transformer

	Project: Multitask Learning for Agent-Action Identification

	Project Overview
	This project aims to develop a multitask learning model for identifying agents and actions in text data. The model is trained on a custom dataset of text examples, where each example is annotated with the agents and actions present in the text.
	Project Structure
	The project is organized into the following directories and files:
	dataset/: contains the custom dataset class and data collator for loading and processing the text data
		dataset.py: defines the dataset class
		data_collator.py: defines the data collator class
	model/: contains the multitask learning model architecture
		model.py: defines the model architecture
	training/: contains the training loop and evaluation code
		main.py: contains the training loop and evaluation code
	data/: contains the dataset files for training, validation, and testing
		train.csv: training dataset
		val.csv: validation dataset
		test.csv: testing dataset
	requirements.txt: lists the dependencies required to run the project
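To illustrate what a data collator such as the one in data_collator.py typically does, the sketch below pads already-tokenized examples to a common length and builds the matching attention masks. It is a minimal, dependency-free sketch and not the project's actual code: the function name `collate_batch`, the field names (`input_ids`, `agent_label`, `action_label`), and the padding id of 0 are all assumptions.

```python
from typing import Dict, List

# Assumption: 0 is the padding token id (as in the standard BERT vocabularies).
PAD_TOKEN_ID = 0


def collate_batch(examples: List[Dict], pad_token_id: int = PAD_TOKEN_ID) -> Dict[str, List]:
    """Pad every example to the longest sequence in the batch.

    Each example is assumed (hypothetically) to be a dict with keys
    'input_ids', 'agent_label', and 'action_label'.
    """
    max_len = max(len(ex["input_ids"]) for ex in examples)
    batch = {"input_ids": [], "attention_mask": [], "agent_labels": [], "action_labels": []}
    for ex in examples:
        ids = list(ex["input_ids"])
        n_pad = max_len - len(ids)
        # Real tokens get mask 1, padding positions get mask 0.
        batch["input_ids"].append(ids + [pad_token_id] * n_pad)
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
        batch["agent_labels"].append(ex["agent_label"])
        batch["action_labels"].append(ex["action_label"])
    return batch


example_batch = collate_batch([
    {"input_ids": [101, 7592, 102], "agent_label": 1, "action_label": 0},
    {"input_ids": [101, 7592, 2088, 999, 102], "agent_label": 0, "action_label": 2},
])
```

Padding to the longest sequence in each batch, rather than to a fixed global length, keeps short batches cheap. A real collator would usually convert the lists to tensors as a final step.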
	Dataset
	The dataset consists of text examples, where each example is annotated with the agents and actions present in the text. The dataset is split into training, validation, and testing sets.
	Training Set: 80% of the dataset (10,000 examples)
	Validation Set: 10% of the dataset (1,250 examples)
	Testing Set: 10% of the dataset (1,250 examples)
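The split sizes above imply 12,500 examples in total. As a rough sketch of how such an 80/10/10 split could be produced, the snippet below shuffles example indices with a fixed seed and slices them. The function name `split_indices` and the seed are assumptions; the README does not say how the actual split was made.

```python
import random
from typing import List, Tuple


def split_indices(
    n_examples: int, train_frac: float = 0.8, val_frac: float = 0.1, seed: int = 0
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle indices reproducibly and cut them into train/val/test parts."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_examples * train_frac)
    n_val = round(n_examples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(12_500)
print(len(train_idx), len(val_idx), len(test_idx))  # → 10000 1250 1250
```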
	Model
	The model is a multitask learning model based on the BERT architecture. The model is trained to predict both agents and actions simultaneously.
	Model Architecture:
	BERT encoder
	Two classification heads for agents and actions
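The architecture described above can be sketched in PyTorch as a shared encoder feeding two heads. This is an illustrative sketch under several assumptions, not the contents of model.py: the class name `MultitaskAgentActionModel` is hypothetical, each example is assumed to carry one agent label and one action label, the heads are single linear layers on BERT's pooled output, and the two cross-entropy losses are simply summed. A tiny randomly initialised `BertConfig` is used so the example runs without downloading pretrained weights.

```python
import torch
from torch import nn
from transformers import BertConfig, BertModel


class MultitaskAgentActionModel(nn.Module):
    """Shared BERT encoder with two classification heads (hypothetical sketch)."""

    def __init__(self, encoder: BertModel, num_agent_labels: int, num_action_labels: int):
        super().__init__()
        self.encoder = encoder
        hidden = encoder.config.hidden_size
        # Assumption: one linear layer per task on top of the pooled [CLS] output.
        self.agent_head = nn.Linear(hidden, num_agent_labels)
        self.action_head = nn.Linear(hidden, num_action_labels)

    def forward(self, input_ids, attention_mask=None, agent_labels=None, action_labels=None):
        pooled = self.encoder(input_ids=input_ids, attention_mask=attention_mask).pooler_output
        agent_logits = self.agent_head(pooled)
        action_logits = self.action_head(pooled)
        loss = None
        if agent_labels is not None and action_labels is not None:
            ce = nn.CrossEntropyLoss()
            # Assumption: unweighted sum of the two task losses.
            loss = ce(agent_logits, agent_labels) + ce(action_logits, action_labels)
        return {"loss": loss, "agent_logits": agent_logits, "action_logits": action_logits}


# Tiny random encoder for demonstration only; a real run would load pretrained BERT.
tiny_config = BertConfig(
    vocab_size=100, hidden_size=32, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=64,
)
model = MultitaskAgentActionModel(BertModel(tiny_config), num_agent_labels=4, num_action_labels=6)
input_ids = torch.randint(0, 100, (3, 8))  # batch of 3 sequences, 8 tokens each
out = model(
    input_ids,
    attention_mask=torch.ones_like(input_ids),
    agent_labels=torch.tensor([0, 1, 2]),
    action_labels=torch.tensor([5, 0, 3]),
)
```

Sharing the encoder means both tasks update the same representation, which is the usual motivation for multitask learning. Linear heads are used here only for brevity; the actual heads may be larger.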
	Model Parameters:
	BERT encoder: 110M parameters
	Classification heads: 10M parameters
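The 110M figure for the encoder is consistent with the standard BERT-base configuration. The arithmetic below reproduces it, assuming the bert-base-uncased hyperparameters (vocabulary of 30,522; hidden size 768; 12 layers; feed-forward size 3,072; 512 positions; 2 segment types) and counting the pooler. The README does not state which checkpoint is used, so these values are an assumption.

```python
# Assumed bert-base-uncased hyperparameters (not stated in the README).
VOCAB, HIDDEN, LAYERS, FFN, MAX_POS, TYPES = 30_522, 768, 12, 3_072, 512, 2


def linear(n_in: int, n_out: int) -> int:
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out


layer_norm = 2 * HIDDEN  # scale and shift vectors

embeddings = (VOCAB + MAX_POS + TYPES) * HIDDEN + layer_norm
per_layer = (
    4 * linear(HIDDEN, HIDDEN)  # query, key, value, attention output
    + layer_norm
    + linear(HIDDEN, FFN)       # feed-forward expansion
    + linear(FFN, HIDDEN)       # feed-forward projection
    + layer_norm
)
pooler = linear(HIDDEN, HIDDEN)

total = embeddings + LAYERS * per_layer + pooler
print(f"{total:,}")  # → 109,482,240
```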
	Training
	The model is trained using the Trainer class from the Hugging Face Transformers library. The training loop is defined in main.py.
	Training Hyperparameters:
	Batch size: 16
	Number of epochs: 3
	Learning rate: 1e-5
	Training Time: approximately 10 hours on a single NVIDIA V100 GPU
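These hyperparameters determine how many optimizer steps a run takes and how often evaluation fires at the 500-step interval given under Evaluation. The short calculation below works that out, assuming a single device, no gradient accumulation, and that the last partial batch is kept; none of these settings are stated in the README, and the function name `training_schedule` is hypothetical.

```python
import math
from typing import List, Tuple


def training_schedule(
    n_train: int, batch_size: int, epochs: int, eval_every: int
) -> Tuple[int, int, List[int]]:
    """Return steps per epoch, total steps, and the steps at which evaluation runs."""
    steps_per_epoch = math.ceil(n_train / batch_size)
    total_steps = steps_per_epoch * epochs
    eval_steps = list(range(eval_every, total_steps + 1, eval_every))
    return steps_per_epoch, total_steps, eval_steps


schedule = training_schedule(n_train=10_000, batch_size=16, epochs=3, eval_every=500)
print(schedule)  # → (625, 1875, [500, 1000, 1500])
```

Under these assumptions the run takes 1,875 steps and evaluates on the validation set three times.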
	Evaluation
	The model is evaluated on the validation set during training.
	Evaluation Metric: accuracy
	Evaluation Frequency: every 500 steps
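With two prediction heads, accuracy can be reported in more than one way, and the README does not say which is used. The sketch below computes per-task accuracy plus a stricter joint accuracy (both predictions must be correct). The function names and the choice to report all three values are assumptions for illustration.

```python
from typing import Dict, Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that exactly match their labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equally long")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def multitask_accuracy(
    agent_preds: Sequence[int], agent_labels: Sequence[int],
    action_preds: Sequence[int], action_labels: Sequence[int],
) -> Dict[str, float]:
    """Per-task accuracy and joint accuracy over the same examples."""
    if len(agent_labels) != len(action_labels):
        raise ValueError("both tasks must cover the same examples")
    both_correct = [
        ap == al and cp == cl
        for ap, al, cp, cl in zip(agent_preds, agent_labels, action_preds, action_labels)
    ]
    return {
        "agent_accuracy": accuracy(agent_preds, agent_labels),
        "action_accuracy": accuracy(action_preds, action_labels),
        "joint_accuracy": sum(both_correct) / len(both_correct),
    }


metrics = multitask_accuracy(
    agent_preds=[0, 1, 1, 2], agent_labels=[0, 1, 2, 2],
    action_preds=[3, 3, 0, 1], action_labels=[3, 0, 0, 1],
)
print(metrics)  # → {'agent_accuracy': 0.75, 'action_accuracy': 0.75, 'joint_accuracy': 0.5}
```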
	Requirements
	The project requires the following dependencies:
	Python: 3.8+
	Transformers: 4.20.1+
	Torch: 1.12.0+
	Pandas: 1.4.2+
	Usage
	To train the model, run the following command:
	python main.py
	To evaluate the model, run the following command:
	python main.py --mode eval
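For readers wondering how the two commands above might be distinguished inside main.py, here is one plausible sketch using `argparse`. Only the `--mode eval` flag appears in the README; the `train` value, its use as the default, and the function names `build_parser` and `select_mode` are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Train or evaluate the multitask model.")
    # Assumption: training is the default when --mode is not given.
    parser.add_argument(
        "--mode", choices=["train", "eval"], default="train",
        help="'train' runs the training loop, 'eval' runs evaluation only",
    )
    return parser


def select_mode(argv: Optional[List[str]] = None) -> str:
    """Parse command-line arguments and return the selected mode."""
    args = build_parser().parse_args(argv)
    return args.mode


print(select_mode([]))                  # → train
print(select_mode(["--mode", "eval"]))  # → eval
```

Restricting `--mode` with `choices` makes a mistyped mode fail immediately with a usage message instead of silently starting a training run.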
	License
	This project is licensed under the MIT License.
	Acknowledgments
	This project was inspired by the work of Dennis Duncan.
	Contributing
	Contributions are welcome! Please open an issue or submit a pull request to contribute to the project.