# Summarization

This project is a machine learning pipeline for natural language processing tasks. It contains a set of scripts and modules that allow you to train and evaluate various models on your own data.

## Description

This repository contains sample code that demonstrates how to train a model for text summarization. The main focus is to provide a basic project template from which the model can be smoothly deployed and used for inference.

## Frameworks used:

* PyTorch
* Transformers

## Project Structure

* `pipeline` This directory contains the code for the main data pipeline.
  - ``: Code for the training pipeline.
  - ``: Code for the inference pipeline.
* `steps` This directory includes the individual steps of the data pipeline.
  - ``: Code for evaluating the model.
  - ``: Code for ingesting data into the pipeline.
  - ``: Data preprocessing code.
  - ``: Model training code.
* `utils` This directory contains utility functions used throughout the project.
  - ``: General utility functions.
* `` This script is the entry point for running the entire data pipeline.
* `Dockerfile` The Dockerfile for building a Docker image of this project.
* `requirements.txt` List of Python packages required to run the project. Install them using `pip install -r requirements.txt`.

## Demo

I have already trained a t5-base model and uploaded it to Hugging Face. The Streamlit demo can be accessed from the following link.

## License

This project is licensed under the MIT License - see the LICENSE file for details.
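
## Example: how the steps might chain together

The project structure described in this README implies a training pipeline that runs the steps in order: ingest, preprocess, train, evaluate. The following is a minimal sketch of that composition, not the repository's actual code. Every name here (`Example`, `ingest_data`, `preprocess`, `train_model`, `evaluate`, `run_training_pipeline`) is hypothetical, the data is made up in memory, and the "training" step returns a first-sentence baseline instead of fine-tuning t5-base so that the sketch runs without PyTorch or Transformers. The `summarize: ` prefix is the usual T5 task prefix; whether this project applies it is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

PREFIX = "summarize: "  # common T5 task prefix; assumed, not confirmed by the repo


@dataclass
class Example:
    text: str
    summary: str


def ingest_data() -> List[Example]:
    """Hypothetical ingestion step: returns a tiny in-memory dataset."""
    return [
        Example("The cat sat on the mat.   It then fell asleep.", "The cat sat on the mat."),
        Example("Rain is expected today. Bring an umbrella.", "Rain is expected today."),
    ]


def preprocess(examples: List[Example]) -> List[Example]:
    """Hypothetical preprocessing step: normalize whitespace, add the task prefix."""
    return [
        Example(PREFIX + " ".join(e.text.split()), " ".join(e.summary.split()))
        for e in examples
    ]


def train_model(examples: List[Example]) -> Callable[[str], str]:
    """Placeholder for training: returns a first-sentence baseline summarizer.

    A real implementation would fine-tune a seq2seq model on `examples`.
    """
    def summarize(text: str) -> str:
        body = text[len(PREFIX):] if text.startswith(PREFIX) else text
        first = body.split(". ")[0]
        return first if first.endswith(".") else first + "."

    return summarize


def evaluate(model: Callable[[str], str], examples: List[Example]) -> Dict[str, float]:
    """Hypothetical evaluation step: exact-match rate against reference summaries."""
    hits = sum(model(e.text) == e.summary for e in examples)
    return {"exact_match": hits / len(examples)}


def run_training_pipeline() -> Tuple[Callable[[str], str], Dict[str, float]]:
    """Chain the steps in the order the project structure suggests."""
    data = preprocess(ingest_data())
    model = train_model(data)
    return model, evaluate(model, data)
```

Calling `run_training_pipeline()` returns the summarizer together with a metrics dictionary. Keeping each step as a plain function with explicit inputs and outputs is one way to let the training and inference pipelines reuse the same preprocessing code.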