---
license: mit
title: Text Summarization
sdk: streamlit
emoji: 🔥
colorFrom: blue
colorTo: purple
---
# Summarization
This project is a machine learning pipeline for natural language processing tasks. It contains a set of scripts and modules that allow you to train and evaluate various models on your own data.
## Description
This repository contains sample code that demonstrates how to train a model for text summarization. Its main goal is to provide a basic project template from which the trained model can be deployed and used for inference.
## Frameworks used
* PyTorch
* Transformers
## Project Structure
* `pipeline`
  This directory contains the code for the main data pipeline.
  - `training_pipeline.py`: Code for the training pipeline.
  - `inference_pipeline.py`: Code for the inference pipeline.
* `steps`
  This directory contains the individual steps of the data pipeline.
  - `evaluation.py`: Code for evaluating the model.
  - `ingest_data.py`: Code for ingesting data into the pipeline.
  - `preprocess.py`: Data preprocessing code.
  - `model_train.py`: Model training code.
* `utils`
  This directory contains utility functions used throughout the project.
  - `utils.py`: General utility functions.
* `run_pipeline.py`
  This script is the entry point for running the entire data pipeline.
* `Dockerfile`
  The Dockerfile for building a Docker image of this project.
* `requirements.txt`
  The Python dependencies for this project.
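
To illustrate how the steps listed above might fit together, here is a minimal, dependency-free sketch of an entry point that chains ingestion, preprocessing, training, and evaluation. All function names, signatures, and the `Example` type are hypothetical stand-ins, not the actual contents of `steps/` or `run_pipeline.py`. The "model" is a trivial first-sentence baseline used in place of a real PyTorch/Transformers model so that the example runs anywhere.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Example:
    """One training pair (hypothetical type, not taken from the repository)."""
    text: str
    summary: str


def ingest_data() -> List[Example]:
    """Stand-in for steps/ingest_data.py: return raw (text, summary) pairs."""
    return [
        Example("  The cat sat on the mat. It purred.  ", "A cat sat."),
        Example("Rain fell all day. Streets flooded.", "Rain fell all day."),
    ]


def preprocess(examples: List[Example]) -> List[Example]:
    """Stand-in for steps/preprocess.py: collapse and trim whitespace."""
    return [
        Example(" ".join(e.text.split()), " ".join(e.summary.split()))
        for e in examples
    ]


def model_train(examples: List[Example]) -> Callable[[str], str]:
    """Stand-in for steps/model_train.py.

    Returns a first-sentence baseline instead of a trained neural model;
    the training data is ignored in this sketch.
    """
    def summarize(text: str) -> str:
        first = text.split(". ")[0]
        return first if first.endswith(".") else first + "."
    return summarize


def evaluate(model: Callable[[str], str], examples: List[Example]) -> float:
    """Stand-in for steps/evaluation.py: fraction of exact-match summaries."""
    hits = sum(model(e.text) == e.summary for e in examples)
    return hits / len(examples)


def run_pipeline() -> Tuple[Callable[[str], str], float]:
    """Chain the steps in order: ingest, preprocess, train, evaluate."""
    raw = ingest_data()
    clean = preprocess(raw)
    model = model_train(clean)
    score = evaluate(model, clean)
    return model, score


if __name__ == "__main__":
    _, score = run_pipeline()
    print(f"exact-match score: {score:.2f}")
```

Keeping each step as a plain function with explicit inputs and outputs makes it straightforward to swap the baseline for a real model later, or to reuse the preprocessing step in an inference pipeline.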
## License
This project is licensed under the MIT License. See the LICENSE file for details.