---
title: T5-Summarisation
emoji: 
colorFrom: yellow
colorTo: red
sdk: streamlit
app_file: src/visualization/visualize.py
pinned: false
---

summarization
==============================

T5 Summarisation Using PyTorch Lightning

Instructions
------------
1. Clone the repo.
1. Edit `params.yml` to change the parameters used to train the model (a hypothetical example is sketched after this list).
1. Run `make dirs` to create the missing parts of the directory structure described below.
1. *Optional:* Run `make virtualenv` to create a Python virtual environment. Skip this step if you use conda or another environment manager.
    1. Run `source env/bin/activate` to activate the virtualenv.
1. Run `make requirements` to install the required Python packages.
1. Process your data, then train and evaluate your model with `make run` (the full command sequence is consolidated in the shell sketch below).
1. When you're happy with the results, commit your files (including the `.dvc` files) to git.
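The README doesn't show the contents of `params.yml`, so the snippet below is purely a hypothetical sketch of the kind of keys such a file might hold for fine-tuning a T5 model; every key and value here is an assumption, and the actual `params.yml` in the repo root is the source of truth.

```yaml
# Hypothetical params.yml -- the keys below are illustrative assumptions,
# not the repo's actual schema; check params.yml in the repo root.
model_params:
  model_name: t5-small        # Hugging Face checkpoint to fine-tune
  batch_size: 8
  epochs: 5
  learning_rate: 3.0e-4
data_params:
  source_max_length: 512      # max tokens in the input document
  target_max_length: 128      # max tokens in the generated summary
```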
 
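For reference, the steps above condense into the following shell session. This is a minimal sketch assuming the default Makefile targets named in the instructions and a virtualenv created at `env/`; the repo URL is a placeholder, and you should adapt the environment steps to your own setup.

```sh
# Clone the repo (URL is a placeholder) and enter it
git clone <repo-url>
cd <repo-dir>

# Edit params.yml to set training parameters, then create missing directories
make dirs

# Optional: create and activate a virtualenv (skip if using conda etc.)
make virtualenv
source env/bin/activate

# Install dependencies, then process data, train, and evaluate in one go
make requirements
make run

# Once you're happy with the results, commit everything (including .dvc files)
git add -A
git commit -m "Train and evaluate T5 summarisation model"
```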
Project Organization
------------

    ├── LICENSE
    ├── Makefile           <- Makefile with commands like `make dirs` or `make clean`
    ├── README.md          <- The top-level README for developers using this project.
    ├── data
    │   ├── processed      <- The final, canonical data sets for modeling.
    │   └── raw            <- The original, immutable data dump.

    ├── models             <- Trained and serialized models, model predictions, or model summaries

    ├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
    │                         the creator's initials, and a short `-` delimited description, e.g.
    │                         `1.0-jqp-initial-data-exploration`.
    ├── references         <- Data dictionaries, manuals, and all other explanatory materials.

    ├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
    │   ├── metrics.txt    <- Relevant metrics after evaluating the model.
    │   └── training_metrics.txt    <- Relevant metrics from training the model.

    ├── requirements.txt   <- The requirements file for reproducing the analysis environment

    ├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
    ├── src                <- Source code for use in this project.
    │   ├── __init__.py    <- Makes src a Python module
    │   │
    │   ├── data           <- Scripts to download or generate data
    │   │   ├── make_dataset.py
    │   │   └── process_data.py
    │   │
    │   ├── models         <- Scripts to train models 
    │   │   ├── predict_model.py
    │   │   ├── train_model.py
    │   │   ├── evaluate_model.py
    │   │   └── model.py
    │   │
    │   └── visualization  <- Scripts to create exploratory and results oriented visualizations
    │       └── visualize.py

    ├── tox.ini            <- tox file with settings for running tox; see tox.testrun.org
    └── data.dvc           <- DVC stage file for training a model on the processed data.


--------