Metadata-Version: 2.1
Name: SummerTime
Version: 0.1
Summary: A summarization mode
Home-page: https://github.com/LILYlab
Author: Ansong Ni, Murori Mutuma, Zhangir Azerbayev, Yusen Zhang, Tao Yu, Dragomir Radev
Author-email: ansong.ni@yale.edu, murorimutuma@gmail.com, zhangir.azerbayev@yale.edu
License: UNKNOWN
Description: # SummerTime
        
        A library to help users choose appropriate summarization tools based on their specific tasks or needs. Includes models, evaluation metrics, and datasets.
        
        
        
        ## Installation and setup
        
        #### Create and activate a new `conda` environment:
        ```bash
        conda create -n st python=3.7
        conda activate st
        ```
        
        #### `pip` dependencies for local demo:
        ```bash
        pip install -r requirements.txt
        ```
        
        
        
        ## Quick Start
        The following example imports the `model` package, initializes the default summarizer, and summarizes a sample document.
        ```python
        import model as st_model
        
        model = st_model.summarizer()
        documents = [
            """ PG&E stated it scheduled the blackouts in response to forecasts for high winds amid dry conditions. 
            The aim is to reduce the risk of wildfires. Nearly 800 thousand customers were scheduled to be affected 
            by the shutoffs which were expected to last through at least midday tomorrow."""
        ]
        model.summarize(documents)
        
        # ["California's largest electricity provider has turned off power to hundreds of thousands of customers."]
        ```
        
        For more examples, run the `demo.ipynb` Jupyter notebook. To start it on localhost:
        ```bash
        jupyter notebook demo.ipynb
        ```
        
        
        
        ## Models
        Import and initialization:
        ```python
        import model as st_model
        
        # Use the `st_model` alias defined by the import above
        default_model = st_model.summarizer()
        bart_model = st_model.bart_model.BartModel()
        pegasus_model = st_model.pegasus_model.PegasusModel()
        lexrank_model = st_model.lexrank_model.LexRankModel()
        textrank_model = st_model.textrank_model.TextRankModel()
        ```
        
        All models can be initialized with the following optional arguments:
        ```python
        def __init__(self,
                     trained_domain: str = None,
                     max_input_length: int = None,
                     max_output_length: int = None):
        ```
        
        All models implement the following methods:
        ```python
        def summarize(self,
                      corpus: Union[List[str], List[List[str]]],
                      queries: List[str] = None) -> List[str]:
        
        def show_capability(cls) -> None:
        
        def generate_basic_description(cls) -> str:
        ```
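        
        As a rough illustration of this interface, the sketch below defines a hypothetical stand-in class, `LeadSentenceModel`, which is not part of SummerTime. It assumes that each `corpus` entry is a single document string, that the length limits count characters, and that `show_capability` and `generate_basic_description` are class methods, as their `cls` parameter suggests. It returns the first sentence of each document instead of running a real model.
        
        ```python
        from typing import List, Optional
        
        
        class LeadSentenceModel:
            """Hypothetical stand-in, not a SummerTime class."""
        
            def __init__(self,
                         trained_domain: Optional[str] = None,
                         max_input_length: Optional[int] = None,
                         max_output_length: Optional[int] = None):
                self.trained_domain = trained_domain
                self.max_input_length = max_input_length
                self.max_output_length = max_output_length
        
            def summarize(self, corpus: List[str],
                          queries: Optional[List[str]] = None) -> List[str]:
                summaries = []
                for document in corpus:
                    # Collapse whitespace, then truncate (assumption: limits are in characters).
                    text = " ".join(document.split())[:self.max_input_length]
                    # Keep everything up to and including the first sentence-ending period.
                    head, sep, _ = text.partition(". ")
                    summary = head + "." if sep else text
                    summaries.append(summary[:self.max_output_length])
                return summaries
        
            @classmethod
            def generate_basic_description(cls) -> str:
                return f"{cls.__name__}: returns the first sentence of each document."
        
            @classmethod
            def show_capability(cls) -> None:
                print(cls.generate_basic_description())
        
        
        model = LeadSentenceModel()
        summaries = model.summarize(
            ["PG&E scheduled the blackouts. The aim is to reduce the risk of wildfires."]
        )
        ```
        
        Because the stand-in exposes the same three methods, it can be swapped for any of the models above when experimenting with the calling code.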
        
        
        
        ## Evaluation
        Import and initialization:
        ```python
        import eval as st_eval
        
        bert_eval = st_eval.bertscore()
        bleu_eval = st_eval.bleu_eval()
        rouge_eval = st_eval.rouge()
        rougewe_eval = st_eval.rougewe()
        ```
        
        All evaluation metrics are initialized with the following signature:
        ```python
        def __init__(self, metric_name):
        ```
        
        All evaluation metric objects implement the following methods:
        ```python
        def evaluate(self, model, data):
        
        def get_dict(self, keys):
        ```
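        
        This README does not specify the format of `data` or the return value of `get_dict`, so the following is only a sketch under stated assumptions. The classes `UnigramRecall` and `EchoModel` are hypothetical and not part of SummerTime. The sketch assumes `data` is a list of (document, reference summary) pairs, that `get_dict` builds a score dictionary with one zeroed entry per key, and that `evaluate` returns the filled dictionary.
        
        ```python
        from typing import Dict, List, Tuple
        
        
        class UnigramRecall:
            """Hypothetical metric, not a SummerTime class."""
        
            def __init__(self, metric_name: str = "unigram_recall"):
                self.metric_name = metric_name
        
            def get_dict(self, keys: List[str]) -> Dict[str, float]:
                # Assumption: one zeroed score entry per requested key.
                return {key: 0.0 for key in keys}
        
            def evaluate(self, model, data: List[Tuple[str, str]]) -> Dict[str, float]:
                # Assumption: data holds (document, reference summary) pairs.
                scores = self.get_dict([self.metric_name])
                if not data:
                    return scores
                summaries = model.summarize([document for document, _ in data])
                total = 0.0
                for summary, (_, reference) in zip(summaries, data):
                    reference_tokens = set(reference.lower().split())
                    summary_tokens = set(summary.lower().split())
                    if reference_tokens:
                        # Fraction of reference words that also appear in the summary.
                        total += len(reference_tokens & summary_tokens) / len(reference_tokens)
                scores[self.metric_name] = total / len(data)
                return scores
        
        
        class EchoModel:
            """Hypothetical model that returns each document unchanged."""
        
            def summarize(self, corpus, queries=None):
                return list(corpus)
        
        
        scores = UnigramRecall().evaluate(EchoModel(), [("the cat sat", "the cat")])
        ```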
        
        
        ## Datasets
        Import and initialization:
        ```python
        import dataset.stdatasets as st_data
        ```
        
        ## Contributors
        This repository is built by the [LILY Lab](https://yale-lily.github.io/) at Yale University, led by Prof. [Dragomir Radev](https://cpsc.yale.edu/people/dragomir-radev). The main contributors are [Ansong Ni](https://niansong1996.github.io), Zhangir Azerbayev, Troy Feng, Murori Mutuma, and Yusen Zhang (Penn State). For comments and questions, please open an issue.
        
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown