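import importlib

import gradio as gr
from util import summarizer, textproc
from util.examples import entries

# Re-import the local modules so edits made during development take effect.
importlib.reload(summarizer)
importlib.reload(textproc)

# Initialize the models once at startup and hand them to the summarizer.
model_collection = summarizer.init_models()
patent_summarizer = summarizer.PatentSummarizer(model_collection)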
iface = gr.Interface(
    patent_summarizer.pipeline,
    theme="huggingface",
    examples=entries,
    examples_per_page=4,
    inputs=[
        gr.inputs.Textbox(label="Patent Information", 
                          placeholder="US10125002B2 or https://patents.google.com/patent/US10125002B2 ...", 
                          lines=1,
                          default="US10125002B2"),
        gr.inputs.CheckboxGroup(summarizer.summary_options, 
                                default=summarizer.summary_options,
                                label="Summaries to Generate"),
        gr.inputs.Dropdown(summarizer.model_names,
                           default=summarizer.model_names[3],
                           label="Abstract Model"),
        gr.inputs.Dropdown(summarizer.model_names,
                           default=summarizer.model_names[0],
                           label="Background Model"),
        gr.inputs.Dropdown(summarizer.model_names,
                           default=summarizer.model_names[2],
                           label="Claims Model"),
        gr.inputs.Checkbox(default=True, 
                           label="Collate Claims", 
                           optional=False),
        gr.inputs.Slider(minimum=250,
                         maximum=1000,
                         step=10,
                         default=250,
                         label="Input Document Word Limit"),
    ],
    outputs=[
        gr.outputs.Textbox(label="Abstract Summary"),
        gr.outputs.Textbox(label="Background Summary"),
        gr.outputs.Textbox(label="Sample Claims Summary")
    ],
    title="Patent Summarizer 📖",
    description="""
    ✏️ Provides an interface for user input
    📂 Retrieves and parses the document from Google Patents
    📑 Returns summaries for the abstract, background, and/or claims.
    
    Check the end of the app for more details.
    """,
    article="""
    v.1.0.0

    Reading through patent documents is often a long and tedious task.
    One may have to go through several pages manually to determine
    whether a patent is relevant to a prior art search.
    
    This application is meant to automate the initial pass through
    patents so that the inventor or researcher spends less time
    filtering documents.
    
    Notes:
    - Increasing 'Input Document Word Limit' may improve results but will
    increase inference time.
    - Setting 'Collate Claims' to False summarizes each claim
    individually, which greatly increases inference time.

    Models explored:
    🤖 Big Bird: Transformers for Longer Sequences
    https://arxiv.org/abs/2007.14062, https://huggingface.co/google/bigbird-pegasus-large-bigpatent
    🤖 T5
    https://arxiv.org/pdf/1910.10683.pdf, https://huggingface.co/cnicu/t5-small-booksum
    🤖 BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
    https://arxiv.org/abs/1910.13461, https://huggingface.co/sshleifer/distilbart-cnn-6-6
    🤖 PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
    https://arxiv.org/abs/1912.08777, https://huggingface.co/google/pegasus-xsum
    """
    
)

iface.launch(enable_queue=True)