sankalp151 committed on
Commit
8850784
1 Parent(s): 8305669

Code for "How to get started" added

Files changed (1)
  1. README.md +159 -0
README.md CHANGED
@@ -1,3 +1,162 @@
1
  ---
2
  license: mit
3
  ---
1
  ---
2
  license: mit
3
+ language:
4
+ - en
5
+ pipeline_tag: text2text-generation
6
+ tags:
7
+ - legal
8
  ---
9
+ # Model Card for t5-base-legen
10
+
11
+ <!-- Provide a quick summary of what the model is/does. -->
12
+ This model is one stage in a pipeline for extracting information from complex sentences. Given a complex sentence, it generates a discourse tree.
13
+ A discourse tree contains the simple sentences obtained by splitting the input, together with the relationships between them.
14
+
15
+ ## Model Details
16
+
17
+ ### Model Description
18
+
19
+ <!-- Provide a longer summary of what this model is. -->
20
+ The model is a fine-tuned sequence-to-sequence model intended for information-extraction pipelines that operate on complex sentences. It takes a complex sentence as input and generates a discourse tree as text.
21
+ The tree consists of the simple sentences into which the input is split and the relationships that hold between those sentences.
22
+
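The exact output format of the model is not documented in this card. As a rough illustration only, a discourse tree of the kind described above could be held in memory as leaves carrying simple sentences and internal nodes carrying a relation label. The class names, the helper `simple_sentences`, and the `CONTRAST` label are hypothetical and are not part of the model's API.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    """A simple sentence produced by splitting a complex one."""
    text: str


@dataclass
class Relation:
    """An internal node: a relation label over its child nodes."""
    label: str  # hypothetical label, e.g. "CONTRAST"
    children: List[Union["Relation", Leaf]] = field(default_factory=list)


def simple_sentences(node: Union[Relation, Leaf]) -> List[str]:
    """Collect the simple sentences of a tree in left-to-right order."""
    if isinstance(node, Leaf):
        return [node.text]
    sentences: List[str] = []
    for child in node.children:
        sentences.extend(simple_sentences(child))
    return sentences


# Illustrative tree for a made-up sentence; not real model output.
tree = Relation("CONTRAST", [
    Leaf("The tenant paid the deposit."),
    Leaf("The landlord refused to hand over the keys."),
])
print(simple_sentences(tree))
```

Mapping the model's generated string onto such a structure would need a parser written against the actual output format, which is left open here.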
23
+
24
+ - **Developed by:** BITS Hyderabad
25
+ <!-- - **Funded by [optional]:** [More Information Needed] -->
26
+ <!-- - **Shared by [optional]:** [More Information Needed] -->
27
+ - **Model type:** Language model
28
+ - **Language(s) (NLP):** English
29
+ <!-- - **License:** [More Information Needed] -->
30
+ - **Finetuned from model [optional]:** [flan-t5-base](https://huggingface.co/google/flan-t5-base)
31
+
32
+
33
+ ## Uses
34
+
35
+
36
+
37
+ ### Direct Use
38
+
39
+ <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
40
+ The model is already fine-tuned and can be used directly for inference, without further training.
41
+
42
+ [More Information Needed]
43
+
44
+
45
+ ### Recommendations
46
+
47
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
48
+
49
+
50
+ ## How to Get Started with the Model
51
+
52
+ Use the code below to get started with the model.
53
+
54
+ ```python
55
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
56
+ import spacy
57
+
58
+ tokenizer = AutoTokenizer.from_pretrained("bphclegalie/t5-base-legen", token = True)
59
+ model = AutoModelForSeq2SeqLM.from_pretrained("bphclegalie/t5-base-legen", token = True)
60
+
61
+ nlp = spacy.load("en_core_web_sm")
62
+
63
+
64
+ def get_discourse_tree(text):
65
+     tokenized_text = " ".join([t.text for t in nlp(text)])  # re-join spaCy tokens with single spaces
66
+
67
+     input_ids = tokenizer(tokenized_text, max_length=384, truncation=True, return_tensors="pt").input_ids
68
+     outputs = model.generate(input_ids=input_ids, max_length=128)
69
+
70
+     answer = [tokenizer.decode(output, skip_special_tokens=True) for output in outputs]
71
+     return " ".join(answer)
72
+
73
+
74
+ ```
75
+
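The tokenizer call above truncates inputs at 384 tokens, so a long passage may lose its tail. One possible workaround, sketched here on the assumption that sentence-level input suits the model, is to feed the text one sentence at a time. The names `split_sentences` and `trees_for_document` and the regex splitter are illustrative only. The generation function is passed in as a parameter (for example `get_discourse_tree`), so the helper itself needs neither the model nor spaCy.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def trees_for_document(text: str, tree_fn: Callable[[str], str]) -> List[str]:
    """Apply tree_fn (e.g. get_discourse_tree) to each sentence in turn."""
    return [tree_fn(sentence) for sentence in split_sentences(text)]
```

With the model loaded as shown earlier, a call would look like `trees_for_document(document, get_discourse_tree)`. The regex splitter will mis-split abbreviations such as "v." or "Sec.", which are common in legal text, so a proper sentence segmenter is likely a better choice in practice.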
76
+ [More Information Needed]
77
+
78
+ ## Training Details
79
+
80
+ ### Training Data
81
+
82
+ <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
83
+
84
+ [More Information Needed]
85
+
86
+ ### Training Procedure
87
+
88
+ <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
89
+
90
+ #### Preprocessing [optional]
91
+
92
+ [More Information Needed]
93
+
94
+
95
+ #### Training Hyperparameters
96
+
97
+ - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
98
+
99
+ #### Speeds, Sizes, Times [optional]
100
+
101
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
102
+
103
+ [More Information Needed]
104
+
105
+ ## Evaluation
106
+
107
+ <!-- This section describes the evaluation protocols and provides the results. -->
108
+
109
+ ### Testing Data, Factors & Metrics
110
+
111
+ #### Testing Data
112
+
113
+ <!-- This should link to a Dataset Card if possible. -->
114
+
115
+ [More Information Needed]
116
+
117
+ #### Factors
118
+
119
+ <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
120
+
121
+ [More Information Needed]
122
+
123
+ #### Metrics
124
+
125
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
126
+
127
+ [More Information Needed]
128
+
129
+ ### Results
130
+
131
+ [More Information Needed]
132
+
133
+ #### Summary
134
+
135
+
136
+
137
+
138
+ ## Technical Specifications [optional]
139
+
140
+ ### Model Architecture and Objective
141
+
142
+ [More Information Needed]
143
+
144
+ ## Glossary [optional]
145
+
146
+ <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
147
+
148
+ [More Information Needed]
149
+
150
+ ## More Information [optional]
151
+
152
+ [More Information Needed]
153
+
154
+ ## Model Card Authors [optional]
155
+
156
+ [More Information Needed]
157
+
158
+ ## Model Card Contact
159
+
160
+ [More Information Needed]
161
+
162
+