Update README.md

I'm interested in using encoder-based extraction of named legal document sections.

- RoBERTa base output shape: (batch size, seq length, hidden size)
- RoBERTa base hidden size = 768
- RoBERTa base max input seq length = 512
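
A minimal sketch of how these numbers can be checked, assuming the Hugging Face `transformers` library and the `roberta-base` checkpoint (neither is named above):

```python
import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")

batch = tokenizer(["Section 1. Definitions."], return_tensors="pt")
with torch.no_grad():
    out = model(**batch)

print(out.last_hidden_state.shape)  # (batch size, seq length, hidden size) = (1, n, 768)
print(model.config.hidden_size)     # 768
print(tokenizer.model_max_length)   # 512
```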

### Using RoBERTa for segmentation involves combining sentences A and B into a single input to RoBERTa, as shown below:

[CLS] A [SEP] B

and the embedding for [CLS] can be used in a binary classifier.
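
A minimal sketch of that setup, again assuming Hugging Face `transformers`; the example sentences are invented, and note that RoBERTa's tokenizer actually uses `<s>`/`</s>` rather than the BERT-style `[CLS]`/`[SEP]`:

```python
import torch
from transformers import RobertaForSequenceClassification, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
# Untrained 2-label head, e.g. "same section" vs. "new section starts at B".
clf = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

A = "This Agreement is entered into by and between the parties."      # example sentence
B = "1. Definitions. The following terms shall have these meanings."  # example sentence

# Encoding the pair yields <s> A </s></s> B </s>, RoBERTa's analogue of [CLS] A [SEP] B.
batch = tokenizer(A, B, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = clf(**batch).logits  # shape (1, 2); the head reads the <s>/[CLS] position

print(logits.softmax(dim=-1))
```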

### But the architecture used here (implemented via ) is:

1. standard RoBERTa model
2. classification of the [CLS] token embedding:
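
A hypothetical PyTorch sketch of those two pieces (the implementing library is left unnamed above; `ClsBinaryClassifier` and its layer sizes are illustrative, not the repo's actual code):

```python
import torch.nn as nn
from transformers import RobertaModel

class ClsBinaryClassifier(nn.Module):
    """Hypothetical module: a standard RoBERTa encoder (step 1) followed by a
    binary classification head over the [CLS]/<s> token embedding (step 2)."""

    def __init__(self):
        super().__init__()
        self.encoder = RobertaModel.from_pretrained("roberta-base")
        self.head = nn.Linear(self.encoder.config.hidden_size, 2)  # 768 -> 2

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(
            input_ids=input_ids, attention_mask=attention_mask
        ).last_hidden_state                # (batch size, seq length, 768)
        cls_embedding = hidden[:, 0, :]    # first position holds the [CLS]/<s> token
        return self.head(cls_embedding)    # logits of shape (batch size, 2)
```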