Update README.md
Browse files
README.md
CHANGED
@@ -1,6 +1,8 @@
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
library_name: paddlenlp
|
|
|
|
|
4 |
---
|
5 |
|
6 |
[![paddlenlp-banner](https://user-images.githubusercontent.com/1371212/175816733-8ec25eb0-9af3-4380-9218-27c154518258.png)](https://github.com/PaddlePaddle/PaddleNLP)
|
@@ -13,6 +15,8 @@ UIE Paper: https://arxiv.org/abs/2203.12277
|
|
13 |
|
14 |
PaddleNLP released UIE model series for Information Extraction of texts and multi-modal documents which use the ERNIE 3.0 models as the pre-trained language models and were finetuned on a large amount of information extraction data.
|
15 |
|
|
|
|
|
16 |
## Available Models
|
17 |
|
18 |
| Model Name | Usage Scenarios | Supporting Tasks |
|
@@ -30,28 +34,16 @@ We conducted experiments on the in-house test sets of the three different domain
|
|
30 |
<table>
|
31 |
<tr><th row_span='2'><th colspan='2'>finance<th colspan='2'>healthcare<th colspan='2'>internet
|
32 |
<tr><td><th>0-shot<th>5-shot<th>0-shot<th>5-shot<th>0-shot<th>5-shot
|
33 |
-
<tr><td
|
34 |
<tr><td>uie-medium (6L768H)<td>41.11<td>64.53<td>65.40<td>75.72<td>78.32<td>79.68
|
35 |
<tr><td>uie-mini (6L384H)<td>37.04<td>64.65<td>60.50<td>78.36<td>72.09<td>76.38
|
36 |
<tr><td>uie-micro (4L384H)<td>37.53<td>62.11<td>57.04<td>75.92<td>66.00<td>70.22
|
37 |
<tr><td>uie-nano (4L312H)<td>38.94<td>66.83<td>48.29<td>76.74<td>62.86<td>72.35
|
38 |
<tr><td>uie-m-large (24L1024H)<td><b>49.35</b><td><b>74.55</b><td>70.50<td><b>92.66</b ><td>78.49<td><b>83.02</b>
|
39 |
<tr><td>uie-m-base (12L768H)<td>38.46<td>74.31<td>63.37<td>87.32<td>76.27<td>80.13
|
40 |
-
<tr><td>uie-x-base (12L768H)
|
41 |
</table>
|
42 |
|
43 |
0-shot means that no training data is directly used for prediction through paddlenlp.Taskflow, and 5-shot means that each category contains 5 pieces of labeled data for model fine-tuning. Experiments show that UIE can further improve the performance with a small amount of data (few-shot).
|
44 |
|
45 |
-
|
46 |
-
|
47 |
-
We experimented on the zero-shot performance of UIE-X on the in-house multi-modal test sets in three different domains of general, financial, and medical:
|
48 |
-
|
49 |
-
<table>
|
50 |
-
<tr><th ><th>General <th>Financial<th colspan='2'>Medical
|
51 |
-
<tr><td>🧾🎓<b>uie-x-base (12L768H)</b><td>65.03<td>73.51<td>84.24
|
52 |
-
</table>
|
53 |
-
|
54 |
-
The general test set contains complex samples from different fields and is the most difficult task.
|
55 |
-
|
56 |
-
> Detailed Info: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/applications/information_extraction/README_en.md
|
57 |
-
|
|
|
1 |
---
|
2 |
license: apache-2.0
|
3 |
library_name: paddlenlp
|
4 |
+
language:
|
5 |
+
- zh
|
6 |
---
|
7 |
|
8 |
[![paddlenlp-banner](https://user-images.githubusercontent.com/1371212/175816733-8ec25eb0-9af3-4380-9218-27c154518258.png)](https://github.com/PaddlePaddle/PaddleNLP)
|
|
|
15 |
|
16 |
PaddleNLP released UIE model series for Information Extraction of texts and multi-modal documents which use the ERNIE 3.0 models as the pre-trained language models and were finetuned on a large amount of information extraction data.
|
17 |
|
18 |
+
![UIE-diagram](https://user-images.githubusercontent.com/40840292/167236006-66ed845d-21b8-4647-908b-e1c6e7613eb1.png)
|
19 |
+
|
20 |
## Available Models
|
21 |
|
22 |
| Model Name | Usage Scenarios | Supporting Tasks |
|
|
|
34 |
<table>
|
35 |
<tr><th row_span='2'><th colspan='2'>finance<th colspan='2'>healthcare<th colspan='2'>internet
|
36 |
<tr><td><th>0-shot<th>5-shot<th>0-shot<th>5-shot<th>0-shot<th>5-shot
|
37 |
+
<tr><td>uie-base (12L768H)<td>46.43<td>70.92<td><b>71.83</b><td>85.72<td>78.33<td>81.86
|
38 |
<tr><td>uie-medium (6L768H)<td>41.11<td>64.53<td>65.40<td>75.72<td>78.32<td>79.68
|
39 |
<tr><td>uie-mini (6L384H)<td>37.04<td>64.65<td>60.50<td>78.36<td>72.09<td>76.38
|
40 |
<tr><td>uie-micro (4L384H)<td>37.53<td>62.11<td>57.04<td>75.92<td>66.00<td>70.22
|
41 |
<tr><td>uie-nano (4L312H)<td>38.94<td>66.83<td>48.29<td>76.74<td>62.86<td>72.35
|
42 |
<tr><td>uie-m-large (24L1024H)<td><b>49.35</b><td><b>74.55</b><td>70.50<td><b>92.66</b ><td>78.49<td><b>83.02</b>
|
43 |
<tr><td>uie-m-base (12L768H)<td>38.46<td>74.31<td>63.37<td>87.32<td>76.27<td>80.13
|
44 |
+
<tr><td>🧾🎓<b>uie-x-base (12L768H)</b><td>48.84<td>73.87<td>65.60<td>88.81<td><b>79.36</b> <td>81.65
|
45 |
</table>
|
46 |
|
47 |
0-shot means that no training data is directly used for prediction through paddlenlp.Taskflow, and 5-shot means that each category contains 5 pieces of labeled data for model fine-tuning. Experiments show that UIE can further improve the performance with a small amount of data (few-shot).
|
48 |
|
49 |
+
> Detailed Info: https://github.com/PaddlePaddle/PaddleNLP/blob/develop/applications/information_extraction/README_en.md
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|