Create README.md
README.md
ADDED
@@ -0,0 +1,15 @@
---
language: no
tags:
- translation
widget:
- text: "dette er en liten test som er laget av per egil kummervold han er en forsker som tidligere jobbet ved nasjonalbiblioteket"
- text: "detteerenlitentestsomerlagetavperegilkummervoldhanerenforskersomtidligerejobbetvednasjonalbiblioteket"
- text: "tirsdag var travel for ukrainas president volodymyr zelenskyj på morgenen tok han imot polens statsminister mateusz morawiecki"
license: cc-by-4.0
---
# DeUnCaser
The purpose of the DeUnCaser is to fix text that lacks punctuation and casing. It is particularly targeted at the output of Automatic Speech Recognition (ASR) software. In addition to lacking casing and punctuation, such output often lacks spaces between words. Try the demo to see how it works.

The DeUnCaser is based on North-T5 and is a sequence-to-sequence mT5 model. It attempts to add punctuation, spaces and capitalisation to any text it is given. It is primarily trained to fix Norwegian text.
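
To make the task concrete, the sketch below produces the kind of degraded input the DeUnCaser is meant to repair: lowercased text without punctuation, optionally with the spaces removed as well. The helper `uncase` and its `drop_spaces` parameter are hypothetical names chosen for illustration. This is an assumption about what such input looks like, based on the widget examples, and not a description of how the model's training data was actually prepared.

```python
def uncase(text: str, drop_spaces: bool = False) -> str:
    """Simulate ASR-like text: lowercase, no punctuation, optionally no spaces.

    Hypothetical helper for illustration only; not part of the model's code.
    """
    # Keep letters, digits and whitespace; drop punctuation and other symbols.
    kept = "".join(ch for ch in text if ch.isalnum() or ch.isspace())
    # Lowercase and collapse any run of whitespace into a single separator.
    words = kept.lower().split()
    separator = "" if drop_spaces else " "
    return separator.join(words)


if __name__ == "__main__":
    sentence = "Dette er en liten test, som er laget av Per Egil Kummervold."
    print(uncase(sentence))
    print(uncase(sentence, drop_spaces=True))
```

Feeding the model the result of `uncase(sentence)` and comparing its output against `sentence` is one simple way to spot-check restoration quality. Note that this sketch also removes hyphens and apostrophes inside words, which may or may not match real ASR output.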