[
  {
    "text": "No extra costs for access? Asking for a disabled access hack if I want to take my chair (Quickie Ti - weighs little, I can just pick it up and put it in, no need for time-consuming ramps), to the pub here in Wirral jacks up the normal fair by about £1.50.",
    "decoded_text": "No extra costs for access? Asking for a disabled access hack if I want to take my chair (Quickie Ti - weighs little, I can just pick it up and put it in, no need for time-consuming ramps), to the pub here in Wirral jacks up the normal fair by about <unk>1.50.",
    "diff": [
      "replace   text[249:250] --> decoded_text[249:254]      '£' --> '<unk>'"
    ],
    "n_oov_chars": 1,
    "oov_ratio": 0.00392156862745098,
    "oov_charset": "[\"£\"]"
  },
  {
    "text": "and yeah im a boy,and no, im not g*y, im a nice guy. i dont love his songs or anything , but he's not that bad tbh.",
    "decoded_text": "and yeah im a boy,and no, im not g*y, im a nice guy. i dont love his songs or anything, but he's not that bad tbh.",
    "diff": [
      "delete    text[86:87] --> decoded_text[86:86]      ' ' --> ''"
    ],
    "n_oov_chars": 0,
    "oov_ratio": 0.0,
    "oov_charset": "[]"
  },
  {
    "text": "Justin serenaded wonderful or better than a great I like popular songs, particularly as it is talented. all those who hate Justin are g**s because they feel jealous of him because he is handsome at the same time a rising singer and a small age. I myself appreciate the wonderful artist with this beautiful and talented .",
    "decoded_text": "Justin serenaded wonderful or better than a great I like popular songs, particularly as it is talented. all those who hate Justin are g**s because they feel jealous of him because he is handsome at the same time a rising singer and a small age. I myself appreciate the wonderful artist with this beautiful and talented.",
    "diff": [
      "delete    text[318:319] --> decoded_text[318:318]      ' ' --> ''"
    ],
    "n_oov_chars": 0,
    "oov_ratio": 0.0,
    "oov_charset": "[]"
  },
  {
    "text": "Soften the landing zones with a pair of Rubber Mats , made from dyed rubber chips, heat compressed and available in dark green or brick red.",
    "decoded_text": "Soften the landing zones with a pair of Rubber Mats, made from dyed rubber chips, heat compressed and available in dark green or brick red.",
    "diff": [
      "delete    text[51:52] --> decoded_text[51:51]      ' ' --> ''"
    ],
    "n_oov_chars": 0,
    "oov_ratio": 0.0,
    "oov_charset": "[]"
  },
  {
    "text": "​EEI Members have access to a wide range of reports, publications, communications, and other resources. In order to access the resources below, a member log in is required.",
    "decoded_text": "EEI Members have access to a wide range of reports, publications, communications, and other resources. In order to access the resources below, a member log in is required.",
    "diff": [
      "delete    text[0:1] --> decoded_text[0:0] '\\u200b' --> ''"
    ],
    "n_oov_chars": 1,
    "oov_ratio": 0.005813953488372093,
    "oov_charset": "[\"​\"]"
  },
  {
    "text": "​Launched in 2017, AUPSE is a senior executive knowledge exchange and peer-to-peer networking platform created to accelerate operational excellence in the African electric power sector.",
    "decoded_text": "Launched in 2017, AUPSE is a senior executive knowledge exchange and peer-to-peer networking platform created to accelerate operational excellence in the African electric power sector.",
    "diff": [
      "delete    text[0:1] --> decoded_text[0:0] '\\u200b' --> ''"
    ],
    "n_oov_chars": 1,
    "oov_ratio": 0.005405405405405406,
    "oov_charset": "[\"​\"]"
  },
  {
    "text": "Would love some tatts, but too much of a wimp to get them! 😥",
    "decoded_text": "Would love some tatts, but too much of a wimp to get them! <unk>",
    "diff": [
      "replace   text[59:60] --> decoded_text[59:64]      '😥' --> '<unk>'"
    ],
    "n_oov_chars": 1,
    "oov_ratio": 0.016666666666666666,
    "oov_charset": "[\"😥\"]"
  },
  {
    "text": "We're not so rough and over the top these days, so they miiiiight survive ._.",
    "decoded_text": "We're not so rough and over the top these days, so they miiiiight survive._.",
    "diff": [
      "delete    text[73:74] --> decoded_text[73:73]      ' ' --> ''"
    ],
    "n_oov_chars": 0,
    "oov_ratio": 0.0,
    "oov_charset": "[]"
  },
  {
    "text": "Just finished Hulse's \"Black River\" and simply adored the book. So pretty, overall, and much like the Kent Haruf novels, such as \"Plainsong\" that I've enjoyed over the years. \"Black River\" is surely one of the best five I've read this year. Solid Pulitzer choice, in my opinion. Side note: As I've mentioned before, I surely don't understand all of the hoopla surrounding \"The Sellout,\" with so many other worthy contenders. But, what do I know? I'm only a reader. :-) Read on ...",
    "decoded_text": "Just finished Hulse's \"Black River\" and simply adored the book. So pretty, overall, and much like the Kent Haruf novels, such as \"Plainsong\" that I've enjoyed over the years. \"Black River\" is surely one of the best five I've read this year. Solid Pulitzer choice, in my opinion. Side note: As I've mentioned before, I surely don't understand all of the hoopla surrounding \"The Sellout,\" with so many other worthy contenders. But, what do I know? I'm only a reader. :-) Read on...",
    "diff": [
      "replace   text[476:480] --> decoded_text[476:479]   ' ...' --> '...'"
    ],
    "n_oov_chars": 0,
    "oov_ratio": 0.0,
    "oov_charset": "[]"
  },
  {
    "text": "I really don't understand all of the hoopla over THE SELLOUT. Just a so-so book, in my opinion. Minor work. I struggled through it, and can never get back the time spent on that tome. EILEEN and HONEYDEW are sooooooo much better, not to mention THE TURNER HOUSE, TSAR, DID YOU EVER, and others. I'm reading DELICIOUS FOODS right now, and think it's a major-serious contender as well. BLACK RIVER is next on my list, and I can't wait. But, what do I know? :-) Read on ...",
    "decoded_text": "I really don't understand all of the hoopla over THE SELLOUT. Just a so-so book, in my opinion. Minor work. I struggled through it, and can never get back the time spent on that tome. EILEEN and HONEYDEW are sooooooo much better, not to mention THE TURNER HOUSE, TSAR, DID YOU EVER, and others. I'm reading DELICIOUS FOODS right now, and think it's a major-serious contender as well. BLACK RIVER is next on my list, and I can't wait. But, what do I know? :-) Read on...",
    "diff": [
      "replace   text[466:470] --> decoded_text[466:469]   ' ...' --> '...'"
    ],
    "n_oov_chars": 0,
    "oov_ratio": 0.0,
    "oov_charset": "[]"
  }
]