Commit ec0c896 (1 parent: 5a77be1)
Author: hunterhector
Message: pii
File changed: common.py
@@ -418,7 +418,7 @@ global_div = Div(
 Section(
     H3("Removing PII"),
     P(
-        "
+        "Similar to prior work, we have removed two types of PII from the dataset: email address and IP address. Regular expressions are used to identify and replace these PII with a generic placeholder. We have also designed PII removal procedures for individual sources, such as replacing names in the Ubuntu IRC dataset mentioned above."
     ),
     P(
         "We have used the following regular expressions to identify and replace PII:"
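
Note: the paragraph added in this commit describes replacing email addresses and IP addresses with a generic placeholder using regular expressions, but the hunk ends before the actual patterns are listed. As a rough illustration only, the approach might look like the sketch below. The patterns, the placeholder strings, and the `scrub_pii` helper are all assumptions made for this example and are not taken from common.py or from the dataset pipeline.

```python
import re

# Hypothetical patterns: the commit does not show the real expressions.
# Email: a simple local-part@domain.tld shape, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")

# IPv4 only: four dot-separated octets, each restricted to 0-255.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"
IPV4_RE = re.compile(r"\b(?:" + OCTET + r"\.){3}" + OCTET + r"\b")

# Hypothetical placeholder strings, chosen for this sketch.
EMAIL_PLACEHOLDER = "<EMAIL>"
IP_PLACEHOLDER = "<IP_ADDRESS>"


def scrub_pii(text: str) -> str:
    """Replace email addresses and IPv4 addresses with generic placeholders."""
    text = EMAIL_RE.sub(EMAIL_PLACEHOLDER, text)
    text = IPV4_RE.sub(IP_PLACEHOLDER, text)
    return text


if __name__ == "__main__":
    print(scrub_pii("Contact bob@example.com from 192.168.0.1."))
```

Emails are substituted before IP addresses so that the two passes do not interfere with each other. Source-specific steps mentioned in the paragraph, such as replacing names in the Ubuntu IRC data, would need separate handling that this sketch does not attempt.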