Alexander Seifert committed on
Commit 17ba05a
1 Parent(s): 3ab6a96

update README

Files changed (1)
  1. README.md +30 -10
README.md CHANGED
@@ -10,49 +10,69 @@ app_file: main.py
  pinned: true
  ---

- # ExplaiNER

- Error Analysis is an important but often overlooked part of the data science project lifecycle, for which there is still very little tooling available. Due to the lack of tooling, practitioners often write throwaway code or, worse, skip understanding their models' errors altogether. This project tries to provide an extensive toolkit to probe any NER model/dataset combination, find labeling errors and understand the models' and datasets' limitations, leading the user on her way to further improvements.


  ## Sections

- ### Probing

- A very direct and interactive way to test your model is by providing it with a list of text inputs and then inspecting the model outputs. The application features a multiline text field so the user can input multiple texts separated by newlines. For each text, the app will show a data frame containing the tokenized string, token predictions, probabilities and a visual indicator for low probability predictions -- these are the ones you should inspect first for prediction errors.

  ### Embeddings

- For every token in the dataset, we take its hidden state and using TruncatedSVD we project it onto a two-dimensional plane. Data points are colored by label, with mislabeled examples signified by a small black border.

  ### Metrics

  The metrics page contains precision, recall and f-score metrics as well as a confusion matrix over all the classes. By default, the confusion matrix is normalized. There's an option to zero out the diagonal, leaving only prediction errors (here it makes sense to turn off normalization, so you get raw error counts).

  ### Misclassified

- asdf

  ### Loss by Token/Label

- Shows count, mean and median loss per token and label.

  ### Samples by Loss

- Shows every example sorted by loss (descending) for close inspection.

  ### Random Samples

- Shows random samples. Simple idea, but often it turns up some interesting things.

  ### Inspect

  Inspect your whole dataset, either unfiltered or by id.

  ### Raw data

  See the data as seen by your model.

  ### Debug

- Some debug info.
 
  pinned: true
  ---

+ # 🏷️ ExplaiNER

+ Error Analysis is an important but often overlooked part of the data science project lifecycle, for which there is still very little tooling available. Practitioners tend to write throwaway code or, worse, skip this crucial step of understanding their models' errors altogether. This project tries to provide an extensive toolkit to probe any NER model/dataset combination, find labeling errors and understand the models' and datasets' limitations, leading the user on her way to further improvements.


  ## Sections

+ ### Activations
+
+ A group of neurons tends to fire in response to commas and other punctuation. Other groups of neurons tend to fire in response to pronouns. Use this visualization to factorize neuron activity in individual FFNN layers or in the entire model.
+
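The factorization method isn't specified here; purely as an illustration, a comparable view can be built by capturing one layer's FFNN activations with a forward hook and factorizing them. The model name, layer index and the use of NMF below are assumptions, not taken from this app.

```python
# Illustrative sketch only -- not this app's actual implementation.
import torch
from sklearn.decomposition import NMF
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_name = "dslim/bert-base-NER"  # example model, an assumption
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

captured = {}

def hook(module, inputs, output):
    # FFNN intermediate activations: (batch, seq_len, intermediate_size)
    captured["acts"] = output.detach()

# Hook the intermediate (FFNN) sub-layer of one encoder block (layer 6 chosen arbitrarily).
handle = model.bert.encoder.layer[6].intermediate.register_forward_hook(hook)

text = "Alice flew to Paris, and then she visited Rome."
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    model(**enc)
handle.remove()

acts = captured["acts"][0].clamp(min=0).numpy()  # NMF needs non-negative input
factors = NMF(n_components=6, init="nndsvd", max_iter=500).fit_transform(acts)
# factors[i, k]: how strongly factor k is active on token i; plotting these per
# token surfaces groups of neurons that fire on punctuation, pronouns, etc.
```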
 
  ### Embeddings

+ For every token in the dataset, we take its hidden state and project it onto a two-dimensional plane. Data points are colored by label/prediction, with mislabeled examples signified by a small black border.
+
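An earlier revision of this README named TruncatedSVD as the projection; a minimal sketch along those lines, with the plotting details assumed:

```python
# Sketch: project per-token hidden states to 2D and mark label/prediction disagreements.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import TruncatedSVD

def plot_token_embeddings(hidden_states: np.ndarray, labels: np.ndarray, preds: np.ndarray):
    # hidden_states: (n_tokens, hidden_size); labels/preds: integer class ids per token
    coords = TruncatedSVD(n_components=2).fit_transform(hidden_states)
    mislabeled = labels != preds
    plt.scatter(coords[:, 0], coords[:, 1], c=labels, cmap="tab10", s=12)
    # outline tokens where prediction and label disagree
    plt.scatter(coords[mislabeled, 0], coords[mislabeled, 1],
                facecolors="none", edgecolors="black", s=30, linewidths=0.8)
    plt.show()
```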
+ ### Probing
+
+ A very direct and interactive way to test your model is by providing it with a list of text inputs and then inspecting the model outputs. The application features a multiline text field so the user can input multiple texts separated by newlines. For each text, the app will show a data frame containing the tokenized string, token predictions, probabilities and a visual indicator for low probability predictions -- these are the ones you should inspect first for prediction errors.
+
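As a rough sketch of the computation behind such a table (the model name and the 0.5 confidence threshold are illustrative assumptions, not the app's settings):

```python
# Sketch: tokenize each input line, run the model, flag low-confidence token predictions.
import torch
import pandas as pd
from transformers import AutoModelForTokenClassification, AutoTokenizer

model_name = "dslim/bert-base-NER"  # example model, an assumption
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

def probe(text: str, low_prob: float = 0.5) -> pd.DataFrame:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits[0]          # (seq_len, num_labels)
    probs = logits.softmax(-1)
    conf, pred_ids = probs.max(-1)
    return pd.DataFrame({
        "token": tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist()),
        "prediction": [model.config.id2label[i.item()] for i in pred_ids],
        "probability": conf.tolist(),
        "low_confidence": (conf < low_prob).tolist(),  # inspect these first
    })

# one data frame per newline-separated input text
for line in "Angela Merkel visited Paris.\nApple opened a store in Berlin.".split("\n"):
    print(probe(line))
```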
 
  ### Metrics

  The metrics page contains precision, recall and f-score metrics as well as a confusion matrix over all the classes. By default, the confusion matrix is normalized. There's an option to zero out the diagonal, leaving only prediction errors (here it makes sense to turn off normalization, so you get raw error counts).

+
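The normalization and diagonal-zeroing described above can be expressed with scikit-learn roughly as follows (a sketch, not necessarily the app's implementation; token-level evaluation is assumed):

```python
# Sketch: per-class metrics plus a confusion matrix that can be normalized
# or have its diagonal zeroed so only misclassifications remain.
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

def error_matrix(y_true, y_pred, labels, normalize=True, zero_diagonal=False):
    cm = confusion_matrix(y_true, y_pred, labels=labels,
                          normalize="true" if normalize else None)
    if zero_diagonal:
        np.fill_diagonal(cm, 0)  # keep only prediction errors
    return cm

y_true = ["O", "B-PER", "I-PER", "O", "B-LOC"]
y_pred = ["O", "B-PER", "O",     "O", "B-ORG"]
labels = ["O", "B-PER", "I-PER", "B-LOC", "B-ORG"]

print(classification_report(y_true, y_pred, labels=labels, zero_division=0))
# raw error counts: turn off normalization and zero the diagonal
print(error_matrix(y_true, y_pred, labels, normalize=False, zero_diagonal=True))
```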
  ### Misclassified

+ This page contains all misclassified examples and allows filtering by specific error types.
+
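Reading an "error type" as a (label, prediction) pair, the filtering amounts to a simple dataframe query (the column names below are assumptions):

```python
# Sketch: filter misclassified tokens by one specific error type.
import pandas as pd

tokens = pd.DataFrame({
    "token":      ["Angela", "Merkel", "visited", "Paris"],
    "label":      ["B-PER",  "I-PER",  "O",       "B-LOC"],
    "prediction": ["B-PER",  "B-PER",  "O",       "B-ORG"],
})

misclassified = tokens[tokens["label"] != tokens["prediction"]]
# one error type: gold B-LOC predicted as B-ORG
loc_as_org = misclassified.query("label == 'B-LOC' and prediction == 'B-ORG'")
print(loc_as_org)
```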
 
  ### Loss by Token/Label

+ Show count, mean and median loss per token and label.
+
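Given a per-token loss column, the aggregation is essentially a groupby (the dataframe layout below is assumed, not the app's internal format):

```python
# Sketch: aggregate a per-token loss column by token and by label.
import pandas as pd

df = pd.DataFrame({
    "token": ["Paris", ",", "Paris", "the"],
    "label": ["B-LOC", "O", "B-ORG", "O"],
    "loss":  [0.02, 0.001, 1.7, 0.003],
})

by_token = df.groupby("token")["loss"].agg(["count", "mean", "median"])
by_label = df.groupby("label")["loss"].agg(["count", "mean", "median"])
print(by_token.sort_values("mean", ascending=False))
print(by_label)
```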
 
  ### Samples by Loss

+ Show every example sorted by loss (descending) for close inspection.
+

  ### Random Samples

+ Show random samples. Simple idea, but often it turns up some interesting things.
+
+
+ ### Find Duplicates
+
+ Find potential duplicates in the data using cosine similarity.
+
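A cosine-similarity duplicate check over per-example embeddings might look like this sketch (the embedding source and the 0.95 threshold are assumptions):

```python
# Sketch: flag candidate duplicate examples via cosine similarity of their embeddings.
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

def find_duplicates(embeddings: np.ndarray, threshold: float = 0.95):
    sim = cosine_similarity(embeddings)  # (n_examples, n_examples)
    np.fill_diagonal(sim, 0.0)           # ignore self-similarity
    pairs = np.argwhere(sim >= threshold)
    # each pair appears as (i, j) and (j, i); keep i < j
    return [(int(i), int(j), float(sim[i, j])) for i, j in pairs if i < j]

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))
emb[3] = emb[1] + 0.01 * rng.normal(size=8)  # plant a near-duplicate
print(find_duplicates(emb))
```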
 
  ### Inspect

  Inspect your whole dataset, either unfiltered or by id.

+
  ### Raw data

  See the data as seen by your model.

+
  ### Debug

+ Debug info.