Jzuluaga committed on
Commit
2095970
1 Parent(s): e37fe9f

Update README.md

  1. README.md +18 -0
README.md CHANGED
@@ -58,6 +58,24 @@ model-index:
 
  # bert-base-token-classification-for-atc-en-uwb-atcc
 
+ This model detects speaker roles and speaker changes from text. This task is normally performed at the acoustic level; we propose performing it at the text level instead.
+ We address it by fine-tuning a BERT model on a chunking (token-classification) task.
+
+ For instance:
+
+ - Speaker 1: **lufthansa six two nine charlie tango report when established**
+ - Speaker 2: **report when established lufthansa six two nine charlie tango**
+
+ Based on that, could you tell the speaker roles? Is Speaker 1 the air traffic controller or the pilot?
+
+ Also, if you have a recording with 2 or more speakers, like this:
+
+ - Recording with 2 or more segments: **report when established lufthansa six two nine charlie tango lufthansa six two nine charlie tango report when established**
+
+ could you tell when the first speaker ends and the second starts? This is essentially diarization plus speaker role detection.
+
+ Check the inference API (there are 3 examples)!
+
 
  This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [UWB-ATCC corpus](https://huggingface.co/datasets/Jzuluaga/uwb_atcc).
 
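To make the token-classification framing in the added README text more concrete, here is a minimal sketch of how word-level tags could be merged into speaker segments. It does not call the model: the tags are hand-written for illustration, the BIO-style label names (`B-atco`, `I-atco`, `B-pilot`, `I-pilot`) are assumptions rather than the model's confirmed label set, and `group_speaker_segments` is a hypothetical helper, not part of any library.

```python
from typing import List, Tuple


def group_speaker_segments(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge word-level BIO tags into (role, text) segments.

    Assumes every tag has the form "B-<role>" or "I-<role>" (hypothetical
    label scheme). A new segment starts on a "B-" tag or when the role changes.
    """
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")

    segments: List[Tuple[str, List[str]]] = []
    for word, tag in zip(words, tags):
        prefix, _, role = tag.partition("-")
        starts_new = prefix == "B" or not segments or segments[-1][0] != role
        if starts_new:
            segments.append((role, [word]))
        else:
            segments[-1][1].append(word)

    return [(role, " ".join(ws)) for role, ws in segments]


# The two-speaker recording from the README text, split into words.
words = (
    "report when established lufthansa six two nine charlie tango "
    "lufthansa six two nine charlie tango report when established"
).split()

# Hand-written tags (NOT model output); the role assignment is an assumption.
tags = ["B-pilot"] + ["I-pilot"] * 8 + ["B-atco"] + ["I-atco"] * 8

for role, text in group_speaker_segments(words, tags):
    print(f"{role}: {text}")
```

In practice the tags would come from the fine-tuned model's predictions; the grouping step itself is independent of how they are produced.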