Jzuluaga/bert-base-token-classification-for-atc-en-uwb-atcc

Token Classification

Generated from Trainer

Jzuluaga committed on Nov 30, 2022

Commit 2095970 · 1 parent: e37fe9f

Update README.md

Files changed (1)

README.md +18 -0

@@ -58,6 +58,24 @@ model-index:
 # bert-base-token-classification-for-atc-en-uwb-atcc
+This model detects speaker roles and speaker changes from text. This task is normally done at the acoustic level; here, we propose performing it at the text level.
+We address this challenge by performing speaker role and speaker change detection with a BERT model, fine-tuned on a chunking (token-classification) task.
+For instance:
+- Speaker 1: **lufthansa six two nine charlie tango report when established**
+- Speaker 2: **report when established lufthansa six two nine charlie tango**
+Based on that, could you tell the speaker role? Is speaker 1 the air traffic controller or the pilot?
+Also, if you have a recording with two or more speakers, like this:
+- Recording with 2 or more segments: **report when established lufthansa six two nine charlie tango lufthansa six two nine charlie tango report when established**
+could you tell where the first speaker ends and where the second starts? This is essentially diarization plus speaker role detection.
+Check the inference API (there are 3 examples)!
 This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [UWB-ATCC corpus](https://huggingface.co/datasets/Jzuluaga/uwb_atcc).
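
The added README text describes text-based speaker role and speaker change detection as a token-classification task. A minimal sketch of how that could be run locally with the `transformers` token-classification pipeline follows. The model ID is taken from the repository name on this page; the helper names `load_tagger` and `to_turns` are hypothetical, and the role label names returned in `entity_group` depend on the model's configuration and are not confirmed by this diff.

```python
from typing import Dict, List, Tuple

# Model ID taken from the repository name shown on this page.
MODEL_ID = "Jzuluaga/bert-base-token-classification-for-atc-en-uwb-atcc"


def load_tagger():
    """Build a token-classification pipeline (hypothetical helper).

    Downloads the model on first use, so it needs network access.
    """
    # Imported lazily so to_turns() works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="first" merges word pieces and adjacent tokens
    # with the same label into one span, with character offsets.
    return pipeline(
        "token-classification",
        model=MODEL_ID,
        aggregation_strategy="first",
    )


def to_turns(text: str, spans: List[Dict]) -> List[Tuple[str, str]]:
    """Turn aggregated pipeline spans into (role, utterance) pairs.

    Each span is expected to carry "entity_group", "start" and "end",
    which is the shape the pipeline returns when aggregation is enabled.
    The role strings are whatever labels the model was trained with
    (assumption: one label per speaker role).
    """
    ordered = sorted(spans, key=lambda s: s["start"])
    # Slice the original text so the output keeps the input's exact wording.
    return [(s["entity_group"], text[s["start"]:s["end"]]) for s in ordered]
```

With the model available, `to_turns(text, load_tagger()(text))` applied to the two-segment recording above would be expected to return one pair per speaker turn, and the `start` offset of the second span marks where the speaker change occurs.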