Finetuning on low-confidence data

ECE and DER with x seconds of annotated training data

In the paper, we show only a few data points (30, 300 and 1200 seconds) to keep the figures readable. Here we present the figures with all runs. Each point is the average of 3 seeds.

The figure is interactive, so you can zoom in and inspect each data point in detail.
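For readers unfamiliar with the metric, here is a minimal sketch of how an expected calibration error (ECE) is commonly computed: predictions are grouped into equal-width confidence bins, and the gap between average confidence and accuracy in each bin is weighted by the bin's share of the samples. The function name, the choice of 10 bins, and the per-prediction `confidences`/`correct` inputs are assumptions for illustration; this is not necessarily the exact procedure used in the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE (illustrative sketch, not the paper's code).

    confidences: iterable of floats in [0, 1], one per prediction.
    correct: iterable of booleans, True when the prediction was right.
    n_bins: number of equal-width bins (10 is an assumed default).
    """
    confidences = list(confidences)
    correct = list(correct)
    n = len(confidences)
    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    acc_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for c, ok in zip(confidences, correct):
        # Bins are [b/n_bins, (b+1)/n_bins); a confidence of 1.0 goes in the last bin.
        b = min(int(c * n_bins), n_bins - 1)
        conf_sum[b] += c
        acc_sum[b] += 1.0 if ok else 0.0
        count[b] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b]:
            # (count/n) * |acc_sum/count - conf_sum/count| simplifies to this.
            ece += abs(acc_sum[b] - conf_sum[b]) / n
    return ece


# Two fully confident predictions, only one of them correct.
print(expected_calibration_error([1.0, 1.0], [True, False]))  # 0.5
```

A perfectly calibrated model gives an ECE of 0; the toy call above gives 0.5 because the model claims 100% confidence but is right only half the time.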

Reproducibility

The model is trained on subsets of the DIHARD domains. Each subset is composed of multiple regions drawn from all files in the training set. These regions are selected with several strategies, which rely on random sampling, on the predictions of the available model, or on both. We make the selected training regions available as UEM files.
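As a rough illustration of how such UEM files can be inspected, the sketch below parses the usual four-column UEM layout (`file_id channel start end`, times in seconds) and sums the duration of the selected regions. The function names and the example file identifiers are hypothetical, and the sketch assumes the released files follow this standard layout.

```python
from collections import defaultdict


def parse_uem(text):
    """Parse UEM content into {file_id: [(start, end), ...]}.

    Assumes one region per line: <file_id> <channel> <start> <end>.
    """
    regions = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";;"):  # skip blank and comment lines
            continue
        file_id, _channel, start, end = line.split()
        regions[file_id].append((float(start), float(end)))
    return dict(regions)


def total_duration(regions):
    """Total annotated duration in seconds across all files."""
    return sum(end - start for segs in regions.values() for start, end in segs)


# Hypothetical UEM content: three 10-second regions over two files.
example = """\
fileA 1 0.000 10.000
fileA 1 42.500 52.500
fileB 1 5.000 15.000
"""

regions = parse_uem(example)
print(total_duration(regions))  # 30.0
```

Summing the region durations this way gives a quick check that a given UEM file matches the annotated-data budget it is meant to represent (30 seconds in this toy example).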

After training, we obtained checkpoints on which we computed DER and ECE. This data is also available: